
Chapter 6
Eigenvalues and Eigenvectors
Po-Ning Chen, Professor
Department of Electrical and Computer Engineering
National Chiao Tung University
Hsin Chu, Taiwan 30010, R.O.C.
6.1 Introduction to eigenvalues
6-1
Motivations
• The static system problem of Ax = b has now been solved, e.g., by the Gauss-Jordan method or Cramer's rule.
• However, a dynamic system problem such as
Ax = λx
cannot be solved by the static system method.
• To solve the dynamic system problem, we need to find the static features of A that are “unchanged” under the mapping A. In other words, x is mapped by A to (a multiple of) itself, with possibly some stretching (λ > 1), shrinking (0 < λ < 1), or reversal (λ < 0).
• These invariant characteristics of A are the eigenvalues and eigenvectors.
Ax maps a vector x to its column space C(A). We are looking for a v ∈ C(A)
such that Av aligns with v. The collection of all such vectors is the set of eigenvectors.
6.1 Eigenvalues and eigenvectors
6-2
Definition (Eigenvalues and eigenvectors): An eigenvalue-eigenvector pair (λi, vi) of a square matrix A satisfies
  Avi = λivi (or equivalently (A − λiI)vi = 0),
where vi ≠ 0 (but λi can be zero); the chosen eigenvectors v1, v2, . . . are taken to be linearly independent.
How to derive eigenvalues and eigenvectors?
• For a 3 × 3 matrix A, we can obtain 3 eigenvalues λ1, λ2, λ3 by solving
  det(A − λI) = −λ^3 + c2 λ^2 + c1 λ + c0 = 0.
• If λ1, λ2, λ3 are all unequal, then we continue to derive:
    Av1 = λ1v1 ⇔ (A − λ1I)v1 = 0 ⇔ v1 = the only basis of the nullspace of (A − λ1I)
    Av2 = λ2v2 ⇔ (A − λ2I)v2 = 0 ⇔ v2 = the only basis of the nullspace of (A − λ2I)
    Av3 = λ3v3 ⇔ (A − λ3I)v3 = 0 ⇔ v3 = the only basis of the nullspace of (A − λ3I)
  and the resulting v1, v2 and v3 are linearly independent of each other.
• If A is symmetric, then v 1, v 2 and v 3 are orthogonal to each other.
6.1 Eigenvalues and eigenvectors
6-3
• If, for example, λ1 = λ2 ≠ λ3, then we derive:
    (A − λ1I)v = 0 ⇔ {v1, v2} = the basis of the nullspace of (A − λ1I)
    Av3 = λ3v3 ⇔ (A − λ3I)v3 = 0 ⇔ v3 = the only basis of the nullspace of (A − λ3I)
  and the resulting v1, v2 and v3 are linearly independent of each other when the nullspace of (A − λ1I) is two-dimensional.
• If A is symmetric, then v 1, v 2 and v 3 are orthogonal to each other.
– Yet, it is possible that the nullspace of (A − λ1I) is one-dimensional; then we can only take v1 = v2.
– In such a case, we say the repeated eigenvalue λ1 has only one eigenvector. In other words, A has only two eigenvectors.
– If A is symmetric, then v1 and v3 are orthogonal to each other.
6.1 Eigenvalues and eigenvectors
6-4
• If λ1 = λ2 = λ3, then we derive:
    (A − λ1I)v = 0 ⇔ {v1, v2, v3} = the basis of the nullspace of (A − λ1I)
• When the nullspace of (A − λ1I) is three-dimensional, v1, v2 and v3 are linearly independent of each other.
• When the nullspace of (A − λ1I) is two-dimensional, we say A has only two eigenvectors.
• When the nullspace of (A − λ1I) is one-dimensional, we say A has only one eigenvector. In such a case, the “matrix-form eigensystem” becomes:


  A [v1 v1 v1] = [v1 v1 v1] [λ1 0 0; 0 λ1 0; 0 0 λ1]
• If A is symmetric, then distinct eigenvectors are orthogonal to each
other.
6.1 Eigenvalues and eigenvectors
6-5
Invariance of eigenvectors and eigenvalues.
• Property 1: The eigenvectors stay the same for every power of A; the eigenvalues are raised to the same power, i.e., A^n vi = λi^n vi.
  Indeed, Avi = λivi ⇒ A^2 vi = A(λivi) = λi Avi = λi^2 vi.
• Property 2: If the nullspace of An×n contains non-zero vectors, then 0 is an eigenvalue of A.
  Accordingly, there exists a non-zero vi satisfying Avi = 0 · vi = 0.
6.1 Eigenvalues and eigenvectors
6-6
• Property 3: Assume, with no loss of generality, |λ1| > |λ2| > · · · > |λk|. For any vector x = a1v1 + a2v2 + · · · + akvk that is a linear combination of all eigenvectors, the normalized mapping P = (1/λ1)A (which satisfies Pv1 = (1/λ1)Av1 = (1/λ1)(λ1v1) = v1) converges to the eigenvector with the largest absolute eigenvalue when applied repeatedly.
I.e.,
  lim_{k→∞} P^k x = lim_{k→∞} (1/λ1^k) A^k x = lim_{k→∞} (1/λ1^k) (a1 A^k v1 + a2 A^k v2 + · · · + ak A^k vk)
                  = lim_{k→∞} (1/λ1^k) (a1 λ1^k v1 + a2 λ2^k v2 + · · · + ak λk^k vk)
                  = a1 v1,
where a1v1 is the steady state and the remaining (vanishing) terms are the transient states.
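As a small numerical illustration of Property 3, here is a minimal power-iteration sketch in MATLAB (the matrix, starting vector, and iteration count are made-up choices for this example, not taken from the slides):

  A = [2 1; 1 3];            % hypothetical matrix with a dominant eigenvalue
  x = [1; 0];                % any starting vector with nonzero component a1
  for k = 1:100
      y = A*x;
      x = y / norm(y);       % renormalize each step instead of dividing by lambda1
  end
  lambda1 = x'*A*x;          % Rayleigh-quotient estimate of the dominant eigenvalue
  disp(lambda1); disp(x);    % x approximates the eigenvector v1 (up to sign)

Dividing by the norm rather than by λ1 gives the same limiting direction without knowing λ1 in advance.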
6.1 How to determine the eigenvalues?
We wish to find a non-zero vector v satisfying Av = λv; then
  (A − λI)v = 0 with v ≠ 0  ⇔  det(A − λI) = 0.
So by solving det(A − λI) = 0, we can obtain all the eigenvalues of A.
Example. Find the eigenvalues of A = [0.5 0.5; 0.5 0.5].
Solution.
  det(A − λI) = det [0.5−λ 0.5; 0.5 0.5−λ] = (0.5 − λ)^2 − 0.5^2 = λ^2 − λ = λ(λ − 1) = 0
  ⇒ λ = 0 or 1. □
6-7
6.1 How to determine the eigenvalues?
6-8
Proposition: Projection matrix (defined in Section 4.2) has eigenvalues either 1
or 0.
Proof:
• A projection matrix always satisfies P^2 = P. So P^2 v = Pv = λv.
• By definition of eigenvalues and eigenvectors, we have P^2 v = λ^2 v.
• Hence, λv = λ^2 v for a non-zero vector v, which immediately implies λ = λ^2.
• Accordingly, λ is either 1 or 0. □
Proposition: A permutation matrix has eigenvalues satisfying λ^k = 1 for some integer k.
Proof:
• A permutation matrix always satisfies P^(k+1) = P for some integer k.
  Example. P = [0 0 1; 1 0 0; 0 1 0]. Then, for v = (v1, v2, v3)ᵀ, we have Pv = (v3, v1, v2)ᵀ and P^3 v = v. Hence, k = 3.
• Accordingly, λ^(k+1) v = λv, which gives λ^k = 1 since the eigenvalues of P cannot be zero. □
6.1 How to determine the eigenvalues?
6-9
Proposition: The matrix
  αm A^m + αm−1 A^(m−1) + · · · + α1 A + α0 I
has the same eigenvectors as A, but its eigenvalues become
  αm λ^m + αm−1 λ^(m−1) + · · · + α1 λ + α0,
where λ is an eigenvalue of A.
Proof:
• Let vi and λi be an eigenvector and eigenvalue of A. Then,
  (αm A^m + αm−1 A^(m−1) + · · · + α0 I) vi = (αm λi^m + αm−1 λi^(m−1) + · · · + α0) vi.
• Hence, vi and αm λi^m + αm−1 λi^(m−1) + · · · + α0 are an eigenvector and eigenvalue of the polynomial matrix. □
6.1 How to determine the eigenvalues?
6-10
Theorem (Cayley-Hamilton): For a square matrix A, define
  f(λ) = det(A − λI) = λ^n + αn−1 λ^(n−1) + · · · + α0.
(Suppose A has linearly independent eigenvectors.) Then
  f(A) = A^n + αn−1 A^(n−1) + · · · + α0 I = the all-zero n × n matrix.
Proof: The eigenvalues of f(A) are all zeros and the eigenvectors of f(A) remain the same as those of A. By definition of the eigensystem, we have
  f(A) [v1 v2 · · · vn] = [v1 v2 · · · vn] diag(f(λ1), f(λ2), . . . , f(λn)),
and every f(λi) = 0. □
Corollary (Cayley-Hamilton): (Suppose A has linearly independent eigenvectors.)
(λ1I − A)(λ2I − A) · · · (λnI − A) = all-zero matrix
Proof: f(λ) can be re-written as f(λ) = (λ1 − λ)(λ2 − λ) · · · (λn − λ). □
6.1 How to determine the eigenvalues?
6-11
(Problem 11, Section 6.1) Here is a strange fact about 2 by 2 matrices with eigenvalues λ1 ≠ λ2: The columns of A − λ1I are multiples of the eigenvector x2. Any idea why this should be?
Hint: (λ1I − A)(λ2I − A) = [0 0; 0 0] implies
  (λ1I − A)w1 = 0 and (λ1I − A)w2 = 0,
where (λ2I − A) = [w1 w2]. Hence,
  the columns of (λ2I − A) give the eigenvectors of λ1 if they are non-zero vectors, and
  the columns of (λ1I − A) give the eigenvectors of λ2 if they are non-zero vectors.
So, the (non-zero) columns of A − λ1I are (multiples of) the eigenvector x2.
6.1 Why Gauss-Jordan cannot solve Ax = λx?
6-12
• The forward elimination may change the eigenvalues and eigenvectors?
Example. Check the eigenvalues and eigenvectors of A = [1 −2; 2 5].
Solution.
– The eigenvalues of A satisfy det(A − λI) = (λ − 3)^2 = 0.
– A = LU = [1 0; 2 1] [1 −2; 0 9]. The eigenvalues of U apparently satisfy det(U − λI) = (1 − λ)(9 − λ) = 0.
– Suppose u1 and u2 are the eigenvectors of U, respectively corresponding to eigenvalues 1 and 9. Then, they cannot be the eigenvectors of A, since if they were,
    3u1 = Au1 = LUu1 = Lu1 = [1 0; 2 1] u1  ⇒  u1 = 0
    3u2 = Au2 = LUu2 = 9Lu2 = 9 [1 0; 2 1] u2  ⇒  u2 = 0.
Hence, the eigenvalues have nothing to do with the pivots (except for a triangular A).
6.1 How to determine the eigenvalues? (Revisited)
6-13
Solve det(A − λI) = 0.
• det(A − λI) is a polynomial of degree n in λ:
  f(λ) = det(A − λI) = det [a1,1−λ a1,2 a1,3 · · · a1,n; a2,1 a2,2−λ a2,3 · · · a2,n; a3,1 a3,2 a3,3−λ · · · a3,n; . . . ; an,1 an,2 an,3 · · · an,n−λ]
       = (a1,1 − λ)(a2,2 − λ) · · · (an,n − λ) + terms of order at most (n − 2)   (by the Leibniz formula)
       = (λ1 − λ)(λ2 − λ) · · · (λn − λ)
Observations
• The coefficient of λ^n is (−1)^n, provided n ≥ 1.
• The coefficient of λ^(n−1) is (−1)^(n−1) Σ_{i=1}^n λi = (−1)^(n−1) Σ_{i=1}^n a_{i,i} = (−1)^(n−1) trace(A), provided n ≥ 2.
• ...
• The coefficient of λ^0 is Π_{i=1}^n λi = f(0) = det(A).
6.1 How to determine the eigenvalues? (Revisited)
6-14
These observations make it easy to find the eigenvalues of a 2 × 2 matrix.
Example. Find the eigenvalues of A = [1 1; 4 1].
Solution.
• λ1 + λ2 = 1 + 1 = 2 and λ1λ2 = det(A) = 1 − 4 = −3
  ⇒ (λ1 − λ2)^2 = (λ1 + λ2)^2 − 4λ1λ2 = 16
  ⇒ λ1 − λ2 = 4
  ⇒ λ = 3, −1. □
Example. Find the eigenvalues of A = [1 1; 2 2].
Solution.
• λ1 + λ2 = 3 and λ1λ2 = 0
  ⇒ λ = 3, 0. □
6.1 Imaginary eigenvalues
6-15
In some cases, we have to allow imaginary eigenvalues.
• In order to solve polynomial equations f(λ) = 0, mathematicians were forced to imagine that there exists a number x satisfying x^2 = −1.
• With this technique, a polynomial equation of degree n has exactly n (possibly complex, not real) solutions.
  Example. Solve λ^2 + 1 = 0. ⇒ λ = ±i. □
• Based on this, to solve for the eigenvalues, we are forced to accept imaginary eigenvalues.
  Example. Find the eigenvalues of A = [0 1; −1 0].
  Solution. det(A − λI) = λ^2 + 1 = 0 ⇒ λ = ±i. □
6.1 Imaginary eigenvalues
6-16
Proposition: The eigenvalues of a symmetric matrix A (with real entries) are
real, and the eigenvalues of a skew-symmetric (or antisymmetric) matrix B are
pure imaginary.
Proof:
• Suppose Av = λv. Then,
    Av* = λ*v*                            (A real)
  ⇒ (Av*)ᵀv = λ*(v*)ᵀv                    (equivalently, v · Av* = v · λ*v*)
  ⇒ (v*)ᵀAᵀv = λ*(v*)ᵀv
  ⇒ (v*)ᵀAv = λ*(v*)ᵀv                    (symmetry means Aᵀ = A)
  ⇒ (v*)ᵀλv = λ*(v*)ᵀv                    (Av = λv)
  ⇒ λ‖v‖^2 = λ*‖v‖^2                      (an eigenvector must be non-zero, i.e., ‖v‖^2 ≠ 0)
  ⇒ λ = λ*
  ⇒ λ real
6.1 Imaginary eigenvalues
6-17
• Suppose Bv = λv. Then,
    Bv* = λ*v*                            (B real)
  ⇒ (Bv*)ᵀv = (λ*v*)ᵀv
  ⇒ (v*)ᵀBᵀv = λ*(v*)ᵀv
  ⇒ (v*)ᵀ(−B)v = λ*(v*)ᵀv                 (skew-symmetry means Bᵀ = −B)
  ⇒ (v*)ᵀ(−λ)v = λ*(v*)ᵀv                 (Bv = λv)
  ⇒ (−λ)‖v‖^2 = λ*‖v‖^2                   (an eigenvector must be non-zero, i.e., ‖v‖^2 ≠ 0)
  ⇒ −λ = λ*
  ⇒ λ pure imaginary □
6.1 Eigenvalues/eigenvectors of inverse
6-18
For invertible A, the relation between eigenvalues and eigenvectors of A and A−1
can be well-determined.
Proposition: The eigenvalue-eigenvector pairs of A^{-1} are
  (1/λ1, v1), (1/λ2, v2), . . . , (1/λn, vn),
where {(λi, vi)}_{i=1}^n are the eigenvalue-eigenvector pairs of A.
Proof:
• The eigenvalues of an invertible A must be non-zero because det(A) = Π_{i=1}^n λi ≠ 0.
• Suppose Av = λv, where v ≠ 0 and λ ≠ 0. (I.e., λ and v are an eigenvalue and eigenvector of A.)
• So, Av = λv ⇒ A^{-1}(Av) = A^{-1}(λv) ⇒ v = λA^{-1}v ⇒ (1/λ)v = A^{-1}v. □
Note: The eigenvalues of A^k and A^{-1} are respectively λ^k and λ^{-1}, with the same eigenvectors as A, where λ is an eigenvalue of A.
6.1 Eigenvalues/eigenvectors of inverse
6-19
Proposition: The eigenvalues of AT are the same as the eigenvalues of A. (But
they may have different eigenvectors.)
Proof: det(A − λI) = det ((A − λI)T) = det (AT − λI T ) = det (AT − λI).
□
Corollary: The eigenvalues of an invertible (real-valued) matrix A satisfying A^{-1} = Aᵀ are on the unit circle of the complex plane.
Proof: Suppose Av = λv. Then,
    Av* = λ*v*                            (A real)
  ⇒ (Av*)ᵀv = λ*(v*)ᵀv
  ⇒ (v*)ᵀAᵀv = λ*(v*)ᵀv
  ⇒ (v*)ᵀA^{-1}v = λ*(v*)ᵀv               (Aᵀ = A^{-1})
  ⇒ (v*)ᵀ(1/λ)v = λ*(v*)ᵀv                (A^{-1}v = (1/λ)v)
  ⇒ (1/λ)‖v‖^2 = λ*‖v‖^2                  (an eigenvector must be non-zero, i.e., ‖v‖^2 ≠ 0)
  ⇒ λλ* = |λ|^2 = 1 □
Example. Find the eigenvalues of A = [0 1; −1 0], which satisfies Aᵀ = A^{-1}.
Solution. det(A − λI) = λ^2 + 1 = 0 ⇒ λ = ±i. □
6.1 Determination of eigenvectors
6-20
• After identifying the eigenvalues via det(A − λI) = 0, we can determine the respective eigenvectors using the nullspace technique.
• Recall (from Slide 3-42) how we completely solve
    Bn×n vn×1 = (An×n − λIn×n) vn×1 = 0n×1.
  Answer:
    R = rref(B) = [Ir×r Fr×(n−r); 0(n−r)×r 0(n−r)×(n−r)]   (with no row exchange)
    ⇒ Nn×(n−r) = [−Fr×(n−r); I(n−r)×(n−r)]
  Then, every solution v of Bv = 0 is of the form
    vn×1 = Nn×(n−r) w(n−r)×1 for any w ∈ ℝ^(n−r).
  Here, we usually take the (n − r) orthonormal basis vectors of the nullspace as the representative eigenvectors; these are exactly the (n − r) columns of Nn×(n−r) (after proper normalization and orthogonalization).
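A minimal MATLAB sketch of this nullspace recipe, using the 2 × 2 example from Slide 6-7 (the call to null is one convenient way to get an orthonormal nullspace basis; rref could be used instead, as in the slides):

  A = [0.5 0.5; 0.5 0.5];        % example matrix from Slide 6-7
  lambda = 1;                    % one of its eigenvalues (the other is 0)
  v = null(A - lambda*eye(2));   % orthonormal basis of the nullspace of A - lambda*I
  disp(v);                       % each column is a normalized eigenvector for lambda
  disp(norm(A*v - lambda*v));    % should be (numerically) zero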
6.1 Determination of eigenvectors
(Problem 12, Section 6.1) Find three eigenvectors for this matrix P (projection matrices have λ = 1 and 0):
  Projection matrix P = [0.2 0.4 0; 0.4 0.8 0; 0 0 1].
If two eigenvectors share the same λ, so do all their linear combinations. Find an eigenvector of P with no zero components.
Solution.
• det(P − λI) = 0 gives λ(1 − λ)^2 = 0; so λ = 0, 1, 1.
• λ = 1: rref(P − I) = [1 −1/2 0; 0 0 0; 0 0 0];
  so F1×2 = [−1/2 0], and N3×2 = [1/2 0; 1 0; 0 1], which implies the eigenvectors (1/2, 1, 0)ᵀ and (0, 0, 1)ᵀ.
6.1 Determination of eigenvectors
6-22

• λ = 0: rref(P) = [1 2 0; 0 0 0; 0 0 1] (with no row exchange);
  so F = [2; 0], and N3×1 = [−2; 1; 0], which implies the eigenvector (−2, 1, 0)ᵀ. □
6.1 MATLAB revisited
6-23
In MATLAB:
• We can find the eigenvalues and eigenvectors of a matrix A by:
[V D]= eig(A); % Find the eigenvalues/vectors of A
• The columns of V are the eigenvectors.
• The diagonals of D are the eigenvalues.
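For instance, a quick check on the 2 × 2 example from Slide 6-7 (the expected outputs in the comments are what the theory predicts, not values copied from the slides):

  A = [0.5 0.5; 0.5 0.5];
  [V, D] = eig(A);        % columns of V = eigenvectors, diag(D) = eigenvalues
  diag(D)                 % expect 0 and 1
  V                       % expect normalized multiples of [1; -1] and [1; 1]
  norm(A*V - V*D)         % should be (numerically) zero, since A*V = V*D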
Proposition:
• The eigenvectors corresponding to non-zero eigenvalues are in C(A).
• The eigenvectors corresponding to zero eigenvalues are in N(A).
Proof: The first follows from Av = λv (so v = A(v/λ) lies in C(A)), and the second from Av = 0. □
6.1 Some discussions on problems
6-24
(Problem 25, Section 6.1) Suppose A and B have the same eigenvalues λ1, . . . , λn with the same independent eigenvectors x1, . . . , xn. Then A = B. Reason: Any vector x is a combination c1x1 + · · · + cnxn. What is Ax? What is Bx?
Thinking over Problem 25: Suppose A and B have the same eigenvalues and eigenvectors (not necessarily independent). Can we claim that A = B?
Answer to the thinking: Not necessarily.
As a counterexample, both A = [1 −1; 1 −1] and B = [2 −2; 2 −2] have eigenvalues 0, 0 and the single eigenvector (1, 1)ᵀ, but they are not equal.
If however the eigenvectors span the n-dimensional space (such as when there are n of them and they are linearly independent), then A = B.

Hint for Problem 25: A [x1 · · · xn] = [x1 · · · xn] diag(λ1, λ2, . . . , λn).
This important fact will be re-emphasized in Section 6.2.
6.1 Some discussions on problems
6-25
(Problem 26, Section 6.1) The block B has eigenvalues 1, 2, C has eigenvalues 3, 4, and D has eigenvalues 5, 7. Find the eigenvalues of the 4 by 4 matrix A:
  A = [B C; 0 D] = [0 1 3 0; −2 3 0 4; 0 0 6 1; 0 0 1 6].
Thinking over Problem 26: The eigenvalues of [B C; 0 D] are the eigenvalues of B and D because we can show
  det [B C; 0 D] = det(B) · det(D).
(See Problems 23 and 25 in Section 5.2.)
6.1 Some discussions on problems
6-26
(Problem 23, Section 5.2) With 2 by 2 blocks in 4 by 4 matrices, you cannot always use block determinants:
  |A B; 0 D| = |A| |D|   but   |A B; C D| ≠ |A| |D| − |C| |B|.
(a) Why is the first statement true? Somehow B doesn't enter.
    Hint: Leibniz formula.
(b) Show by example that equality fails (as shown) when C enters.
(c) Show by example that the answer det(AD − CB) is also wrong.
6.1 Some discussions on problems
6-27
(Problem 25, Section 5.2) Block elimination subtracts CA^{-1} times the first row [A B] from the second row [C D]. This leaves the Schur complement D − CA^{-1}B in the corner:
  [I 0; −CA^{-1} I] [A B; C D] = [A B; 0 D − CA^{-1}B].
Take determinants of these block matrices to prove correct rules if A^{-1} exists:
  |A B; C D| = |A| |D − CA^{-1}B| = |AD − CB|   provided AC = CA.
Hint: det(A) · det(D − CA^{-1}B) = det(A(D − CA^{-1}B)).
6.1 Some discussions on problems
6-28
(Problem 37, Section 6.1)
(a) Find the eigenvalues and eigenvectors of A. They depend on c:
    A = [.4 1−c; .6 c].
(b) Show that A has just one line of eigenvectors when c = 1.6.
(c) This is a Markov matrix when c = .8. Then A^n will approach what matrix A^∞?
Definition (Markov matrix): A Markov matrix is a matrix with positive entries, for which every column adds to one.
• Note that some researchers define the Markov matrix by replacing positive with non-negative; they then use the term positive Markov matrix for the strictly positive case. The observations below hold for Markov matrices with “non-negative entries.”
6.1 Some discussions on problems
6-29
• 1 must be one of the eigenvalues of a Markov matrix.
  Proof: A and Aᵀ have the same eigenvalues, and Aᵀ (1, 1, . . . , 1)ᵀ = (1, 1, . . . , 1)ᵀ. □
• The eigenvalues of a Markov matrix satisfy |λ| ≤ 1.
  Proof: Aᵀv = λv implies Σ_{i=1}^n a_{i,j} vi = λvj; hence, by letting vk satisfy |vk| = max_{1≤i≤n} |vi|, we have
    |λvk| = |Σ_{i=1}^n a_{i,k} vi| ≤ Σ_{i=1}^n a_{i,k} |vi| ≤ Σ_{i=1}^n a_{i,k} |vk| = |vk|,
  which implies the desired result. □
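As a quick numerical illustration of these two facts (the 2 × 2 Markov matrix below is a made-up example, not one from the slides):

  A = [0.9 0.3; 0.1 0.7];   % positive entries, each column sums to one
  eig(A)                    % expect 1 and 0.6: one eigenvalue is 1, all satisfy |lambda| <= 1
  A^50                      % every column approaches the eigenvector of lambda = 1
                            % (rescaled to sum to one), here [0.75; 0.25]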
6.1 Some discussions on problems
6-30
(Problem 31, Section 6.1) If we exchange rows 1 and 2 and columns 1 and 2, the eigenvalues don't change. Find eigenvectors of A and B for λ1 = 11. Rank one gives λ2 = λ3 = 0.
  A = [1 2 1; 3 6 3; 4 8 4]   and   B = PAPᵀ = [6 3 3; 2 1 1; 8 4 4].
Thinking over Problem 31: This is sometimes useful in determining the eigenvalues and eigenvectors.
• If Av = λv and B = PAP^{-1} (for any invertible P), then λ and u = Pv are an eigenvalue and eigenvector of B.
  Proof: Bu = PAP^{-1}u = PAv = P(λv) = λPv = λu. □
For example, P = [0 1 0; 1 0 0; 0 0 1]. In such a case, P^{-1} = P = Pᵀ.
6.1 Some discussions on problems
6-31
With the fact above, we can further claim that
Proposition. The eigenvalues of AB and BA are equal if one of A and B is
invertible.
Proof: This can be proved by AB = A(BA)A^{-1} (or AB = B^{-1}(BA)B). Note that AB has eigenvector Av (or B^{-1}v) if v is an eigenvector of BA. □
6.1 Some discussions on problems

6-32

The eigenvectors of a diagonal matrix Λ = diag(λ1, λ2, . . . , λn) are apparently
  ei = (0, . . . , 0, 1, 0, . . . , 0)ᵀ (with the 1 in position i),  i = 1, 2, . . . , n.
Hence, A = SΛS^{-1} has the same eigenvalues as Λ and has eigenvectors vi = S ei. This implies that
  S = [v1 v2 · · · vn],
where {vi} are the eigenvectors of A.
What if S is not invertible? Then, we have the next theorem.
6.2 Diagonalizing a matrix
6-33
A convenience for the eigen-system analysis is that we can diagonalize a matrix.
Theorem. A matrix A can be written as
  A [v1 v2 · · · vn] = [v1 v2 · · · vn] diag(λ1, λ2, . . . , λn),   i.e.,   A S = S Λ,
where {(λi, vi)}_{i=1}^n are the eigenvalue-eigenvector pairs of A.
Proof: The theorem holds by definition of eigenvalues and eigenvectors. □
Corollary. If S is invertible, then
  S^{-1} A S = Λ,   or equivalently   A = S Λ S^{-1}. □
6.2 Diagonalizing a matrix
This makes it easy to compute polynomials in A.
Proposition (recall Slide 6-9): The matrix
  αm A^m + αm−1 A^(m−1) + · · · + α0 I
has the same eigenvectors as A, but its eigenvalues become
  αm λ^m + αm−1 λ^(m−1) + · · · + α0,
where λ is an eigenvalue of A.
Proposition:
  αm A^m + αm−1 A^(m−1) + · · · + α0 I = S (αm Λ^m + αm−1 Λ^(m−1) + · · · + α0 I) S^{-1}
Proposition:
  A^m = S Λ^m S^{-1}
6-34
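A small MATLAB check of the last proposition (the matrix below is a made-up diagonalizable example):

  A = [4 1; 2 3];               % hypothetical matrix with distinct eigenvalues 5 and 2
  [S, Lambda] = eig(A);
  m = 5;
  norm(A^m - S*Lambda^m/S)      % should be (numerically) zero: A^m = S*Lambda^m*inv(S)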
6.2 Diagonalizing a matrix
6-35
Exception
• It is possible that an n × n matrix does not have n (linearly independent) eigenvectors.
  Example. A = [1 −1; 1 −1].
  Solution.
  – λ1 = 0 is apparently an eigenvalue of a non-invertible matrix.
  – The second eigenvalue is λ2 = trace(A) − λ1 = [1 + (−1)] − λ1 = 0.
  – This matrix however has only one eigenvector (1, 1)ᵀ (or some researchers may say "two repeated eigenvectors").
  – In such a case, we still have
      A [1 1; 1 1] = [1 1; 1 1] [0 0; 0 0]   (A S = S Λ)
    but S has no inverse. □
In such a case, we cannot perform S^{-1}AS; so we say A cannot be diagonalized.
6.2 Diagonalizing a matrix
6-36
When will S be guaranteed to be invertible?
One easy answer: When all eigenvalues are distinct (and there are n eigenvalues).
Proof: Suppose for a matrix A, the first k eigenvectors v 1, . . . , v k are linearly independent, but the (k +1)th eigenvector is dependent on the previous k eigenvectors.
Then, for some unique a1, . . . , ak ,
v k+1 = a1v 1 + · · · + ak v k ,
which implies
λk+1v k+1 = Av k+1 = a1Av 1 + . . . + ak Av k = a1λ1v 1 + . . . + ak λk v k
λk+1v k+1 = a1λk+1v 1 + . . . + ak λk+1v k
Accordingly, λk+1 = λi for 1 ≤ i ≤ k; i.e., that v k+1 is linearly dependent on
v 1, . . . , v k only occurs when they have the same eigenvalues. (The proof is not
yet completed here! See the discussion and Example below.)
The above proof said that as long as we have the (k + 1)th eigenvector, then it
is linearly dependent on the previous k eigenvectors only when all eigenvalues are
equal! But sometimes, we do not guarantee to have the (k + 1)th eigenvector.
6.2 Diagonalizing a matrix

Example. A = [5 4 2 1; 0 1 −1 −1; −1 −1 3 0; 1 1 −1 2].
Solution.
• We have four eigenvalues 1, 2, 4, 4.
• But we can only have three eigenvectors for 1, 2 and 4.
• From the proof on the previous page, the eigenvectors for 1, 2 and 4 are linearly independent because 1, 2 and 4 are distinct. □
Proof (continued): One condition that guarantees n eigenvectors is having n distinct eigenvalues, which now completes the proof. □
Definition (GM and AM): The number of linearly independent eigenvectors
for an eigenvalue λ is called its geometric multiplicity; the number of its
appearance in solving det(A − λI) is called its algebraic multiplicity.
Example. In the above example, GM=1 and AM=2 for λ = 4.
6.2 Fibonacci series
6-38
Eigen-decomposition is useful in solving the Fibonacci series.
Problem: Suppose F_{k+2} = F_{k+1} + F_k with initially F1 = 1 and F0 = 0. Find F100.
Answer:
• [F_{k+2}; F_{k+1}] = [1 1; 1 0] [F_{k+1}; F_k]  ⇒  [F_{k+2}; F_{k+1}] = [1 1; 1 0]^(k+1) [F1; F0]
• So,
  [F100; F99] = [1 1; 1 0]^99 [1; 0] = S Λ^99 S^{-1} [1; 0]
              = [(1+√5)/2 (1−√5)/2; 1 1] [((1+√5)/2)^99 0; 0 ((1−√5)/2)^99] [(1+√5)/2 (1−√5)/2; 1 1]^{-1} [1; 0]
6.2 Fibonacci series
6-39
              = (1/√5) [(1+√5)/2 (1−√5)/2; 1 1] [((1+√5)/2)^99 0; 0 ((1−√5)/2)^99] [1; −1]
              = (1/√5) [((1+√5)/2)^100 − ((1−√5)/2)^100; ((1+√5)/2)^99 − ((1−√5)/2)^99]. □
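A small MATLAB check of this computation; the closed form below is the Binet-type formula that the derivation above produces:

  A = [1 1; 1 0];
  [S, Lambda] = eig(A);
  u = S*Lambda^99/S*[1; 0];              % [F100; F99] = S*Lambda^99*inv(S)*[F1; F0]
  phi = (1+sqrt(5))/2; psi = (1-sqrt(5))/2;
  F100 = (phi^100 - psi^100)/sqrt(5);    % closed form from the derivation above
  disp([u(1) F100])                      % both are about 3.54e20 (up to rounding)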
6.2 Generalization to uk = Ak u0
6-40
• A generalization of the solution to the Fibonacci series is as follows.
  Suppose u0 = a1v1 + a2v2 + · · · + anvn.
  Then uk = A^k u0 = a1 A^k v1 + a2 A^k v2 + · · · + an A^k vn
          = a1 λ1^k v1 + a2 λ2^k v2 + · · · + an λn^k vn.
Examination:
  uk = [F_{k+1}; F_k] = [1 1; 1 0]^k [F1; F0] = A^k u0
and
  u0 = [1; 0] = (1 / ((1+√5)/2 − (1−√5)/2)) ( [(1+√5)/2; 1] − [(1−√5)/2; 1] ),
so a1 = 1/√5 and a2 = −1/√5. Hence
  u99 = [F100; F99] = a1 ((1+√5)/2)^99 [(1+√5)/2; 1] + a2 ((1−√5)/2)^99 [(1−√5)/2; 1].
6.2 More applications of eigen-decomposition
6-41
(Problem 30, Section 6.2) Suppose the same S diagonalizes both A and B. They
have the same eigenvectors in A = SΛ1 S −1 and B = SΛ2 S −1 . Prove that AB =
BA.
Proposition: Suppose both A and B can be diagonalized, and one of A and B has distinct eigenvalues. Then, A and B have the same eigenvectors if, and only if, AB = BA.
Proof:
• (See Problem 30 above) If A and B have the same eigenvectors, then
AB = (SΛAS −1 )(SΛB S −1 ) = SΛA ΛB S −1 = SΛB ΛA S −1 = (SΛB S −1 )(SΛAS −1 ) = BA.
• Suppose without loss of generality that the eigenvalues of A are all distinct.
Then, if AB = BA, we have for given Av = λ v and u = Bv ,
Au = A(Bv ) = (AB)v = (BA)v = B(Av ) = B(λv ) = λ(Bv ) = λu.
Hence, u = Bv and v are both the eigenvectors of A corresponding to the
same eigenvalue λ.
6.2 More applications of eigen-decomposition
6-42
"n
Let u = i=1 aiv i, where {v i}ni=1 are linearly independent eigenvectors of A.
"n
"n
Au = i=1 aiAv i = i=1 aiλi v i
=⇒ ai(λi − λ) = 0 for 1 ≤ i ≤ n.
"
Au = λu
= ni=1 aiλv i
This implies
u =
%
aiv i = av = Bv .
i:λi =λ
Thus, v is an eigenvector of B.
2
6.2 Problem discussions
6-43
(Problem 32, Section 6.2) Substitute A = SΛS −1 into the product (A − λ1 I)(A −
λ2I) · · · (A − λnI) and explain why this produces the zero matrix. We are substituting the matrix A for the number λ in the polynomial p(λ) = det(A − λI).
The Cayley-Hamilton Theorem says that this product is always p(A) =zero
matrix, even if A is not diagonalizable.
Thinking over Problem 32: The Cayley-Hamilton Theorem can be easily proved
if A is diagonalizable.
Corollary (Cayley-Hamilton): Suppose A has linearly independent eigenvectors.
(λ1I − A)(λ2I − A) · · · (λnI − A) = all-zero matrix
Proof:
  (λ1I − A)(λ2I − A) · · · (λnI − A)
  = S(λ1I − Λ)S^{-1} S(λ2I − Λ)S^{-1} · · · S(λnI − Λ)S^{-1}
  = S (λ1I − Λ)(λ2I − Λ) · · · (λnI − Λ) S^{-1}
  = S (all-zero matrix) S^{-1} = all-zero matrix,
since the (1,1)th entry of (λ1I − Λ) is 0, the (2,2)th entry of (λ2I − Λ) is 0, . . . , and the (n,n)th entry of (λnI − Λ) is 0. □
6.3 Applications to differential equations
6-44
We can use eigen-decomposition to solve
  du/dt = Au,   i.e.,   d/dt [u1(t); u2(t); . . . ; un(t)] = [a1,1u1(t) + a1,2u2(t) + · · · + a1,nun(t); a2,1u1(t) + a2,2u2(t) + · · · + a2,nun(t); . . . ; an,1u1(t) + an,2u2(t) + · · · + an,nun(t)],
where A is called the companion matrix.
By differential equation techniques, we know that the solution is of the form
  u = e^{λt} v
for some constant λ and constant vector v.
Question: What are all λ and v that satisfy du/dt = Au?
Solution: Substitute u = e^{λt}v into the equation: du/dt = λe^{λt}v = Au = Ae^{λt}v ⇒ λv = Av.
The answers are all eigenvalues and eigenvectors.
6.3 Applications to differential equations
6-45
Since du/dt = Au is a linear system, a linear combination of solutions is still a solution.
Hence, if there are n eigenvectors, the complete solution can be represented as
  c1 e^{λ1 t} v1 + c2 e^{λ2 t} v2 + · · · + cn e^{λn t} vn,
where c1, c2, . . . , cn are determined by the initial conditions.
Here, for convenience of discussion at this moment, we assume that A gives us exactly n eigenvectors.
6.3 Applications to differential equations
It is sometimes convenient to re-express the solution of du/dt = Au for a given initial condition u(0) as e^{At} u(0), i.e.,
  c1 e^{λ1 t} v1 + c2 e^{λ2 t} v2 + · · · + cn e^{λn t} vn
  = [v1 v2 · · · vn] diag(e^{λ1 t}, e^{λ2 t}, . . . , e^{λn t}) [c1; c2; . . . ; cn]
  = S e^{Λt} c = S e^{Λt} S^{-1} u(0) = e^{At} u(0),
where we define
  e^{At} ≜ Σ_{k=0}^∞ (1/k!) (At)^k,
which holds no matter whether S^{-1} exists or not, and which equals
  S e^{Λt} S^{-1} = S diag(e^{λ1 t}, e^{λ2 t}, . . . , e^{λn t}) S^{-1}   if S^{-1} exists
6.3 Applications to differential equations
and hence u(0) = Sc. Note that
  Σ_{k=0}^∞ (1/k!) (At)^k = Σ_{k=0}^∞ (t^k/k!) A^k = Σ_{k=0}^∞ (t^k/k!) (S Λ^k S^{-1}) = S (Σ_{k=0}^∞ (t^k/k!) Λ^k) S^{-1} = S e^{Λt} S^{-1}.
Key to remember: Again, we define by convention that
  f(A) = S diag(f(λ1), f(λ2), . . . , f(λn)) S^{-1}.
So,
  e^{At} = S diag(e^{λ1 t}, e^{λ2 t}, . . . , e^{λn t}) S^{-1}.
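MATLAB's expm computes this matrix exponential directly; a minimal sketch (the matrix and time value below are made-up choices for illustration):

  A = [0 1; -2 -3];          % hypothetical matrix with eigenvalues -1 and -2
  u0 = [1; 0];
  t = 0.5;
  u = expm(A*t)*u0;          % u(t) = e^(At) u(0)
  [S, Lambda] = eig(A);      % cross-check against S e^(Lambda t) S^(-1) u(0)
  norm(u - S*diag(exp(diag(Lambda)*t))/S*u0)   % should be (numerically) zero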
6.3 Applications to differential equations
6-48
We can solve second-order equations in the same manner.
Example. Solve my'' + by' + ky = 0.
Answer:
• Let z = y'. Then, the problem is reduced to
    y' = z
    z' = −(k/m)y − (b/m)z
  ⇒ du/dt = Au   with   u = [y; z]   and   A = [0 1; −k/m −b/m].
• The complete solution for u is therefore
    c1 e^{λ1 t} v1 + c2 e^{λ2 t} v2. □
6.3 Applications to differential equations
6-49
Example. Solve y'' + y = 0 with initial y(0) = 1 and y'(0) = 0.
Solution.
• Let z = y'. Then, the problem is reduced to
    y' = z
    z' = −y
  ⇒ du/dt = Au   with   u = [y; z]   and   A = [0 1; −1 0].
• The eigenvalues and eigenvectors of A are
    (i, [−i; 1])   and   (−i, [i; 1]).
• The complete solution for u is therefore
    [y; y'] = c1 e^{it} [−i; 1] + c2 e^{−it} [i; 1],   with initially   [1; 0] = c1 [−i; 1] + c2 [i; 1].
  So, c1 = i/2 and c2 = −i/2. Thus,
    y(t) = (i/2) e^{it} (−i) − (i/2) e^{−it} (i) = (1/2) e^{it} + (1/2) e^{−it} = cos(t). □
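A short MATLAB check of this example: for A = [0 1; −1 0], one can verify numerically that e^{At} u(0) reproduces y(t) = cos(t) (the sample times below are arbitrary):

  A = [0 1; -1 0];
  u0 = [1; 0];                       % y(0) = 1, y'(0) = 0
  for t = [0 0.5 1 2 pi]
      u = expm(A*t)*u0;
      fprintf('%5.2f  %8.4f  %8.4f\n', t, u(1), cos(t));   % u(1) matches cos(t)
  end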
6.3 Applications to differential equations
6-50
In practice, we use a discrete approximation to approximate a continuous function. There are however three different discrete approximations.
For example, how do we approximate y''(t) = −y(t) by a discrete system? Note that
  (Y_{n+1} − 2Y_n + Y_{n−1}) / (∆t)^2 = ( (Y_{n+1} − Y_n)/∆t − (Y_n − Y_{n−1})/∆t ) / ∆t,
and we may set it equal to
  −Y_{n−1}   (forward approximation),
  −Y_n       (centered approximation), or
  −Y_{n+1}   (backward approximation).
Let's take the forward approximation as an example, i.e.,
  ( (Y_{n+1} − Y_n)/∆t − (Y_n − Y_{n−1})/∆t ) / ∆t = (Z_n − Z_{n−1})/∆t = −Y_{n−1}.
Thus,
  y'(t) = z(t)
  z'(t) = −y(t)
is approximated by
  (Y_{n+1} − Y_n)/∆t = Z_n
  (Z_{n+1} − Z_n)/∆t = −Y_n
  ≡  [Y_{n+1}; Z_{n+1}] = [1 ∆t; −∆t 1] [Y_n; Z_n]   (u_{n+1} = A u_n).
Then, we obtain
  u_n = A^n u_0 = [1 1; i −i] [(1 + i∆t)^n 0; 0 (1 − i∆t)^n] [1/2 −i/2; 1/2 i/2] [1; 0]
and
  Y_n = ( (1 + i∆t)^n + (1 − i∆t)^n ) / 2 = (1 + (∆t)^2)^(n/2) cos(n · ∆θ),
where ∆θ = tan^{-1}(∆t).
Problem: |Y_n| → ∞ as n grows large.
6-51
6.3 Applications to differential equations
6-52
Backward approximation will lead to Y_n → 0 as n grows large:
  y'(t) = z(t)
  z'(t) = −y(t)
is approximated by
  (Y_{n+1} − Y_n)/∆t = Z_{n+1}
  (Z_{n+1} − Z_n)/∆t = −Y_{n+1}
  ≡  [1 −∆t; ∆t 1] [Y_{n+1}; Z_{n+1}] = [Y_n; Z_n]   (Ã^{-1} u_{n+1} = u_n).
A solution to the problem: Interleave the forward approximation with the backward approximation.
  y'(t) = z(t)
  z'(t) = −y(t)
is approximated by
  (Y_{n+1} − Y_n)/∆t = Z_n
  (Z_{n+1} − Z_n)/∆t = −Y_{n+1}
  ≡  [1 0; ∆t 1] [Y_{n+1}; Z_{n+1}] = [1 ∆t; 0 1] [Y_n; Z_n].
We then perform this so-called leapfrog method. (See Problems 28 and 29.)
  [Y_{n+1}; Z_{n+1}] = [1 0; ∆t 1]^{-1} [1 ∆t; 0 1] [Y_n; Z_n] = [1 ∆t; −∆t 1−(∆t)^2] [Y_n; Z_n],
where the eigenvalues have absolute value 1 if ∆t ≤ 2.
6.3 Applications to differential equations
6-53
(Problem 28, Section 6.3) Centering y'' = −y in Example 3 will produce Y_{n+1} − 2Y_n + Y_{n−1} = −(∆t)^2 Y_n. This can be written as a one-step difference equation for U = (Y, Z):
  Y_{n+1} = Y_n + ∆t Z_n
  Z_{n+1} = Z_n − ∆t Y_{n+1}
  i.e.,  [1 0; ∆t 1] [Y_{n+1}; Z_{n+1}] = [1 ∆t; 0 1] [Y_n; Z_n].
Invert the matrix on the left side to write this as U_{n+1} = A U_n. Show that det A = 1. Choose the large time step ∆t = 1 and find the eigenvalues λ1 and λ2 = λ̄1 of A:
  A = [1 1; −1 0] has |λ1| = |λ2| = 1. Show that A^6 is exactly I.
After 6 steps to t = 6, U6 equals U0. The exact y = cos(t) returns to 1 at t = 2π.
6.3 Applications to differential equations
6-54
(Problem 29, Section 6.3) That centered choice (leapfrog method) in Problem 28 is very successful for small time steps ∆t. But find the eigenvalues of A for ∆t = √2 and 2:
  A = [1 √2; −√2 −1]   and   A = [1 2; −2 −3].
Both matrices have |λ| = 1. Compute A^4 in both cases and find the eigenvectors of A. That value ∆t = 2 is at the border of instability. Time steps ∆t > 2 will lead to |λ| > 1, and the powers in U_n = A^n U_0 will explode.
Note: You might say that nobody would compute with ∆t > 2. But if an atom vibrates with y'' = −1000000y, then ∆t > .0002 will give instability. Leapfrog has a very strict stability limit. Y_{n+1} = Y_n + 3Z_n and Z_{n+1} = Z_n − 3Y_{n+1} will explode because ∆t = 3 is too large.
6.3 Applications to differential equations
6-55
A better solution to the problem: Mix the forward approximation with the backward approximation.
  y'(t) = z(t)
  z'(t) = −y(t)
is approximated by
  (Y_{n+1} − Y_n)/∆t = (Z_{n+1} + Z_n)/2
  (Z_{n+1} − Z_n)/∆t = −(Y_{n+1} + Y_n)/2
  ≡  [1 −∆t/2; ∆t/2 1] [Y_{n+1}; Z_{n+1}] = [1 ∆t/2; −∆t/2 1] [Y_n; Z_n].
We then perform this so-called trapezoidal method. (See Problem 30.)
  [Y_{n+1}; Z_{n+1}] = [1 −∆t/2; ∆t/2 1]^{-1} [1 ∆t/2; −∆t/2 1] [Y_n; Z_n]
                     = (1 / (1 + (∆t/2)^2)) [1−(∆t/2)^2 ∆t; −∆t 1−(∆t/2)^2] [Y_n; Z_n],
where the eigenvalues have absolute value 1 for all ∆t > 0.
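A minimal MATLAB comparison of the three one-step matrices above (the step size is an arbitrary illustrative value):

  dt = 0.1;
  Af = [1 dt; -dt 1];                           % forward approximation
  Ab = inv([1 -dt; dt 1]);                      % backward approximation
  At = [1 -dt/2; dt/2 1] \ [1 dt/2; -dt/2 1];   % trapezoidal method
  [abs(eig(Af)) abs(eig(Ab)) abs(eig(At))]      % roughly 1.005, 0.995, and exactly 1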
6.3 Applications to differential equations
6-56
(Problem 30, Section 6.3) Another good idea for y'' = −y is the trapezoidal method (half forward/half back): This may be the best way to keep (Y_n, Z_n) exactly on a circle.
  Trapezoidal:  [1 −∆t/2; ∆t/2 1] [Y_{n+1}; Z_{n+1}] = [1 ∆t/2; −∆t/2 1] [Y_n; Z_n].
(a) Invert the left matrix to write this equation as U_{n+1} = A U_n. Show that A is an orthogonal matrix: AᵀA = I. These points U_n never leave the circle.
    A = (I − B)^{-1}(I + B) is always an orthogonal matrix if Bᵀ = −B (see the proof on the next page).
(b) (Optional MATLAB) Take 32 steps from U0 = (1, 0) to U32 with ∆t = 2π/32. Is U32 = U0? I think there is a small error.
6.3 Applications to differential equations
Proof:
  A = (I − B)^{-1}(I + B)
  ⇒ (I − B)A = (I + B)
  ⇒ Aᵀ(I − B)ᵀ = (I + B)ᵀ
  ⇒ Aᵀ(I − Bᵀ) = (I + Bᵀ)
  ⇒ Aᵀ(I + B) = (I − B)                    (using Bᵀ = −B)
  ⇒ Aᵀ = (I − B)(I + B)^{-1}
  ⇒ AᵀA = (I − B)(I + B)^{-1}(I − B)^{-1}(I + B)
        = (I − B)[(I − B)(I + B)]^{-1}(I + B)
        = (I − B)[(I + B)(I − B)]^{-1}(I + B)
        = (I − B)(I − B)^{-1}(I + B)^{-1}(I + B)
        = I □
6-57
6.3 Applications to differential equations
6-58
Question: What if the number of eigenvectors is smaller than n?
Recall that it is convenient to say that the solution of du/dt = Au is e^{At} u(0) for some constant vector u(0).
Conveniently, we can presume that e^{At}u(0) is the solution of du/dt = Au even if A does not have n eigenvectors.
Example. Solve y'' − 2y' + y = 0 with initially y(0) = 1 and y'(0) = 0.
Answer.
• Let z = y'. Then, the problem is reduced to
    y' = z
    z' = −y + 2z
  ⇒ du/dt = [0 1; −1 2] [y; z] = Au.
• The solution is still e^{At}u(0), but
    e^{At} ≠ S e^{Λt} S^{-1}
  because there is only one eigenvector for A (it doesn't matter whether we regard this case as “S does not exist” or as “S exists but has no inverse”).
6.3 Applications to differential equations
  A [1 1; 1 1] = [1 1; 1 1] [1 0; 0 1]   (A S = S Λ, with S singular)
and
  e^{At} = e^{λIt} e^{(A−λI)t} = e^{It} e^{(A−I)t}.
Why does e^{At} = e^{It+(A−I)t} = e^{It} e^{(A−I)t} hold? Because (It)((A − I)t) = ((A − I)t)(It); see below. Hence
  e^{At} = (Σ_{k=0}^∞ (1/k!) I^k t^k) (Σ_{k=0}^∞ (1/k!) (A − I)^k t^k),
where I^k = I for every k ≥ 0 and (A − I)^k = the all-zero matrix for k ≥ 2. (Can we instead use e^{At} = Σ_{k=0}^∞ (1/k!) A^k t^k directly?) This gives
  e^{At} = e^t I (I + (A − I)t) = e^t (I + (A − I)t)
and
  u(t) = e^{At} u(0) = e^{At} [1; 0] = e^t (I + (A − I)t) [1; 0] = e^t [1 − t; −t].
Note that e^A e^B, e^B e^A and e^{A+B} may not be equal unless AB = BA! □
6.3 Applications to differential equations
6-60
Properties of e^{At}
• The eigenvalues of e^{At} are e^{λt} for all eigenvalues λ of A.
  For example, e^{At} = e^t(I + (A − I)t) has repeated eigenvalues e^t, e^t.
• The eigenvectors of e^{At} remain the same as those of A.
  For example, e^{At} = e^t(I + (A − I)t) has only one eigenvector (1, 1)ᵀ.
• The inverse of e^{At} always exists (hint: e^{λt} ≠ 0), and is equal to e^{−At}.
  For example, e^{−At} = e^{−It} e^{(I−A)t} = e^{−t}(I − (A − I)t).
• The transpose of e^{At} is
  (e^{At})ᵀ = (Σ_{k=0}^∞ (1/k!) A^k t^k)ᵀ = Σ_{k=0}^∞ (1/k!) (Aᵀ)^k t^k = e^{Aᵀt}.
  Hence, if Aᵀ = −A (skew-symmetric), then (e^{At})ᵀ e^{At} = e^{Aᵀt} e^{At} = I; so, e^{At} is an orthogonal matrix (recall that a matrix Q is orthogonal if QᵀQ = I).
6.3 Quick tip
6-61
• For a 2 × 2 matrix A = [a1,1 a1,2; a2,1 a2,2], the eigenvector corresponding to the eigenvalue λ1 is [a1,2; λ1 − a1,1], since
    (A − λ1I) [a1,2; λ1 − a1,1] = [a1,1−λ1 a1,2; a2,1 a2,2−λ1] [a1,2; λ1 − a1,1] = [0; ?],
  where ? = 0 because the two row vectors of (A − λ1I) are parallel.
  This is especially useful when solving the differential equation, because then a1,1 = 0 and a1,2 = 1; hence, the eigenvector v1 corresponding to the eigenvalue λ1 is
    v1 = [1; λ1].
Solve y'' − 2y' + y = 0. Let z = y'. Then, the problem is reduced to
  y' = z
  z' = −y + 2z
  ⇒ du/dt = [0 1; −1 2] [y; z] = Au  ⇒  v1 = [1; λ1] = [1; 1].
6.3 Problem discussions
6-62
Problem 6.3B and Problem 31: A convenient way to solve d²u/dt² = −A²u.
(Problem 31, Section 6.3) The cosine of a matrix is defined like e^A, by copying the series for cos t:
  cos t = 1 − (1/2!)t^2 + (1/4!)t^4 − · · ·      cos A = I − (1/2!)A^2 + (1/4!)A^4 − · · ·
(a) If Ax = λx, multiply each term times x to find the eigenvalue of cos A.
(b) Find the eigenvalues of A = [π π; π π] with eigenvectors (1, 1) and (1, −1). From the eigenvalues and eigenvectors of cos A, find that matrix C = cos A.
(c) The second derivative of cos(At) is −A² cos(At).
    u(t) = cos(At)u(0) solves d²u/dt² = −A²u starting from u'(0) = 0.
Construct u(t) = cos(At)u(0) by the usual three steps for that specific A:
1. Expand u(0) = (4, 2) = c1x1 + c2x2 in the eigenvectors.
2. Multiply those eigenvectors by ______ and ______ (instead of e^{λt}).
3. Add up the solution u(t) = c1 ______ x1 + c2 ______ x2. (Hint: See Slide 6-48.)
6.3 Problem discussions
6-63
(Problem 6.3B, Section 6.3) Find the eigenvalues and eigenvectors of A and write u(0) = (0, 2√2, 0) as a combination of the eigenvectors. Solve both equations u' = Au and u'' = Au:
  du/dt = [−2 1 0; 1 −2 1; 0 1 −2] u   and   d²u/dt² = [−2 1 0; 1 −2 1; 0 1 −2] u   with du/dt(0) = 0.
The 1, −2, 1 diagonals make A into a second difference matrix (like a second derivative).
u' = Au is like the heat equation ∂u/∂t = ∂²u/∂x². Its solution u(t) will decay (negative eigenvalues).
u'' = Au is like the wave equation ∂²u/∂t² = ∂²u/∂x². Its solution will oscillate (imaginary eigenvalues).
6.3 Problem discussions
6-64
Thinking over Problem 6.3B and Problem 31: How about solving d²u/dt² = Au?
Solution. The answer should be e^{A^{1/2} t} u(0). Note that A^{1/2} has the same eigenvectors as A, but its eigenvalues are the square roots of those of A:
  e^{A^{1/2} t} = S diag(e^{λ1^{1/2} t}, e^{λ2^{1/2} t}, . . . , e^{λn^{1/2} t}) S^{-1}.
As an example from Problem 6.3B, A = [−2 1 0; 1 −2 1; 0 1 −2].
The eigenvalues of A are −2 and −2 ± √2. So,
  e^{A^{1/2} t} = S diag(e^{i√2 t}, e^{i√(2−√2) t}, e^{i√(2+√2) t}) S^{-1}.
6.3 Problem discussions
6-65
Final reminder on the approach delivered in the textbook
• In solving du(t)/dt = Au with initial u(0), it is convenient to find the solution as
    u(0) = Sc = c1v1 + · · · + cnvn
    u(t) = S e^{Λt} c = c1 e^{λ1 t} v1 + · · · + cn e^{λn t} vn.
This is what the textbook always does in the Problems.
6.3 Problem discussions
6-66
You should practice these problems by yourself!
Problem 15: How to solve du/dt = Au − b (for invertible A)?
(Problem 15, Section 6.3) A particular solution to du/dt = Au − b is u_p = A^{-1}b, if A is invertible. The usual solutions to du/dt = Au give u_n. Find the complete solution u = u_p + u_n:
  (a) du/dt = u − 4      (b) du/dt = [1 0; 1 1] u − [4; 6].
Problem 16: How to solve du/dt = Au − e^{ct}b (for c not an eigenvalue of A)?
(Problem 16, Section 6.3) If c is not an eigenvalue of A, substitute u = e^{ct}v and find a particular solution to du/dt = Au − e^{ct}b. How does it break down when c is an eigenvalue of A? The “nullspace” of du/dt = Au contains the usual solutions e^{λi t} xi.
Hint: The particular solution v_p satisfies (A − cI)v_p = b.
6.3 Problem discussions
6-67
Problem 23: e^A e^B, e^B e^A and e^{A+B} are not necessarily equal. (They are equal when AB = BA.)
(Problem 23, Section 6.3) Generally e^A e^B is different from e^B e^A. They are both different from e^{A+B}. Check this using Problems 21-22 and 19. (If AB = BA, all these are the same.)
  A = [1 4; 0 0]      B = [0 −4; 0 0]      A + B = [1 0; 0 0]
Problem 27: An interesting brain-storming problem!
(Problem 27, Section 6.3) Find a solution x(t), y(t) that gets large as t → ∞. To avoid this instability a scientist exchanges the two equations:
  dx/dt = 0x − 4y                     dy/dt = −2x + 2y
  dy/dt = −2x + 2y      becomes       dx/dt = 0x − 4y
Now the matrix [−2 2; 0 −4] is stable. It has negative eigenvalues. How can this be?
Hint: The matrix is not the right one to use to describe the transformed linear equations.
6.4 Symmetric matrices
6-68
Definition (Symmetric matrix): A matrix A is symmetric if A = AT.
• When A is symmetric,
– its eigenvalues are all reals;
– it has n orthogonal eigenvectors (so it can be diagonalized). We can then
normalize these orthogonal eigenvectors to obtain an orthonormal basis.
Definition (Symmetric diagonalization): A symmetric matrix A can be
written as
A = QΛQ−1 = QΛQT with Q−1 = QT
where Λ is a diagonal matrix with real eigenvalues at the diagonal.
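A quick MATLAB illustration of this factorization (the symmetric matrix below is a made-up example):

  A = [2 1; 1 2];              % hypothetical symmetric matrix
  [Q, Lambda] = eig(A);        % for symmetric A, eig returns an orthonormal Q
  norm(Q'*Q - eye(2))          % should be (numerically) zero: Q'Q = I
  norm(A - Q*Lambda*Q')        % should be (numerically) zero: A = Q*Lambda*Q'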
6.4 Symmetric matrices
6-69
Theorem (Spectral theorem or principal axis theorem) A symmetric
matrix A with distinct eigenvalues can be written as
A = QΛQ−1 = QΛQT with Q−1 = QT
where Λ is a diagonal matrix with real eigenvalues at the diagonal.
Proof:
• We have proved on Slide 6-16 that a symmetric matrix has real eigenvalues
and a skew-symmetric matrix has pure imaginary eigenvalues.
• Next, we prove that eigenvectors are orthogonal, i.e., v1ᵀv2 = 0, for unequal eigenvalues λ1 ≠ λ2:
    Av1 = λ1v1 and Av2 = λ2v2
    ⇒ (λ1v1)ᵀv2 = (Av1)ᵀv2 = v1ᵀAᵀv2 = v1ᵀAv2 = v1ᵀλ2v2,
  which implies (λ1 − λ2)v1ᵀv2 = 0. Therefore, v1ᵀv2 = 0 because λ1 ≠ λ2. □
6.4 Symmetric matrices
6-70
• A = QΛQᵀ implies that
    A = λ1 q1q1ᵀ + · · · + λn qnqnᵀ = λ1 P1 + · · · + λn Pn.
  Recall from Slide 4-37 that the projection matrix Pi onto a unit vector qi is
    Pi = qi (qiᵀqi)^{-1} qiᵀ = qi qiᵀ.
  So, Ax is the sum of the projections of the vector x onto each eigenspace:
    Ax = λ1 P1 x + · · · + λn Pn x.
The spectral theorem can be extended to a symmetric matrix with repeated
eigenvalues.
To prove this, we need to introduce the famous Schur’s theorem.
6.4 Symmetric matrices
6-71
Theorem (Schur’s Theorem): Every square matrix A can be factorized into
A = QT Q−1 with T upper triangular and Q† = Q−1 ,
where “†” denotes Hermitian transpose operation (and Q and T can generally
have complex entries!)
Further, if A has real eigenvalues (and hence has real eigenvectors), then Q and T
can be chosen real.
Proof: The existence of Q and T such that A = QT Q† and Q† = Q−1 can be
proved by induction.
• The theorem trivially holds when A is a 1 × 1 matrix.
• Suppose that Schur’s Theorem is valid for all (n − 1) × (n − 1) matrices. Then,
we claim that Schur’s Theorem will hold true for all n × n matrices.
– This is because we can take t1,1 and q 1 to be the eigenvalue and unit
eigenvector of An×n (as there must exist at least one pair of eigenvalue and
eigenvector for An×n ). Then, choose any unit p2, . . ., pn such that they
together with q 1 span the n-dimensional space, and
P = q 1 p2 · · · pn
is a (unitary) orthonormal matrix, satisfying P † = P −1.
6.4 Symmetric matrices
– We derive
    P†AP = [q1†; p2†; . . . ; pn†] [Aq1 Ap2 · · · Apn]
         = [q1†; p2†; . . . ; pn†] [t1,1 q1 Ap2 · · · Apn]      (since Aq1 = t1,1 q1: eigenvalue t1,1, eigenvector q1)
         = [t1,1 t1,2 · · · t1,n; 0 Ã(n−1)×(n−1)],
  where t1,j = q1† A pj and
    Ã(n−1)×(n−1) = [p2†Ap2 · · · p2†Apn; . . . ; pn†Ap2 · · · pn†Apn].
  (The entries below t1,1 vanish because pj† q1 = 0.)
6.4 Symmetric matrices
6-73
– Since Schur's Theorem is true for any (n − 1) × (n − 1) matrix, we can find Q̃(n−1)×(n−1) and T̃(n−1)×(n−1) such that
    Ã(n−1)×(n−1) = Q̃(n−1)×(n−1) T̃(n−1)×(n−1) Q̃†(n−1)×(n−1)   and   Q̃†(n−1)×(n−1) = Q̃(n−1)×(n−1)^{-1}.
– Finally, define
    Qn×n = P [1 0ᵀ; 0 Q̃(n−1)×(n−1)]   and   Tn×n = [t1,1 [t1,2 · · · t1,n]Q̃(n−1)×(n−1); 0 T̃(n−1)×(n−1)].
  They satisfy
    Q†n×n Qn×n = [1 0ᵀ; 0 Q̃†] P†P [1 0ᵀ; 0 Q̃] = In×n
6.4 Symmetric matrices
and
    Qn×n Tn×n = Pn×n [1 0ᵀ; 0 Q̃] [t1,1 [t1,2 · · · t1,n]Q̃; 0 T̃]
              = Pn×n [t1,1 [t1,2 · · · t1,n]Q̃; 0 Q̃T̃]
              = Pn×n [t1,1 [t1,2 · · · t1,n]Q̃; 0 ÃQ̃]
              = Pn×n [t1,1 t1,2 · · · t1,n; 0 Ã] [1 0ᵀ; 0 Q̃]
              = Pn×n (P†n×n An×n Pn×n) [1 0ᵀ; 0 Q̃]
              = An×n Qn×n
6.4 Symmetric matrices
6-75
• It remains to prove that if A has real eigenvalues, then Q and T can be chosen
real, which can be similarly proved by induction. Suppose “If A has real
eigenvalues, then Q and T can be chosen real.” is true for all (n − 1) × (n − 1)
matrices. Then, the claim should be true for all n × n matrices.
– For a given An×n with real eigenvalues (and real eigenvectors), we can
certainly have the real t1,1 and q 1, and so are p2, . . . , pn. This makes real
the resultant Ã(n−1)×(n−1) and t1,2, . . . , t1,n.
The eigenvector associated with a real eigenvalue can be chosen real: for complex v, and real λ and A, the definition
  Av = λv
is equivalent to
  A · Re{v} = λ · Re{v}   and   A · Im{v} = λ · Im{v}.
6.4 Symmetric matrices
6-76
– The proof is completed by noting that the eigenvalues of Ã(n−1)×(n−1), satisfying
    P†AP = [t1,1 t1,2 · · · t1,n; 0 Ã(n−1)×(n−1)],
  are also eigenvalues of An×n; hence, they are all real. So, by the validity of the claimed statement for (n − 1) × (n − 1) matrices, the existence of real Q̃(n−1)×(n−1) and real T̃(n−1)×(n−1) satisfying
    Ã(n−1)×(n−1) = Q̃(n−1)×(n−1) T̃(n−1)×(n−1) Q̃†(n−1)×(n−1)   and   Q̃†(n−1)×(n−1) = Q̃(n−1)×(n−1)^{-1}
  is confirmed. □
6.4 Symmetric matrices
6-77
Two important facts:
• P†AP and A have the same eigenvalues but possibly different eigenvectors. A simple proof: for v′ = P†v (equivalently v = Pv′),
    (P†AP)v′ = P†Av = P†(λv) = λP†v = λv′.
• (Section 6.1, Problem 26; or see Slide 6-25)
    det [B C; 0 D] = det(B) · det(D).
Thus,
    P†AP = [t1,1 t1,2 · · · t1,n; 0 Ã(n−1)×(n−1)]
implies that the eigenvalues of Ã(n−1)×(n−1) should be eigenvalues of P†An×nP.
6.4 Symmetric matrices
6-78
Theorem (Spectral theorem or principal axis theorem) A symmetric
matrix A (not necessarily with distinct eigenvalues) can be written as
A = QΛQ−1 = QΛQT with Q−1 = QT
where Λ is a diagonal matrix with real eigenvalues at the diagonal.
Proof:
• A symmetric matrix certainly satisfies Schur’s Theorem.
A = QT Q† with T upper triangular and Q† = Q−1 .
• A has real eigenvalues. So, both T and Q are reals.
• By Aᵀ = A, we have
    Aᵀ = Q Tᵀ Qᵀ = Q T Qᵀ = A.
  This immediately gives Tᵀ = T, which implies that the off-diagonal entries are zero.
6.4 Symmetric matrices
• By AQ = QT, equivalently
    Aq1 = t1,1 q1,  Aq2 = t2,2 q2,  . . . ,  Aqn = tn,n qn,
  we know that Q is the matrix of eigenvectors (and there are n of them) and T is the matrix of eigenvalues. □
This result immediately indicates that a symmetric A can always be diagonalized.
Summary
• A symmetric matrix has real eigenvalues and n real orthogonal eigenvectors.
6.4 Problem discussions
6-80
(Problem 21, Section 6.4) (Recommended) This matrix M is skew-symmetric and also ______. Then all its eigenvalues are pure imaginary and they also have |λ| = 1. (‖Mx‖ = ‖x‖ for every x, so ‖λx‖ = ‖x‖ for eigenvectors.) Find all four eigenvalues from the trace of M:
  M = (1/√3) [0 1 1 1; −1 0 −1 1; −1 1 0 −1; −1 −1 1 0]   can only have eigenvalues i or −i.
Thinking over Problem 21: The eigenvalues of an orthogonal matrix satisfy |λ| = 1.
Proof: ‖Qv‖^2 = (Qv)†(Qv) = v†Q†Qv = v†v = ‖v‖^2 implies ‖λv‖ = ‖v‖. □
|λ| = 1 together with λ pure imaginary implies λ = ±i.
6.4 Problem discussions
6-81
(Problem 15, Section 6.4) Show that A (symmetric but complex) has only one line of eigenvectors:
  A = [i 1; 1 −i] is not even diagonalizable: eigenvalues λ = 0, 0.
Aᵀ = A is not such a special property for complex matrices. The good property is Āᵀ = A (Section 10.2). Then all λ's are real and eigenvectors are orthogonal.
Thinking over Problem 15: That a symmetric matrix A satisfying Aᵀ = A has real eigenvalues and n orthogonal eigenvectors is only true for real symmetric matrices.
For a complex matrix A, we need to rephrase it as “A Hermitian-symmetric matrix A satisfying A† = A has real eigenvalues and n orthogonal eigenvectors.”
Inner product and norm for complex vectors:
  v · w = w†v   and   ‖v‖^2 = v†v
6.4 Problem discussions
6-82
Proof of the red-colored claim (in the previous slide): Suppose Av = λv. Then,
    Av = λv
  ⇒ (Av)†v = (λv)†v                     (i.e., (Āv̄)ᵀv = (λ̄v̄)ᵀv)
  ⇒ v†A†v = λ*v†v
  ⇒ v†Av = λ*v†v                        (A† = A)
  ⇒ v†λv = λ*v†v                        (Av = λv)
  ⇒ λ‖v‖^2 = λ*‖v‖^2
  ⇒ λ = λ*. □
(Problem 28, Section 6.4) For complex matrices, the symmetry Aᵀ = A that produces real eigenvalues changes to Āᵀ = A. From det(A − λI) = 0, find the eigenvalues of the 2 by 2 “Hermitian” matrix A = [4 2+i; 2−i 0] = Āᵀ. To see why eigenvalues are real when Āᵀ = A, adjust equation (1) of the text to Āx̄ = λ̄x̄. (See the green box above.)
6.4 Problem discussions
6-83
(Problem 27, Section 6.4) (MATLAB) Take two symmetric matrices with different eigenvectors, say A = [1 0; 0 2] and B = [8 1; 1 0]. Graph the eigenvalues λ1(A + tB) and λ2(A + tB) for −8 < t < 8. Peter Lax says on page 113 of Linear Algebra that λ1 and λ2 appear to be on a collision course at certain values of t. “Yet at the last minute they turn aside.” How close do they come?
Correction for Problem 27: The problem should read “. . . Graph the eigenvalues λ1 and λ2 of A + tB for −8 < t < 8. . . .”
Hint: Draw the pictures of λ1(t) and λ2(t) with respect to t and check λ1(t) − λ2(t).
6.4 Problem discussions
6-84
(Problem 29, Section 6.4) Normal matrices have ĀTA = AĀT . For real matrices, ATA = AAT
includes symmetric, skew symmetric, and orthogonal matrices. Those have real λ, imaginary λ
and |λ| = 1. Other normal matrices can have any complex eigenvalues λ.
Key point: Normal matrices have n orthonormal eigenvectors. Those vectors xi
probably will have complex components. In that complex case orthogonality means x̄Ti xj = 0
as Chapter 10 explains. Inner products (dot products) become x̄T y.
The test for n orthonormal columns in Q becomes Q̄T Q = I instead of QT Q = I.
A has n orthonormal eigenvectors (A = QΛQ̄T ) if and only if A is normal.
(a) Start from A = QΛQ̄T with Q̄T Q = I. Show that ĀT A = AĀT .
(b) Now start from ĀT A = AĀT . Schur found A = QT Q̄T for every matrix A, with a triangular
T . For normal matrices we must show (in 3 steps) that this T will actually be diagonal.
Then T = Λ.
Step 1. Put A = QT Q̄T into ĀT A = AĀT to find T̄ TT = T T̄ T.
Step 2: Suppose T = [a b; 0 d] has T̄ᵀT = T T̄ᵀ. Prove that b = 0.
Step 3: Extend Step 2 to size n. A normal triangular T must be diagonal.
6.4 Problem discussions
6-85
Important conclusion from Problem 29: A matrix A has n orthogonal eigenvectors if, and only if, A is normal.
Definition (Normal matrix): A matrix A is normal if A†A = AA†.
Proof:
• Schur's theorem: A = QTQ† with T upper triangular and Q† = Q^{-1}.
• A†A = AA† ⇒ T†T = TT† ⇒ T diagonal ⇒ Q is the matrix of eigenvectors by AQ = QT. □
6.5 Positive definite matrices
6-86
Definition (Positive definite): A symmetric matrix A is positive definite if its eigenvalues are all positive.
• The above definition only applies to a symmetric matrix because a non-symmetric
matrix may have complex eigenvalues (which cannot be compared with zero)!
Properties of positive definite (symmetric) matrices
•
Equivalent Definition: (symmetric) A is positive definite if, and only if,
xT Ax > 0 for all non-zero x.
xT Ax is usually referred to as the quadratic form!
The proofs can be found in Problems 18 and 19.
(Problem 18, Section 6.5) If Ax = λx then xᵀAx = ______. If xᵀAx > 0, prove that λ > 0.
6.5 Positive definite matrices
6-87
(Problem 19, Section 6.5) Reverse Problem 18 to show that if all λ > 0 then xᵀAx > 0. We must do this for every nonzero x, not just the eigenvectors. So write x as a combination of the eigenvectors and explain why all “cross terms” are xiᵀxj = 0. Then xᵀAx is
  (c1x1 + · · · + cnxn)ᵀ(c1λ1x1 + · · · + cnλnxn) = c1^2 λ1 x1ᵀx1 + · · · + cn^2 λn xnᵀxn > 0.
Proof (Problems 18 and 19):
– (only if part: Problem 19) Avi = λivi implies viᵀAvi = λi viᵀvi > 0 for all eigenvalues {λi}_{i=1}^n and eigenvectors {vi}_{i=1}^n. The proof is completed by noting that, with x = Σ_{i=1}^n ci vi and {vi}_{i=1}^n orthogonal,
    xᵀ(Ax) = (Σ_{i=1}^n ci vi)ᵀ (Σ_{j=1}^n cj λj vj) = Σ_{i=1}^n ci^2 λi ‖vi‖^2 > 0.
– (if part: Problem 18) Taking x = vi, we obtain viᵀ(Avi) = viᵀ(λivi) = λi‖vi‖^2 > 0, which implies λi > 0. □
6.5 Positive definite matrices
6-88
• Based on the above proposition, we can conclude similar to two positive scalars
(as “if a and b are both positive, so is a + b”) that
Proposition: If A and B are both positive definite, so is A + B.
• The next property provides an easy way to construct positive definite matrices.
  Proposition: If An×n = RᵀR where Rm×n has linearly independent columns, then A is positive definite.
  Proof:
  – x non-zero ⇒ Rx non-zero.
  – Then, xᵀ(RᵀR)x = (Rx)ᵀ(Rx) = ‖Rx‖^2 > 0. □
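A quick MATLAB illustration of this construction (R below is a made-up matrix with independent columns):

  R = [1 2; 0 1; 1 0];    % 3-by-2 with linearly independent columns
  A = R'*R;               % A = R'R is symmetric positive definite
  eig(A)                  % all eigenvalues should be positive
  U = chol(A);            % chol succeeds (no error) exactly when A is positive definite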
6.5 Positive definite matrices
6-89
Since a (symmetric) A = QΛQᵀ, we can choose R = QΛ^{1/2}Qᵀ, which makes R a square matrix. This observation gives another equivalent definition of positive definite matrices.
Equivalent Definition: An×n is positive definite if, and only if, there exists Rm×n with independent columns such that A = RᵀR.
6.5 Positive definite matrices
• Equivalent Definition: An×n is positive definite if, and only if, all pivots are positive.
Proof:
– (By the LDU decomposition,) A = LDLᵀ, where D is a diagonal matrix with the pivots on its diagonal, and L is a lower triangular matrix with 1's on its diagonal. Then,
    xᵀAx = xᵀ(LDLᵀ)x = (Lᵀx)ᵀD(Lᵀx) = yᵀDy   where y = Lᵀx = (l1ᵀx, . . . , lnᵀx)ᵀ
         = d1 y1^2 + · · · + dn yn^2 = d1 (l1ᵀx)^2 + · · · + dn (lnᵀx)^2.
– So, if all pivots are positive, xᵀAx > 0 for all non-zero x. Conversely, if xᵀAx > 0 for all non-zero x, which in turn implies yᵀDy > 0 for all non-zero y (as L is invertible), then pivot di must be positive for the choice of yj = 0 except for j = i. □
6.5 Positive definite matrices
6-91
An extension of the previous proposition is:
Suppose A = BCB T with B invertible and C diagonal. Then, A is positive
definite if, and only if, diagonals of C are all positive! See Problem 35.
(Problem 35, Section 6.5) Suppose C is positive definite (so y T Cy > 0 whenever y = 0) and A has independent columns (so Ax = 0 whenever x = 0).
Apply the energy test to xTAT CAx to show that ATCA is positive definite:
the crucial matrix in engineering.
6.5 Positive definite matrices
6-92
Equivalent Definition: An×n is positive definite if, and only if, the n upper left determinants (i.e., the (n − 1) leading principal minors and det(A)) are all positive.
Definition (Minors, principal minors and leading principal minors):
– A minor of a matrix A is the determinant of some smaller square matrix, obtained by removing one or more of its rows and columns.
– A first-order (respectively, second-order, etc.) minor is a minor obtained by removing (n − 1) (respectively, (n − 2), etc.) rows and columns.
– A principal minor of a matrix is a minor obtained by removing the same rows and columns.
– A leading principal minor of a matrix is a minor obtained by removing the last few rows and columns.
6.5 Positive definite matrices
6-93
The example below indicates the three upper left determinants (or two leading principal minors and det(A)) of a 3 × 3 matrix A = [a1,1 a1,2 a1,3; a2,1 a2,2 a2,3; a3,1 a3,2 a3,3]: they are det(A1×1), det(A2×2) and det(A3×3).
Proof of the equivalent definition: By the LU decomposition (based on Slide 5-18),
  A = [1 0 0 · · · 0; l2,1 1 0 · · · 0; l3,1 l3,2 1 · · · 0; . . . ; ln,1 ln,2 ln,3 · · · 1] [d1 u1,2 u1,3 · · · u1,n; 0 d2 u2,3 · · · u2,n; 0 0 d3 · · · u3,n; . . . ; 0 0 0 · · · dn]
  ⇒ dk = det(Ak×k) / det(A(k−1)×(k−1)). □
6.5 Positive definite matrices
6-94
Equivalent Definition: An×n is positive definite if, and only if, all (2^n − 2) principal minors and det(A) are positive.
• Based on the above equivalent definitions, a positive definite matrix cannot have a zero or a negative value on its main diagonal. See Problem 16.
(Problem 16, Section 6.5) A positive definite matrix cannot have a zero (or even worse, a negative number) on its diagonal. Show that this matrix fails to have xᵀAx > 0:
  [x1 x2 x3] [4 1 1; 1 0 2; 1 2 5] [x1; x2; x3]   is not positive when (x1, x2, x3) = ( , , ).
With the above discussion of positive definite matrices, we can proceed to define similar notions: negative definite, positive semidefinite, and negative semidefinite matrices.
6.5 Positive semidefinite (or nonnegative definite)
6-95
Definition (Positive semidefinite): A symmetric matrix A is positive semidefinite if its eigenvalues are all nonnegative.
Equivalent Definition: A is positive semidefinite if, and only if, xᵀAx ≥ 0 for all non-zero x.
Equivalent Definition: An×n is positive semidefinite if, and only if, there exists Rm×n (perhaps with dependent columns) such that A = RᵀR.
Equivalent Definition: An×n is positive semidefinite if, and only if, all pivots are nonnegative.
Equivalent Definition: An×n is positive semidefinite if, and only if, all (2^n − 2) principal minors and det(A) are non-negative.
Example. The non-“positive semidefinite” matrix A = [1 0 0; 0 0 0; 0 0 −1] has (2^3 − 2) = 6 principal minors:
  det [1 0; 0 0], det [1 0; 0 −1], det [0 0; 0 −1], det [1], det [0], det [−1].
Note: It is not sufficient to define positive semidefiniteness based on the nonnegativity of the leading principal minors and det(A). Check the above example.
6.5 Negative definite, negative semidefinite, indefinite
6-96
We can similarly define negative definite.
Definition (Negative definite): A symmetric matrix A is negative definite if its eigenvalues are all negative.
Equivalent Definition: A is negative definite if, and only if, xTAx<0 for all
non-zero x.
Equivalent Definition: An×n is negative definite if, and only if, there exists
Rm×n with independent columns such that A = −RTR.
Equivalent Definition: An×n is negative definite if, and only if, all pivots are
negative.
Equivalent Definition: An×n is negative definite if, and only if, all odd-order leading principal minors are negative and all even-order leading principal minors and det(A) are positive.
Equivalent Definition: An×n is negative definite if, and only if, all odd-order principal minors are negative and all even-order principal minors and det(A) are positive.
6.5 Negative definite, negative semidefinite, indefinite
6-97
We can also similarly define negative semidefinite matrices.
Definition (Negative semidefinite): A symmetric matrix A is negative semidefinite if its eigenvalues are all non-positive.
Equivalent Definition: A is negative semidefinite if, and only if, x^T A x ≤ 0 for all non-zero x.
Equivalent Definition: A_{n×n} is negative semidefinite if, and only if, there exists R_{m×n} (possibly with dependent columns) such that A = −R^T R.
Equivalent Definition: A_{n×n} is negative semidefinite if, and only if, all pivots are non-positive.
Equivalent Definition: A_{n×n} is negative semidefinite if, and only if, all odd-order principal minors are non-positive and all even-order principal minors are non-negative (so det(A) is non-positive for odd n and non-negative for even n).
Note: It is not sufficient to define negative semidefiniteness based only on the signs of the leading principal minors and det(A).
Finally, if some of the eigenvalues of a symmetric matrix are positive, and some are
negative, the matrix will be referred to as indefinite.
6.5 Quadratic form
6-98
• For a positive definite matrix A_{2×2},
$$x^T A x = x^T (Q\Lambda Q^T) x = (Q^T x)^T \Lambda (Q^T x) = y^T \Lambda y, \quad \text{where } y = Q^T x = \begin{bmatrix} q_1^T x\\ q_2^T x \end{bmatrix},$$
$$\phantom{x^T A x} = \lambda_1 \big(q_1^T x\big)^2 + \lambda_2 \big(q_2^T x\big)^2.$$
So, x^T A x = c gives an ellipse if c > 0.
Example. $A = \begin{bmatrix} 5 & 4\\ 4 & 5 \end{bmatrix}$.
– λ_1 = 9 and λ_2 = 1, with $q_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ 1 \end{bmatrix}$ and $q_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1\\ -1 \end{bmatrix}$.
– $x^T A x = \lambda_1 \big(q_1^T x\big)^2 + \lambda_2 \big(q_2^T x\big)^2 = 9\left(\dfrac{x_1 + x_2}{\sqrt{2}}\right)^2 + \left(\dfrac{x_1 - x_2}{\sqrt{2}}\right)^2.$
– q_1^T x is an axis perpendicular to q_1. (Not along q_1 as the textbook said.)
– q_2^T x is an axis perpendicular to q_2.
Tip: A = QΛQ^T is called the principal axis theorem (cf. Slide 6-69) because x^T A x = y^T Λ y = c is an ellipse with axes along the eigenvectors.
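A short numerical check (added here; not from the slides) of the principal-axis computation for this A: diagonalize, rotate coordinates by Q^T, and confirm that the quadratic form agrees with λ_1 y_1^2 + λ_2 y_2^2.

    # Numerical sketch of x^T A x = 9*((x1+x2)/sqrt 2)^2 + ((x1-x2)/sqrt 2)^2.
    import numpy as np

    A = np.array([[5.0, 4.0], [4.0, 5.0]])
    lam, Q = np.linalg.eigh(A)        # columns of Q are orthonormal eigenvectors
    print("eigenvalues:", lam)        # [1. 9.]

    rng = np.random.default_rng(0)
    x = rng.standard_normal(2)
    y = Q.T @ x                       # rotated coordinates y = Q^T x
    quad_direct = x @ A @ x
    quad_axes = lam[0] * y[0]**2 + lam[1] * y[1]**2
    print(quad_direct, quad_axes)     # the two values agree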
6.5 Problem discussions
6-99
(Problem 13, Section 6.5) Find a matrix with a > 0 and c > 0 and a + c > 2b that has a negative eigenvalue.
Missing point in Problem 13: The matrix to be determined is of the shape $\begin{bmatrix} a & b\\ b & c \end{bmatrix}$.
6.6 Similar matrices
6-100
Definition (Similarity): Two matrices A and B are similar if B = M^{-1}AM for some invertible M.
Theorem: A and M^{-1}AM have the same eigenvalues.
Proof: Let λ be an eigenvalue of A with eigenvector v, and define v̄ = M^{-1}v. We then derive
$$B\bar{v} = (M^{-1}AM)(M^{-1}v) = M^{-1}Av = M^{-1}(\lambda v) = \lambda M^{-1}v = \lambda \bar{v}.$$
So, λ is also an eigenvalue of M^{-1}AM (associated with eigenvector v̄ = M^{-1}v). □
Notes:
• The LU decomposition of a symmetric matrix A gives A = LDL^T, but A and D (the pivot matrix) may well have different eigenvalues. Why? Because in general L^T ≠ L^{-1}: similarity is defined through M^{-1}, not M^T.
• The converse of the above theorem is false! In other words, we cannot say that "two matrices with the same eigenvalues are similar."
Example. $A = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}$ have the same eigenvalues 0, 0, but they are not similar.
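A minimal numerical check (added for illustration; not part of the slides) that similarity preserves the spectrum, using a random, generically invertible M:

    # Check that A and M^{-1} A M share eigenvalues.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    M = rng.standard_normal((4, 4))      # a random matrix is generically invertible
    B = np.linalg.inv(M) @ A @ M

    eigA = np.sort_complex(np.linalg.eigvals(A))
    eigB = np.sort_complex(np.linalg.eigvals(B))
    print(np.allclose(eigA, eigB))        # True: similar matrices, same spectrum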
6.6 Jordan form
6-101
• Why introduce matrix similarity?
Answer: We can then extend the diagonalization of a matrix with fewer than n eigenvectors to the Jordan form.
• For a non-diagonalizable matrix A, we can find an invertible M such that
$$A = M J M^{-1},$$
where
$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0\\ 0 & J_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & J_s \end{bmatrix},$$
s is the number of distinct eigenvalues, the size of J_i is equal to the multiplicity of eigenvalue λ_i, and J_i is of the form
$$J_i = \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0\\ 0 & \lambda_i & 1 & \cdots & 0\\ 0 & 0 & \lambda_i & \ddots & \vdots\\ \vdots & \vdots & \vdots & \ddots & 1\\ 0 & 0 & 0 & \cdots & \lambda_i \end{bmatrix}.$$
6.6 Jordan form
6-102
• The idea behind the Jordan form is that A is similar to J.
• Based on this, we can now compute
$$A^{100} = M J^{100} M^{-1} = M \begin{bmatrix} J_1^{100} & 0 & \cdots & 0\\ 0 & J_2^{100} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & J_s^{100} \end{bmatrix} M^{-1}.$$
• How to find M_i corresponding to J_i?
Answer: By AM = MJ with M = [M_1  M_2  ⋯  M_s], we know
$$A M_i = M_i J_i \quad\text{for } 1 \le i \le s.$$
Specifically, assume that the multiplicity of λ_i is two. Then,
$$A \begin{bmatrix} v_i^{(1)} & v_i^{(2)} \end{bmatrix} = \begin{bmatrix} v_i^{(1)} & v_i^{(2)} \end{bmatrix} \begin{bmatrix} \lambda_i & 1\\ 0 & \lambda_i \end{bmatrix},$$
which is equivalent to
$$\begin{cases} A v_i^{(1)} = \lambda_i v_i^{(1)}\\[2pt] A v_i^{(2)} = v_i^{(1)} + \lambda_i v_i^{(2)} \end{cases} \;\Longrightarrow\; \begin{cases} (A - \lambda_i I)\, v_i^{(1)} = 0\\[2pt] (A - \lambda_i I)\, v_i^{(2)} = v_i^{(1)}. \end{cases}$$
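The Jordan machinery above can be checked symbolically. The sketch below (added; it uses sympy, which the slides do not reference) verifies A = M J M^{-1}, the power formula A^100 = M J^100 M^{-1}, and the generalized-eigenvector equations for a small 2×2 example.

    # Symbolic check of the Jordan relations for a 2x2 non-diagonalizable matrix.
    import sympy as sp

    A = sp.Matrix([[5, 1],
                   [0, 5]])             # eigenvalue 5 with multiplicity 2,
                                        # but only one eigenvector
    M, J = A.jordan_form()              # A == M * J * M**-1
    print(J)                            # Matrix([[5, 1], [0, 5]])
    print(sp.simplify(M * J**100 * M.inv() - A**100))   # zero matrix

    v1, v2 = M.col(0), M.col(1)
    lam = J[0, 0]
    print((A - lam * sp.eye(2)) * v1)        # zero vector: (A - lam I) v1 = 0
    print((A - lam * sp.eye(2)) * v2 - v1)   # zero vector: (A - lam I) v2 = v1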
6.6 Problem discussions
6-103
(Problem 14, Section 6.6) Prove that A^T is always similar to A (we know the λ's are the same):
1. For one Jordan block J_i: Find M_i so that M_i^{-1} J_i M_i = J_i^T.
2. For any J with blocks J_i: Build M_0 from blocks so that M_0^{-1} J M_0 = J^T.
3. For any A = M J M^{-1}: Show that A^T is similar to J^T and so to J and to A.
6.6 Problem discussions
6-104
Thinking over Problem 14: A^T and A are always similar.
Answer:
• It can easily be checked that, for any n×n matrix with entries u_{i,j},
$$\begin{bmatrix} u_{1,1} & u_{1,2} & \cdots & u_{1,n}\\ u_{2,1} & u_{2,2} & \cdots & u_{2,n}\\ \vdots & \vdots & \ddots & \vdots\\ u_{n,1} & u_{n,2} & \cdots & u_{n,n} \end{bmatrix} = V^{-1} \begin{bmatrix} u_{n,n} & \cdots & u_{n,2} & u_{n,1}\\ \vdots & \ddots & \vdots & \vdots\\ u_{2,n} & \cdots & u_{2,2} & u_{2,1}\\ u_{1,n} & \cdots & u_{1,2} & u_{1,1} \end{bmatrix} V,$$
where V represents the exchange matrix with zero entries except v_{i,n+1−i} = 1 for 1 ≤ i ≤ n; the middle matrix is the original one with both its rows and its columns reversed. We note that V^{-1} = V^T = V.
• Reversing both the rows and the columns of a Jordan block J_i moves its superdiagonal of 1's to the subdiagonal, i.e., it produces exactly J_i^T.
So, we can use the proper size of M_i = V to obtain
$$J_i^T = M_i^{-1} J_i M_i.$$
6.6 Problem discussions
6-105
Define
$$M = \begin{bmatrix} M_1 & 0 & \cdots & 0\\ 0 & M_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & M_s \end{bmatrix}.$$
We have
$$J^T = M J M^{-1},$$
where
$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0\\ 0 & J_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & J_s \end{bmatrix},$$
s is the number of distinct eigenvalues, the size of J_i is equal to the multiplicity of eigenvalue λ_i, and J_i is of the form
$$J_i = \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0\\ 0 & \lambda_i & 1 & \cdots & 0\\ 0 & 0 & \lambda_i & \ddots & \vdots\\ \vdots & \vdots & \vdots & \ddots & 1\\ 0 & 0 & 0 & \cdots & \lambda_i \end{bmatrix}.$$
6.6 Problem discussions
6-106
Finally, we know
– A is similar to J;
– A^T is similar to J^T;
– J^T is similar to J.
So, A^T is similar to A. □
(Problem 19, Section 6.6) If A is 6 by 4 and B is 4 by 6, AB and BA have different sizes. But with blocks,
$$M^{-1} F M = \begin{bmatrix} I & -A\\ 0 & I \end{bmatrix} \begin{bmatrix} AB & 0\\ B & 0 \end{bmatrix} \begin{bmatrix} I & A\\ 0 & I \end{bmatrix} = \begin{bmatrix} 0 & 0\\ B & BA \end{bmatrix} = G.$$
(a) What sizes are the four blocks (the same four sizes in each matrix)?
(b) This equation is M^{-1} F M = G, so F and G have the same eigenvalues. F has the 6 eigenvalues of AB plus 4 zeros; G has the 4 eigenvalues of BA plus 6 zeros. AB has the same eigenvalues as BA plus ______ zeros.
6.6 Problem discussions
6-107
Thinking over Problem 19: A_{m×n} B_{n×m} and B_{n×m} A_{m×n} have the same eigenvalues except for additional (m − n) zeros.
Solution: The example shows the usefulness of similarity.
• $$\begin{bmatrix} I_{m\times m} & -A_{m\times n}\\ 0_{n\times m} & I_{n\times n} \end{bmatrix} \begin{bmatrix} A_{m\times n}B_{n\times m} & 0_{m\times n}\\ B_{n\times m} & 0_{n\times n} \end{bmatrix} \begin{bmatrix} I_{m\times m} & A_{m\times n}\\ 0_{n\times m} & I_{n\times n} \end{bmatrix} = \begin{bmatrix} 0_{m\times m} & 0_{m\times n}\\ B_{n\times m} & B_{n\times m}A_{m\times n} \end{bmatrix}$$
• Hence, $\begin{bmatrix} A_{m\times n}B_{n\times m} & 0_{m\times n}\\ B_{n\times m} & 0_{n\times n} \end{bmatrix}$ and $\begin{bmatrix} 0_{m\times m} & 0_{m\times n}\\ B_{n\times m} & B_{n\times m}A_{m\times n} \end{bmatrix}$ are similar and have the same eigenvalues.
• From Problem 26 in Section 6.1 (see Slide 6-25), the desired claim (as indicated above by the red-colored text) is proved. □
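A quick numerical sanity check of this claim (added here; not in the slides) for random A_{6×4} and B_{4×6}: the nonzero eigenvalues of AB and BA coincide, and AB carries (m − n) = 2 extra zeros.

    # Compare the spectra of AB (6x6) and BA (4x4) by eigenvalue magnitude.
    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 6, 4
    A = rng.standard_normal((m, n))
    B = rng.standard_normal((n, m))

    eig_AB = np.sort(np.abs(np.linalg.eigvals(A @ B)))   # 6 values, 2 near zero
    eig_BA = np.sort(np.abs(np.linalg.eigvals(B @ A)))   # 4 values
    print(np.round(eig_AB, 6))
    print(np.round(eig_BA, 6))
    # The four largest magnitudes agree; AB has (m - n) = 2 additional (near-)zeros.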
6.6 Problem discussions
6-108
(Problem 21, Section 6.6) If J is the 5 by 5 Jordan block with λ = 0, find J^2 and count its eigenvectors (are these the eigenvectors?) and find its Jordan form (there will be two blocks).
Problem 21: Find the Jordan form of A = J^2, where
$$J = \begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
Solution.
• $$A = J^2 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$
and the eigenvalues of A are 0, 0, 0, 0, 0.
6.6 Problem discussions




6-109
• $$A v_1 = A \begin{bmatrix} v_{1,1}\\ v_{2,1}\\ v_{3,1}\\ v_{4,1}\\ v_{5,1} \end{bmatrix} = \begin{bmatrix} v_{3,1}\\ v_{4,1}\\ v_{5,1}\\ 0\\ 0 \end{bmatrix} = 0 \;\Longrightarrow\; v_{3,1} = v_{4,1} = v_{5,1} = 0 \;\Longrightarrow\; v_1 = \begin{bmatrix} v_{1,1}\\ v_{2,1}\\ 0\\ 0\\ 0 \end{bmatrix}.$$
• $$A v_2 = A \begin{bmatrix} v_{1,2}\\ v_{2,2}\\ v_{3,2}\\ v_{4,2}\\ v_{5,2} \end{bmatrix} = \begin{bmatrix} v_{3,2}\\ v_{4,2}\\ v_{5,2}\\ 0\\ 0 \end{bmatrix} = v_1 \;\Longrightarrow\; \begin{cases} v_{3,2} = v_{1,1}\\ v_{4,2} = v_{2,1}\\ v_{5,2} = 0 \end{cases} \;\Longrightarrow\; v_2 = \begin{bmatrix} v_{1,2}\\ v_{2,2}\\ v_{1,1}\\ v_{2,1}\\ 0 \end{bmatrix}.$$
• $$A v_3 = A \begin{bmatrix} v_{1,3}\\ v_{2,3}\\ v_{3,3}\\ v_{4,3}\\ v_{5,3} \end{bmatrix} = \begin{bmatrix} v_{3,3}\\ v_{4,3}\\ v_{5,3}\\ 0\\ 0 \end{bmatrix} = v_2 \;\Longrightarrow\; \begin{cases} v_{3,3} = v_{1,2}\\ v_{4,3} = v_{2,2}\\ v_{5,3} = v_{1,1}\\ v_{2,1} = 0 \end{cases} \;\Longrightarrow\; v_3 = \begin{bmatrix} v_{1,3}\\ v_{2,3}\\ v_{1,2}\\ v_{2,2}\\ v_{1,1} \end{bmatrix}.$$
6.6 Problem discussions
6-110
• To summarize,
$$v_1 = \begin{bmatrix} v_{1,1}\\ v_{2,1}\\ 0\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_2 = \begin{bmatrix} v_{1,2}\\ v_{2,2}\\ v_{1,1}\\ v_{2,1}\\ 0 \end{bmatrix} \;\Longrightarrow\; \text{if } v_{2,1} = 0, \text{ then } v_3 = \begin{bmatrix} v_{1,3}\\ v_{2,3}\\ v_{1,2}\\ v_{2,2}\\ v_{1,1} \end{bmatrix}.$$
• In particular,
$$v_1^{(1)} = \begin{bmatrix} 1\\ 0\\ 0\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_2^{(1)} = \begin{bmatrix} v_{1,2}^{(1)}\\ v_{2,2}^{(1)}\\ 1\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_3^{(1)} = \begin{bmatrix} v_{1,3}^{(1)}\\ v_{2,3}^{(1)}\\ v_{1,2}^{(1)}\\ v_{2,2}^{(1)}\\ 1 \end{bmatrix}$$
$$v_1^{(2)} = \begin{bmatrix} 0\\ 1\\ 0\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_2^{(2)} = \begin{bmatrix} v_{1,2}^{(2)}\\ v_{2,2}^{(2)}\\ 0\\ 1\\ 0 \end{bmatrix}.$$
6.6 Problem discussions
6-111
Since we wish to choose each of v_2^{(1)} and v_2^{(2)} to be orthogonal to both v_1^{(1)} and v_1^{(2)},
$$v_{1,2}^{(1)} = v_{2,2}^{(1)} = v_{1,2}^{(2)} = v_{2,2}^{(2)} = 0,$$
i.e.,
$$v_1^{(1)} = \begin{bmatrix} 1\\ 0\\ 0\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_2^{(1)} = \begin{bmatrix} 0\\ 0\\ 1\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_3^{(1)} = \begin{bmatrix} v_{1,3}^{(1)}\\ v_{2,3}^{(1)}\\ 0\\ 0\\ 1 \end{bmatrix}$$
$$v_1^{(2)} = \begin{bmatrix} 0\\ 1\\ 0\\ 0\\ 0 \end{bmatrix} \;\Longrightarrow\; v_2^{(2)} = \begin{bmatrix} 0\\ 0\\ 0\\ 1\\ 0 \end{bmatrix}.$$
Since we wish to choose v_3^{(1)} to be orthogonal to all of v_1^{(1)}, v_1^{(2)}, v_2^{(1)} and v_2^{(2)},
$$v_{1,3}^{(1)} = v_{2,3}^{(1)} = 0.$$
6.6 Problem discussions
6-112
As a result,
$$A \begin{bmatrix} v_1^{(1)} & v_2^{(1)} & v_3^{(1)} & v_1^{(2)} & v_2^{(2)} \end{bmatrix} = \begin{bmatrix} v_1^{(1)} & v_2^{(1)} & v_3^{(1)} & v_1^{(2)} & v_2^{(2)} \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \;\square$$
• Are all of v_1^{(1)}, v_2^{(1)}, v_3^{(1)}, v_1^{(2)} and v_2^{(2)} eigenvectors of A (satisfying Av = λv)?
Hint: A does not have 5 eigenvectors. More specifically, A only has 2 eigenvectors. Which two? Think about it.
• Hence, the problem statement may not be accurate, which is why I mark "(are these the eigenvectors?)".
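The block structure found above can be confirmed symbolically. A short sympy sketch (added; sympy is not referenced in the slides) builds A = J^2, computes its Jordan form (two blocks, of sizes 3 and 2), and counts its independent eigenvectors.

    # Jordan form of A = J^2 for the 5x5 Jordan block J with lambda = 0.
    import sympy as sp

    J = sp.zeros(5, 5)
    for i in range(4):
        J[i, i + 1] = 1                 # the 5x5 Jordan block with lambda = 0
    A = J**2

    M, Jform = A.jordan_form()          # A == M * Jform * M**-1
    print(Jform)                        # blocks of sizes 3 and 2, eigenvalue 0
    print(len(A.eigenvects()[0][2]))    # 2 independent eigenvectors for lambda = 0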
6.7 Singular value decomposition (SVD)
6-113
• Now we can represent every square matrix in the Jordan form:
$$A = M J M^{-1}.$$
• What if the matrix A_{m×n} is not a square one?
– Problem: Av = λv is not possible! Specifically, v cannot have two different dimensionalities:
$$A_{m\times n}\, v_{n\times 1} = \lambda\, v_{n\times 1} \quad\text{is infeasible if } n \ne m.$$
– So, we can only have
$$A_{m\times n}\, v_{n\times 1} = \sigma\, u_{m\times 1}.$$
6.7 Singular value decomposition (SVD)
6-114
• If we can find enough orthogonal u's and v's such that
$$A_{m\times n}\,\underbrace{\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}}_{V,\ n\times n} = \underbrace{\begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}}_{U,\ m\times m}\,\underbrace{\begin{bmatrix} \sigma_1 & \cdots & 0 & 0 & \cdots & 0\\ \vdots & \ddots & \vdots & \vdots & & \vdots\\ 0 & \cdots & \sigma_r & 0 & \cdots & 0\\ 0 & \cdots & 0 & 0 & \cdots & 0\\ \vdots & & \vdots & \vdots & \ddots & \vdots\\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix}}_{m\times n},$$
where r is the rank of A, then we can perform the so-called singular value decomposition (SVD)
$$A = U \Sigma V^{-1}.$$
Note again that the required number of orthogonal u's and v's may be impossible to find when A has repeated "eigenvalues."
6.7 Singular value decomposition (SVD)
6-115
• If V is chosen to be an orthogonal matrix satisfying V^{-1} = V^T, then we have the so-called reduced SVD
$$A = U \Sigma V^T = \sum_{i=1}^{r} \sigma_i\, u_i v_i^T = \underbrace{\begin{bmatrix} u_1 & \cdots & u_r \end{bmatrix}}_{m\times r} \underbrace{\begin{bmatrix} \sigma_1 & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \sigma_r \end{bmatrix}}_{r\times r} \underbrace{\begin{bmatrix} v_1^T\\ \vdots\\ v_r^T \end{bmatrix}}_{r\times n}.$$
• Usually, we prefer to choose an orthogonal V (as well as orthogonal U ).
• In the sequel, we will assume the found U and V are orthogonal matrices in
the first place; later, we will confirm that orthogonal U and V can always be
found.
6.7 How to determine U , V and {σi}?
6-116
• $A_{m\times n} = U_{m\times m}\, \Sigma_{m\times n}\, V_{n\times n}^T$
$$\Longrightarrow\; A_{n\times m}^T A_{m\times n} = \big(U \Sigma V^T\big)^T \big(U \Sigma V^T\big) = V\, \Sigma_{n\times m}^T U^T U\, \Sigma_{m\times n} V^T = V_{n\times n}\, \Sigma_{n\times n}^2\, V_{n\times n}^T.$$
So, V is the (orthogonal) matrix of n eigenvectors of the symmetric matrix A^T A.
• $A_{m\times n} = U_{m\times m}\, \Sigma_{m\times n}\, V_{n\times n}^T$
$$\Longrightarrow\; A_{m\times n} A_{n\times m}^T = \big(U \Sigma V^T\big) \big(U \Sigma V^T\big)^T = U\, \Sigma_{m\times n} V^T V\, \Sigma_{n\times m}^T U^T = U_{m\times m}\, \Sigma_{m\times m}^2\, U_{m\times m}^T.$$
So, U is the (orthogonal) matrix of m eigenvectors of the symmetric matrix A A^T.
• Remember that A^T A and A A^T have the same eigenvalues except for additional (m − n) zeros.
Section 6.6, Problem 19 (see Slide 6-107): A_{m×n} B_{n×m} and B_{n×m} A_{m×n} have the same eigenvalues except for additional (m − n) zeros.
In fact, there are only r non-zero eigenvalues of A^T A and A A^T, and they satisfy
$$\lambda_i = \sigma_i^2, \quad 1 \le i \le r.$$
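The recipe above can be tried numerically. The following sketch (added; it relies on numpy, which the slides do not mention) builds the singular values from the eigenvalues of A^T A and A A^T and compares them with numpy's own SVD; signs and ordering are fixed by sorting in descending order.

    # Singular values as square roots of the eigenvalues of A^T A and A A^T.
    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((5, 3))                      # m = 5, n = 3, rank 3

    lam_v, V = np.linalg.eigh(A.T @ A)                   # n eigen-pairs of A^T A
    lam_u, U = np.linalg.eigh(A @ A.T)                   # m eigen-pairs of A A^T

    sig_from_ATA = np.sqrt(np.sort(lam_v)[::-1])
    sig_from_AAT = np.sqrt(np.clip(np.sort(lam_u)[::-1][:3], 0, None))  # drop ~0 pair
    sig_numpy    = np.linalg.svd(A, compute_uv=False)

    print(np.round(sig_from_ATA, 6))
    print(np.round(sig_from_AAT, 6))
    print(np.round(sig_numpy, 6))       # all three lists of singular values agree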
6.7 How to determine U , V and {σi}?
6-117
Example (Problem 21 in Section 6.6): Find the SVD of A = J^2, where
$$J = \begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
Solution.
• $$A^T A = \begin{bmatrix} 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad A A^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
• So
$$V = \begin{bmatrix} 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad\text{for}\quad \Lambda = \Sigma^2 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
6.7 How to determine U , V and {σi}?
6-118
• Hence,
$$\underbrace{\begin{bmatrix} 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}}_{U} \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}}_{\Sigma} \underbrace{\begin{bmatrix} 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 \end{bmatrix}}_{V^T}. \;\square$$
We may compare this with the Jordan form:
$$\underbrace{\begin{bmatrix} 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}}_{M} \underbrace{\begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}}_{J} \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}}_{M^{-1}}.$$
6.7 How to determine U , V and {σi}?
6-119
Remarks on the previous example
• In the above example, we only know
$$\Lambda = \Sigma^2 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
• Hence,
$$\Sigma = \begin{bmatrix} \pm 1 & 0 & 0 & 0 & 0\\ 0 & \pm 1 & 0 & 0 & 0\\ 0 & 0 & \pm 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
• However, by Av_i = σ_i u_i = (−σ_i)(−u_i), we can always choose positive σ_i by adjusting the sign of u_i.
6.7 SVD
6-120
Important notes:
• In terminology, the σ_i, where 1 ≤ i ≤ r, are called the singular values.
Hence the name singular value decomposition.
• The singular values are always non-zero (even though the eigenvalues of A can be zeros).




Example.
$$A = \begin{bmatrix} 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad\text{but}\quad \Sigma = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
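A brief numerical confirmation (added here; not from the slides) that this A has only zero eigenvalues yet nonzero singular values:

    # Eigenvalues vs. singular values of A = J^2 from the earlier example.
    import numpy as np

    A = np.zeros((5, 5))
    A[0, 2] = A[1, 3] = A[2, 4] = 1.0

    print(np.round(np.linalg.eigvals(A), 6))                 # all (numerically) zero
    print(np.round(np.linalg.svd(A, compute_uv=False), 6))   # [1. 1. 1. 0. 0.]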
• It is good to know that:
– The first r columns of V lie in the row space R(A) and form a basis of R(A).
– The last (n − r) columns of V lie in the null space N(A) and form a basis of N(A).
– The first r columns of U lie in the column space C(A) and form a basis of C(A).
– The last (m − r) columns of U lie in the left null space N(A^T) and form a basis of N(A^T).
How useful the above facts are can be seen from the next example.
6.7 SVD
6-121
Example. Find the SVD of A_{4×3} = x_{4×1}\, y_{1×3}^T.
Solution.
• A is a rank-1 matrix. So, r = 1 and
$$\Sigma = \begin{bmatrix} \sigma_1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix},$$
where σ_1 ≠ 0. (Actually, σ_1 = ‖x‖‖y‖.)
• The basis vector of the row space is v_1 = y/‖y‖; pick perpendicular v_2 and v_3 that span the null space.
• The basis vector of the column space is u_1 = x/‖x‖; pick perpendicular u_2, u_3, u_4 that span the left null space.
• The SVD is then
$$A_{4\times 3} = \underbrace{\begin{bmatrix} \dfrac{x}{\|x\|} & u_2 & u_3 & u_4 \end{bmatrix}}_{\substack{\text{column space | left null space}\\ 4\times 4}} \;\Sigma_{4\times 3}\; \underbrace{\begin{bmatrix} \dfrac{y^T}{\|y\|}\\[4pt] v_2^T\\ v_3^T \end{bmatrix}}_{\substack{\text{row space | null space}\\ 3\times 3}}. \;\square$$
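To close, a small numerical check (added; not part of the original example) of the rank-one formula σ_1 = ‖x‖‖y‖ for random x and y:

    # SVD of a rank-one matrix A = x y^T.
    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.standard_normal(4)
    y = rng.standard_normal(3)
    A = np.outer(x, y)                  # A_{4x3} = x y^T, rank 1

    U, s, Vt = np.linalg.svd(A)
    print(np.round(s, 6))               # [ ||x||*||y||, ~0, ~0 ]
    print(np.isclose(s[0], np.linalg.norm(x) * np.linalg.norm(y)))      # True
    print(np.allclose(np.abs(U[:, 0]), np.abs(x) / np.linalg.norm(x)))  # u1 = +/- x/||x||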