
14C. Determinants and the Characteristic Polynomial
Determinants


Recall that a 2 × 2 matrix
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is invertible only if ad − bc ≠ 0; indeed, its inverse is given by
$$M^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$
so the condition ad − bc ≠ 0 is both necessary and sufficient for invertibility. We call the quantity ad − bc the determinant of the matrix M (because it determines whether M is invertible) and we denote it det M. Similarly,
Proposition  The 3 × 3 matrix
$$M = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
is invertible if and only if
$$a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31} \neq 0. \quad //$$
This proposition (whose proof is messy but straightforward) defines the determinant
$$\det M = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}$$
of a 3 × 3 matrix.
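As a quick sanity check of the signs in this six-term formula, here is a short sketch (using Python with sympy, which is our own choice rather than anything in the text) comparing it against a library determinant on fully symbolic entries.

```python
# Verify the six-term formula for a 3 x 3 determinant symbolically.
import sympy as sp

# Build a 3 x 3 matrix of symbols a11, ..., a33.
a = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f"a{i+1}{j+1}"))

formula = (a[0, 0]*a[1, 1]*a[2, 2] + a[0, 1]*a[1, 2]*a[2, 0]
           + a[0, 2]*a[1, 0]*a[2, 1] - a[0, 0]*a[1, 2]*a[2, 1]
           - a[0, 1]*a[1, 0]*a[2, 2] - a[0, 2]*a[1, 1]*a[2, 0])

# The difference expands to the zero polynomial.
assert sp.expand(a.det() - formula) == 0
```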
To see how to define the determinant of larger matrices, let M = (a_{ij}) now represent a general n × n matrix. We work from the principle that however we define det M, it should be a number which, if 0, tells us that M is noninvertible. That is, det: Mat_{n,n} → R should be a function with the property that det M = 0 if and only if M is not invertible.
Smith gives a detailed exposition (in Appendix A) of
how the general determinant is defined, but we will
not follow this discussion as it carries us far from
our immediate goals. We simply summarize the
important parts of this theory.
Instead of defining a determinant function from Mat_{n,n} to R, we replace the input matrix M with the list of its row vectors M_1, M_2, …, M_n and work instead to define a related function
$$\delta : \mathbf{R}^n \times \mathbf{R}^n \times \cdots \times \mathbf{R}^n \to \mathbf{R}$$
with the intention that det M = δ(M_1, M_2, …, M_n). Proceeding in this fashion turns out to be superior because, as we shall see, det is not a linear function of M, whereas δ is linear in each of its arguments.
More specifically, we want δ to be a multilinear function, namely a function that is linear in each component: for i = 1, …, n, and for all scalars a, b,
$$\delta(M_1, \dots, aM_i + bM_i', \dots, M_n) = a\,\delta(M_1, \dots, M_i, \dots, M_n) + b\,\delta(M_1, \dots, M_i', \dots, M_n).$$
Such a function is also called a multilinear form.
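To see the distinction concretely, here is an illustrative sketch (with sympy and made-up row vectors, both our own assumptions) checking linearity in one row, while noting that det itself is not linear as a function of the whole matrix.

```python
# delta = det, viewed as a function of the rows: linear in each row.
import sympy as sp

a, b = sp.symbols("a b")
M1, M1p, M2 = sp.Matrix([[1, 2]]), sp.Matrix([[3, 5]]), sp.Matrix([[4, 7]])

def delta(r1, r2):
    # Stack the two rows into a 2 x 2 matrix and take its determinant.
    return sp.Matrix.vstack(r1, r2).det()

lhs = delta(a*M1 + b*M1p, M2)
rhs = a*delta(M1, M2) + b*delta(M1p, M2)
assert sp.expand(lhs - rhs) == 0   # linear in the first argument

# det itself is not linear: scaling a 2 x 2 matrix by 2 scales det by 4.
M = sp.Matrix.vstack(M1, M2)
assert (2*M).det() == 4*M.det()
```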
Our motivation now comes from earlier work on
solving systems of equations via row reduction.
Recall that a system of linear equations can be
expressed in matrix form as MX = B, where M is
the coefficient matrix, X is a column vector of
variables and B is a column vector of constants.
The system has a unique solution precisely when M
is invertible (for then, X = M^{-1}B). The procedure
for solving the system involves row reducing the
matrix M; and each type of row reduction operation
replaces M with a new coefficient matrix, but one
that does not change the solution of the system.
In particular, if any row of the matrix is a zero
vector, then the system will not have a unique
solution (and M will not be invertible). Since we
want δ to satisfy the condition that
$$\delta(M_1, M_2, \dots, M_n) \neq 0 \iff M \text{ is invertible},$$
and M is invertible if and only if its row vectors are linearly independent in R^n, it follows that
• if any M_i = 0, then δ(M_1, M_2, …, M_n) = 0; and
• if any pair of the M_i's are equal, then also δ(M_1, M_2, …, M_n) = 0.
So, for instance, the following must be true:
$$\begin{aligned}
0 &= \delta(M_1, \dots, \underbrace{M_i + M_j}_{\text{position } i}, \dots, \underbrace{M_i + M_j}_{\text{position } j}, \dots, M_n) \\
  &= \delta(M_1, \dots, M_i, \dots, \underbrace{M_i + M_j}_{\text{position } j}, \dots, M_n) + \delta(M_1, \dots, M_j, \dots, \underbrace{M_i + M_j}_{\text{position } j}, \dots, M_n) \\
  &= \delta(M_1, \dots, M_i, \dots, M_i, \dots, M_n) + \delta(M_1, \dots, M_i, \dots, M_j, \dots, M_n) \\
  &\qquad + \delta(M_1, \dots, M_j, \dots, M_i, \dots, M_n) + \delta(M_1, \dots, M_j, \dots, M_j, \dots, M_n) \\
  &= \delta(M_1, \dots, M_i, \dots, M_j, \dots, M_n) + \delta(M_1, \dots, M_j, \dots, M_i, \dots, M_n).
\end{aligned}$$
From this we conclude that
$$\delta(M_1, \dots, M_i, \dots, M_j, \dots, M_n) = -\delta(M_1, \dots, M_j, \dots, M_i, \dots, M_n);$$
that is, whenever we swap any two arguments of δ, the sign of the value changes. This property makes δ what is called an alternating form.
Next, we observe that if E_1, E_2, …, E_n are the standard basis vectors, then certainly we must have δ(E_1, E_2, …, E_n) ≠ 0. But as we have freedom to make the choice of this value, we select
$$\delta(E_1, E_2, \dots, E_n) = 1.$$
The rest of the discussion works toward the proof of
the following
Theorem  There is only one alternating multilinear form δ: R^n × R^n × ⋯ × R^n → R for which δ(E_1, E_2, …, E_n) = 1.

Proof  Omitted. //
Having reached this position, we can now define the general determinant function det: Mat_{n,n} → R by det M = δ(M_1, M_2, …, M_n). Unfortunately, the information that led to this definition is not very useful for the purpose of computing
determinants. To this end, one needs to prove the
following
Theorem [The Lagrange Expansion Formulas]  If M = (a_{ij}) ∈ Mat_{n,n} and, for any choice of indices i and j, M_{ij} represents the (n − 1) × (n − 1) matrix obtained from M by deleting the ith row and jth column (such matrices are called minors of M), then the following are true:
• expansion along the ith row:
$$\det M = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det M_{ij};$$
• expansion along the jth column:
$$\det M = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det M_{ij}.$$

Proof  Omitted. (See Appendix A.) //
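To make the row expansion concrete, here is a minimal recursive sketch in Python (our own illustration; the text prescribes no algorithm). It expands along the first row and is exponentially slow, so it is meant only to mirror the formula.

```python
# Determinant by Lagrange expansion along the first row (0-indexed i = 0,
# so the sign (-1)^(i+j) becomes (-1)^j). M is a list of lists.
def det(M):
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor M_{1j}: delete the first row and the jth column.
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

assert det([[1, 2], [3, 4]]) == -2
assert det([[2, 0, 0], [0, 3, 0], [0, 0, 5]]) == 30  # diagonal product
```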
The Lagrange expansion formulas reduce the computation of an n × n determinant to computing n separate (n − 1) × (n − 1) determinants. Many of these computations are further simplified by using the following general properties of determinants.
Theorem  Let M ∈ Mat_{n,n}.
• If M has a row of zeros or a column of zeros, then det M = 0.
• If M has two equal rows or two equal columns, then det M = 0.
• If M′ is obtained from M by swapping a pair of rows or a pair of columns, then det M′ = −det M.
• If M′ is obtained from M by multiplying through some row or some column by a nonzero scalar a, then det M′ = a · det M.
• If M′ is obtained from M by replacing some row by its sum with any multiple of another row, or by replacing some column by its sum with any multiple of another column, then det M′ = det M.
• If M^tr is the transpose of M, then det M^tr = det M.
• If M is upper triangular or lower triangular, then det M is the product of the diagonal entries of M.
And most importantly, if A, B ∈ Mat_{n,n}, then det AB = det A · det B.

Proof  Omitted. //
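Several of these properties can be spot-checked in a few lines; the following sketch (sympy again, with matrices we made up) exercises the product rule, the transpose rule, a row swap, and a row replacement.

```python
# Spot-check several determinant properties with exact integer arithmetic.
import sympy as sp

A = sp.Matrix([[2, 1, 0], [1, 3, 4], [0, 5, 6]])
B = sp.Matrix([[1, 0, 2], [3, 1, 0], [0, 2, 1]])

assert (A * B).det() == A.det() * B.det()   # det AB = det A * det B
assert A.T.det() == A.det()                 # transpose leaves det fixed

# Swapping two rows negates the determinant.
Aswap = sp.Matrix.vstack(A.row(1), A.row(0), A.row(2))
assert Aswap.det() == -A.det()

# Adding a multiple of another row leaves the determinant unchanged.
Aadd = sp.Matrix.vstack(A.row(0) + 7*A.row(1), A.row(1), A.row(2))
assert Aadd.det() == A.det()
```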
Another important corollary to the Lagrange
expansion formulas is given by the
Corollary  If M = (a_{ij}) ∈ Mat_{n,n} and, for any choice of indices i and j, we define the numbers
$$a^*_{ij} = (-1)^{i+j} \det M_{ij},$$
then M^cof = (a^*_{ji}) (note the reversal of indices), the cofactor matrix associated with M, satisfies
$$M \cdot M^{cof} = (\det M)\, I.$$
Proof  Using Lagrange expansion along the ith row of M shows that the entry in the ith row and ith column of the product matrix M · M^cof is det M. But if i ≠ j, the entry in the ith row and jth column of the product matrix M · M^cof corresponds instead to Lagrange expansion along the jth row of the matrix obtained from M by replacing its jth row with its ith row. Since this new matrix has two equal rows, its determinant is 0, so all the off-diagonal entries of M · M^cof are 0. The result follows. //
Corollary  If M ∈ Mat_{n,n} has nonzero determinant, then
$$M^{-1} = \frac{1}{\det M} \cdot M^{cof}. \quad //$$
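Both corollaries can be verified directly; the sketch below (an assumed example in sympy, whose built-in name for M^cof is the adjugate) builds M^cof entry by entry, checks M · M^cof = (det M)I, and recovers the inverse.

```python
# Build the cofactor matrix M_cof with (M_cof)_{rc} = a*_{cr} and verify
# M * M_cof = (det M) I together with the inverse formula.
import sympy as sp

M = sp.Matrix([[2, 1, 0], [1, 3, 4], [0, 5, 6]])
n = M.rows

# Entry (r, c) of M_cof is (-1)^(c+r) det M_{cr}: note the index reversal.
Mcof = sp.Matrix(n, n,
                 lambda r, c: (-1)**(r + c) * M.minor_submatrix(c, r).det())

assert M * Mcof == M.det() * sp.eye(n)   # the first corollary
assert Mcof == M.adjugate()              # sympy's built-in name for M_cof
assert M.inv() == Mcof / M.det()         # the second corollary
```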
The Characteristic Polynomial
Returning to the discussion of finding eigenvalues for endomorphisms T: V → V on a vector space V, we have seen that e is an eigenvalue for T precisely when the related endomorphism T − e·Id is singular. If we represent T relative to a basis {V_1, V_2, …, V_n} for V by the matrix M, then, since the representation of the identity map Id relative to this basis is the identity matrix I, it follows that the map T − e·Id is represented by the matrix M − eI. Therefore, e is an eigenvalue of T if and only if it satisfies the equation
$$\det(M - eI) = 0.$$
If we use Lagrange expansion to work out the expression det(M − tI), we find that it is a polynomial of degree n in t. (Why?) For this reason, we define
$$\Delta_M(t) = \det(M - tI)$$
to be the characteristic polynomial of the matrix M. The roots of Δ_M(t) are then the eigenvalues of T.
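As a small worked instance (the 2 × 2 matrix is our own choice), the sketch below forms det(M − tI) with sympy, factors it, and solves for the eigenvalues.

```python
# Characteristic polynomial Delta_M(t) = det(M - t I) and its roots.
import sympy as sp

t = sp.symbols("t")
M = sp.Matrix([[2, 1], [1, 2]])

Delta = (M - t*sp.eye(2)).det()
print(sp.factor(Delta))      # (t - 1)*(t - 3), a degree-2 polynomial in t
print(sp.solve(Delta, t))    # [1, 3]: the eigenvalues
```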
Notice that if we choose a different basis
{U_1, U_2, …, U_n} for V, this will lead to a different
matrix representation for T; call this matrix N.
But then there is an invertible change of basis
matrix P that satisfies N = PMP^{-1}. So the characteristic polynomial of the matrix N is
$$\begin{aligned}
\det(N - tI) &= \det(PMP^{-1} - tPP^{-1}) \\
&= \det(P[M - tI]P^{-1}) \\
&= \det(P) \cdot \det(M - tI) \cdot \det(P^{-1}) \\
&= \det(P) \cdot \det(M - tI) \cdot \det(P)^{-1} \\
&= \det(M - tI).
\end{aligned}$$
In other words, matrices that represent the same
€
endomorphism have the same characteristic
polynomial. It follows that the characteristic
polynomial depends only on T and not on the
matrix representation. We can then denote the characteristic polynomial of T by Δ_T(t) (or simply
Δ(t) when the context is sufficient to indicate what
the underlying map is).
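The invariance just derived is easy to see in a concrete case; in this sketch (matrices of our own choosing), conjugation leaves the characteristic polynomial unchanged.

```python
# Similar matrices N = P M P^{-1} share a characteristic polynomial.
import sympy as sp

t = sp.symbols("t")
M = sp.Matrix([[2, 1], [0, 3]])
P = sp.Matrix([[1, 1], [1, 2]])   # invertible: det P = 1

N = P * M * P.inv()
assert sp.expand((N - t*sp.eye(2)).det() - (M - t*sp.eye(2)).det()) == 0
```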
It follows that finding eigenvalues e for a map T
boils down to solving the polynomial equation
det(M – tI) = 0. The corresponding eigenspaces are
found by solving the system of equations
determined by the relation (T – eI)(X) = 0. These
eigenspaces are identified via their basis vectors.
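The whole procedure fits in a few lines; the following sketch (again sympy, with an assumed matrix) finds the eigenvalues from the characteristic polynomial and each eigenspace as the null space of M − eI.

```python
# Eigenvalues from det(M - tI) = 0; eigenspaces as nullspaces of M - eI.
import sympy as sp

t = sp.symbols("t")
M = sp.Matrix([[2, 1], [1, 2]])

for e in sp.solve((M - t*sp.eye(2)).det(), t):   # eigenvalues 1 and 3
    basis = (M - e*sp.eye(2)).nullspace()        # basis vectors of V_e
    print(e, [list(v) for v in basis])
```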
Theorem  If e_1, e_2, …, e_m are distinct eigenvalues for the endomorphism T: V → V on a vector space V, and V_1, V_2, …, V_m are corresponding eigenvectors, then these vectors form a linearly independent set in V.
Proof  If there is a linear dependence amongst V_1, V_2, …, V_m, then by removing vectors one by one from this list we eventually obtain a linearly independent set. So there is some smallest subset of {V_1, V_2, …, V_m} that is still linearly dependent. By renumbering, if necessary, we may assume that this smallest subset is {V_1, V_2, …, V_k}. It follows that V_k is a linear combination of the lower-indexed V's:
$$(*) \qquad V_k = a_1 V_1 + a_2 V_2 + \cdots + a_{k-1} V_{k-1}.$$
Applying T gives
$$T(V_k) = a_1 T(V_1) + a_2 T(V_2) + \cdots + a_{k-1} T(V_{k-1}),$$
or
$$e_k V_k = a_1 e_1 V_1 + a_2 e_2 V_2 + \cdots + a_{k-1} e_{k-1} V_{k-1}.$$
If we multiply (*) through by e_k and subtract from
this last equation, we obtain
$$0 = a_1(e_1 - e_k)V_1 + a_2(e_2 - e_k)V_2 + \cdots + a_{k-1}(e_{k-1} - e_k)V_{k-1}.$$
But since {V_1, V_2, …, V_k} is the smallest subset of {V_1, V_2, …, V_m} that is linearly dependent, {V_1, V_2, …, V_{k-1}} is linearly independent. Thus, for every i = 1, 2, …, k − 1, a_i(e_i − e_k) = 0, and since the eigenvalues are distinct, e_i − e_k ≠ 0, whence a_i = 0. Returning to (*), we find that V_k = 0. But this is impossible since 0 cannot be an eigenvector. We have proven by contradiction that there cannot be any linear dependencies amongst the eigenvectors V_1, V_2, …, V_m. //
Corollary  If the endomorphism T: V → V on the n-dimensional vector space V has a characteristic polynomial Δ(t) with n distinct real roots, then T is diagonalizable.

Proof  If Δ(t) has n distinct real roots, then T has n distinct real eigenvalues, and so the corresponding n eigenvectors form a linearly independent set of n vectors in V. Thus, they form a basis of V relative to which the matrix of T is diagonal. //
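For instance (our own 2 × 2 example, in the same sympy sketch style as above), a matrix with distinct real eigenvalues diagonalizes directly.

```python
# Distinct real eigenvalues => diagonalizable: M = P D P^{-1}.
import sympy as sp

M = sp.Matrix([[2, 1], [1, 2]])   # eigenvalues 1 and 3
P, D = M.diagonalize()            # columns of P are eigenvectors

assert M == P * D * P.inv()
assert sorted([D[0, 0], D[1, 1]]) == [1, 3]
```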
See Example 3 (pp. 248-251) for a situation in which this occurs and the corresponding map is diagonalized.
Then see Example 4 (p. 238) for a situation in
which the map has distinct eigenvalues but is not
diagonalizable. How can this be? Note that the
distinct eigenvalues are both complex; the
corresponding characteristic polynomial has no real
roots!
What then can prevent an endomorphism from
being diagonalized? As we noticed in the previous
paragraph, the characteristic polynomial may have
some roots that are complex and not real. But
another possibility may occur: see Example 2 (p.
258). Here, the map has only real eigenvalues, but
they are not distinct; further, the underlying vector
space has dimension greater than that of the eigenspace of
the only eigenvalue, so it is impossible to find a
basis for the space made up of eigenvectors for the
map. This situation makes clear the need for the
following definitions.
If T:V → V is an endomorphism and e is an
eigenvalue for T, then the algebraic multiplicity
of e is the multiplicity of e as a root of Δ(t), while
the geometric multiplicity of e is the dimension
of its eigenspace V_e.
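The gap between the two multiplicities is visible in the smallest defective example (our own choice, in the same sympy style): for the matrix below, the eigenvalue 1 has algebraic multiplicity 2 but a one-dimensional eigenspace.

```python
# Algebraic vs. geometric multiplicity for a defective matrix.
import sympy as sp

M = sp.Matrix([[1, 1], [0, 1]])

alg = M.eigenvals()                        # {1: 2}, so alpha = 2
geo = len((M - 1*sp.eye(2)).nullspace())   # dim V_1 = 1, so gamma = 1
assert alg[1] == 2 and geo == 1
```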
Proposition  If α is the algebraic multiplicity of the eigenvalue e for the endomorphism T: V → V and γ is its geometric multiplicity, then α ≥ γ.
Proof  Let {V_1, V_2, …, V_γ} be a basis for V_e. Extend this to a basis {V_1, V_2, …, V_n} for V. Since T(V_i) = eV_i for i = 1, 2, …, γ, the matrix for T in the basis {V_1, V_2, …, V_n} has the form
$$\begin{pmatrix} eI & B \\ 0 & D \end{pmatrix},$$
where I is γ × γ and D is (n − γ) × (n − γ) (and 0 and B are correspondingly sized). So the characteristic polynomial for T is
$$\Delta(t) = \det\begin{pmatrix} (e - t)I & B \\ 0 & D - tI \end{pmatrix} = \det((e - t)I) \cdot \det(D - tI) = (e - t)^{\gamma} \cdot \det(D - tI),$$
where the middle equality follows by repeated Lagrange expansion along the first column. It follows that e is a root of Δ(t) with multiplicity α ≥ γ. //
Corollary  T: V → V is diagonalizable ⇔ α = γ for each eigenvalue of T and the sum of these multiplicities over all the eigenvalues of T equals dim V.
Proof  (⇒) If T is diagonalizable, then in some basis {V_1, V_2, …, V_n} for V it has a diagonal matrix representation, and the diagonal entries of this matrix are eigenvalues of T. We may assume, by renumbering the V's and their corresponding eigenvalues if necessary, that the matrix has the form
$$\begin{pmatrix} e_1 I_1 & 0 & \cdots & 0 \\ 0 & e_2 I_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & e_k I_k \end{pmatrix},$$
where e_1, e_2, …, e_k are distinct eigenvalues of T and each of the I's is an identity matrix of a certain size such that the sum of these sizes equals dim V. In fact, the form of this matrix implies that the characteristic polynomial is
$$\Delta(t) = (e_1 - t)^{\alpha_1}(e_2 - t)^{\alpha_2} \cdots (e_k - t)^{\alpha_k},$$
where the α's are the corresponding algebraic
multiplicities of the eigenvalues. We then deduce from the matrix that each I_i has size α_i × α_i, whence the eigenspace corresponding to e_i has dimension α_i. But by definition the dimension of this eigenspace is γ_i, so α_i = γ_i. Finally, it is clear that the sum of the α_i equals dim V.

(⇐) Suppose that α_i = γ_i for each of the k eigenvalues e_1, e_2, …, e_k of T and that the sum of the γ_i equals dim V. Then, since each of the k eigenspaces has dimension γ_i, combining bases for all the eigenspaces produces a basis for V. Since V has a basis consisting of eigenvectors for T, it is diagonalizable. //
Corollary  T: V → V is diagonalizable ⇔ α = γ for each eigenvalue of T and all the eigenvalues of T are real.

Proof  (⇒) Clear.
(⇐) If all the eigenvalues are real, then
$$\Delta(t) = (e_1 - t)^{\alpha_1}(e_2 - t)^{\alpha_2} \cdots (e_k - t)^{\alpha_k}.$$
So dim V = Σ α_i = Σ γ_i, and combining bases for the eigenspaces gives a basis of eigenvectors for V. //