3 Matrix Algebra
A matrix is a rectangular array of numbers; it is of size m × n if it has m rows and n columns.
A 1 × n matrix is a row vector; an m × 1 matrix is a column vector. For example:
[ 1  5 −3 ]        [ 3 5 8 −1 −2 ]        [ 5 ]
[ 0 −1  7 ]                               [ 8 ]

are a 2 × 3 matrix, a 1 × 5 row vector and a 2 × 1 column vector respectively.
The (i, j)-entry of a matrix is the number in the ith row and the jth column. If we want to
discuss a general m × n matrix A we will denote its (i, j)-entry by aij , so that
A = [ a11 a12 a13 · · · a1n ]
    [ a21 a22 a23 · · · a2n ]
    [  ⋮   ⋮   ⋮         ⋮  ]
    [ am1 am2 am3 · · · amn ]
A square matrix is one of size n × n, i.e. it has the same number of rows as columns.
Two matrices A and B are equal if they are of the same size, and their corresponding
entries are equal.
3.1 Operations on matrices
Matrices A and B can be added if they are of the same size; the sum A + B is formed by
adding the corresponding entries, so that the (i, j)-entry of A + B is aij + bij . For example
[  1 2 3 ]   [ 1 1 −1 ]   [  1 + 1  2 + 1  3 − 1 ]   [ 2 3 2 ]
[ −1 2 0 ] + [ 2 0  6 ] = [ −1 + 2  2 + 0  0 + 6 ] = [ 1 2 6 ]
The zero matrix (of any given size m × n) is denoted 0, and has only 0 as its entries. It
satisfies A + 0 = 0 + A = A. For example
[ 1 −5 ]   [ 0 0 ]   [ 1 −5 ]
[ 2  3 ] + [ 0 0 ] = [ 2  3 ]
If A is any matrix and k any number, then the scalar multiple kA is defined by multiplying
each entry of A by k. For example
  [ 1 −2  0 ]   [  5 −10   0 ]
5 [ 3  4 −7 ] = [ 15  20 −35 ]
By definition −A means (−1)A, so that A − B means A + (−1)B, the matrix obtained by
subtracting the entries of B from those of A.
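These entrywise definitions are easy to experiment with. The following is a small sketch in plain Python (matrices as lists of rows, not any library API), reproducing examples of the same shape as those above.

```python
# Entrywise matrix operations, as defined above; matrices are lists of rows.

def madd(A, B):
    """Entrywise sum A + B; A and B must be the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def smul(k, A):
    """Scalar multiple kA: multiply every entry of A by k."""
    return [[k * a for a in row] for row in A]

def msub(A, B):
    """A - B means A + (-1)B."""
    return madd(A, smul(-1, B))

print(madd([[1, 2, 3], [-1, 2, 0]], [[1, 1, -1], [2, 0, 6]]))
# [[2, 3, 2], [1, 2, 6]]
print(smul(5, [[1, -2, 0], [3, 4, -7]]))
# [[5, -10, 0], [15, 20, -35]]
```

Note that A − A gives the zero matrix of the same size, matching the identity A − A = 0 listed below.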
Properties of these two operations include:
A + B = B + A;
(A + B) + C = A + (B + C);
(k + l)A = kA + lA;
(kp)A = k(pA);
k(A + B) = kA + kB;
A − A = 0;
0A = 0,
provided that A, B and C are all of the same size. Note that 0 is used both as a scalar and
as a matrix in the last identity.
If A is an m × n matrix then the transpose of A, denoted AT , is the n × m matrix whose
rows are the columns of A written in the same order. For example
[ 1 2 ]T
[ 3 4 ]  = [ 1 3 5 ]
[ 5 6 ]    [ 2 4 6 ]

and

[  1  3 0 ]T   [ 1  2 −1 ]
[  2 −1 5 ]  = [ 3 −1  0 ]
[ −1  0 4 ]    [ 0  5  4 ]
Note that the (i, j)-entry of AT is the (j, i)-entry of A. Properties of the transpose operation
include:
(AT )T = A;
(kA)T = kAT ;
(A + B)T = AT + BT.
The matrix A is symmetric if A = AT . In this case A must have as many rows as it has
columns, so is square.
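As a quick sketch (plain Python lists again, not a library API), the transpose and the symmetry test look like this:

```python
# The rows of A^T are the columns of A: the (i, j)-entry of A^T is a_ji.

def transpose(A):
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    """A is symmetric when A equals its own transpose (so A must be square)."""
    return A == transpose(A)

print(transpose([[1, 2], [3, 4], [5, 6]]))   # [[1, 3, 5], [2, 4, 6]]
print(is_symmetric([[1, 7], [7, 2]]))        # True
```

Applying `transpose` twice returns the original matrix, which is the identity (AT)T = A above.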
Exercise 3.1. Solve the following equations in each case to find the matrix A:
(i) ( AT + [ 0 2 9 ] )T = 3A + 2 [ −1  6 ]
           [ 3 2 5 ]             [ −7 19 ]
                                 [  4  0 ]

(ii) 3A + 2AT = [ 10 0 ]
                [ −5 0 ]

Solution.
Exercise 3.2. Find the general form of a 2 × 2 symmetric matrix.
Solution.
The final operation we shall discuss is matrix multiplication. If A is a k × m matrix and
B is an m × n matrix then their product is the k × n matrix whose (i, j)-entry is computed
by multiplying each entry of row i of A by the corresponding entry of column j of B and
summing the results. That is, the (i, j)-entry of AB is the dot product of row i of A with
column j of B.


For example if

A = [ 1  8 ]         B = [  5 9 ]
    [ 3 −2 ]  and        [ −2 7 ]
    [ 0  4 ]

then

AB = [ 1  8 ] [  5 9 ]   [ 1×5 + 8×(−2)       1×9 + 8×7     ]   [ −11 65 ]
     [ 3 −2 ] [ −2 7 ] = [ 3×5 + (−2)×(−2)    3×9 + (−2)×7  ] = [  19 13 ]
     [ 0  4 ]            [ 0×5 + 4×(−2)       0×9 + 4×7     ]   [  −8 28 ]
Unlike our earlier operations involving two or more matrices, we no longer require A and B
to be the same size. Above, A is a 3 × 2 matrix and B is a 2 × 2 matrix, so we can form the
product AB, which is a 3 × 2 matrix. However, B does not have the same number of columns
as A has rows (equivalently, the rows of B and the columns of A are of different lengths), so
BA is not defined in this case.
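The dot-product description of the (i, j)-entry translates directly into code. The sketch below (plain Python, not part of the notes) reproduces the 3 × 2 times 2 × 2 product above, and refuses mismatched sizes just as the definition does.

```python
# (i, j)-entry of AB = dot product of row i of A with column j of B.

def matmul(A, B):
    assert len(A[0]) == len(B), "columns of A must match rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 8], [3, -2], [0, 4]]
B = [[5, 9], [-2, 7]]
print(matmul(A, B))   # [[-11, 65], [19, 13], [-8, 28]]
```

Attempting `matmul(B, A)` trips the size assertion, mirroring the fact that BA is undefined here.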
Matrices A and B are said to commute if AB = BA. In particular if A is an m × n
matrix and B a p × q matrix, then for the first product to be defined we need n = p and for
the second product we need q = m. The product AB is then an m × q matrix and BA is a
p × n matrix, and if AB = BA they must be the same size, hence
m = p = n = q,
that is, A and B are both square and of the same size. However given any two square n × n
matrices A and B, it need not be true that AB = BA. For example:
A = [ 0  1 ]      B = [ 1 1 ]       AB = [  3 0 ]        BA = [ 2 0 ]
    [ 2 −1 ] ,        [ 3 0 ]   ⇒        [ −1 2 ]   but       [ 0 3 ]
The identity matrix of size n×n is the matrix with 1 appearing down the (main) diagonal,
and 0 elsewhere. We write this as I, or In if we want to specify the size. So


I2 = [ 1 0 ]        I3 = [ 1 0 0 ]
     [ 0 1 ] ,           [ 0 1 0 ] ,   . . .
                         [ 0 0 1 ]
Note that
[ 1 0 ] [ 3 −2  4 ]   [ 3 −2  4 ]   [ 3 −2  4 ] [ 1 0 0 ]
[ 0 1 ] [ 0  2 −6 ] = [ 0  2 −6 ] = [ 0  2 −6 ] [ 0 1 0 ]
                                                [ 0 0 1 ]
In general, if A is an m × n matrix then Im A = AIn = A. Other properties of matrix
multiplication include
A(BC) = (AB)C;
A(B + C) = AB + AC;
(A + B)C = AC + BC;
(AB)T = BT AT ;
k(AB) = (kA)B;
whenever these products are defined.
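Two of these identities, A(BC) = (AB)C and (AB)T = BT AT, are spot-checked numerically below on concrete matrices of my own choosing (a sanity sketch in plain Python, not a proof):

```python
# Spot-checking A(BC) = (AB)C and (AB)^T = B^T A^T on concrete matrices.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2], [0, -1]]
B = [[3, -1], [2, 0]]
C = [[0, 1], [1, 1]]

assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)             # A(BC) = (AB)C
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))  # (AB)^T = B^T A^T
print("both identities hold on this example")
```

Note the order reversal in the transpose rule: the check fails if one writes AT BT instead.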
Exercise 3.3. Solve the following equations to find the matrix A:


(i) 3A = [ 1 −9 −3 ]          (ii) 2A − AT = [ 0  2 ]
         [ 4  6  0 ]                         [ 7 −5 ]
         [ 2  3 −4 ]
Solution.
3.2 Linear systems: the matrix viewpoint
Consider the following system of linear equations:
2x1 − 3x2 + x3 = 2
−x1 + x2 + 3x3 = 0
These two equations can be combined into a single equation about 2 × 1 column vectors:
 
[ 2x1 − 3x2 + x3 ]   [ 2 ]        [  2 −3 1 ] [ x1 ]   [ 2 ]
[ −x1 + x2 + 3x3 ] = [ 0 ]   ⇔   [ −1  1 3 ] [ x2 ] = [ 0 ]
                                             [ x3 ]
That is, our system of equations is the same as the single equation AX = B, where
 
A = [  2 −3 1 ]        [ x1 ]            [ 2 ]
    [ −1  1 3 ] ,  X = [ x2 ]   and  B = [ 0 ]
                       [ x3 ]
This can be done for any system of m equations in n unknowns, where A is the m × n matrix
of coefficients, X the n × 1 matrix of unknowns, and B the m × 1 matrix of constants.
Theorem 3.4. Suppose a linear system of equations is written as AX = B, as above, and
suppose that X1 is a solution of this system. Any other solution X2 of AX = B is of the
form X2 = X1 + X0 where X0 solves the associated homogeneous system AX = 0; conversely
any vector of the form of X2 is a solution of AX = B.
Proof. Suppose that X1 and X2 both solve the given (nonhomogeneous) system. That is,
AX1 = B and AX2 = B. Then
A(X2 − X1 ) = AX2 − AX1 = B − B = 0,
and so X0 := X2 − X1 is a solution of the associated homogeneous system such that X2 = X1 + X0 .
Conversely, suppose that X0 solves AX = 0, and that X1 solves AX = B. Let X2 =
X1 + X0 , then
AX2 = A(X1 + X0 ) = AX1 + AX0 = B + 0 = B,
so that X2 is a solution of the nonhomogeneous system.
What this result tells us is that to find all solutions to a given linear system AX = B it
is enough to find just one solution and then find all solutions to the equation AX = 0. We
shall see this principle again in Section 6.
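Theorem 3.4 can be illustrated numerically with the system from the start of this section. The particular solution X1 and homogeneous solution X0 below were found by hand for this sketch:

```python
# If A X1 = B and A X0 = 0, then A (X1 + X0) = B, as Theorem 3.4 asserts.

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[2, -3, 1], [-1, 1, 3]]    # coefficient matrix of the system above
B = [2, 0]
X1 = [-2, -2, 0]                # one particular solution, found by hand
X0 = [10, 7, 1]                 # a solution of the homogeneous system AX = 0

assert matvec(A, X1) == B
assert matvec(A, X0) == [0, 0]
assert matvec(A, [p + h for p, h in zip(X1, X0)]) == B
print("particular + homogeneous is again a solution")
```

Any multiple of X0 works equally well in the last line, which is exactly the one-parameter family of solutions of this rank-2 system in three unknowns.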
Recall that for the homogeneous system AX = 0 there is always at least one solution,
namely X = 0. If there is more than one, that is if the rank r of A is smaller than the number
of columns n, then there are infinitely many solutions given by varying the n − r parameters. In this
case we can find n − r nonzero vectors X1 , X2 , . . . , Xn−r such that any solution of AX = 0
can be written as t1 X1 + t2 X2 + · · · + tn−r Xn−r for numbers t1 , t2 , . . . , tn−r . Moreover these
vectors have the property that
t1 X1 + t2 X2 + · · · + tn−r Xn−r = 0 only if t1 = t2 = · · · = tn−r = 0.
This situation is described by saying that the set of vectors X1 , X2 , . . . , Xn−r is linearly
independent. An example of linearly independent vectors are the basis vectors i, j, k in R3 ,
since
t1 i + t2 j + t3 k = (t1 , 0, 0) + (0, t2 , 0) + (0, 0, t3 ) = (t1 , t2 , t3 )
and this equals (0, 0, 0) only if t1 = t2 = t3 = 0.
Remarks. (i) An alternative but equivalent definition for linear independence is that the set
of vectors X1 , X2 , . . . , Xn−r is linearly independent if whenever X is any vector that can be
written as a linear combination of X1 , X2 , . . . , Xn−r , then this can be done in only one way.
Again, this is clear for the standard basis vectors i, j, k, where the position vector for any
point in R3 can be written uniquely as a linear combination of these three.
(ii) It follows that any two vectors X1 and X2 are linearly independent if and only if they
are not parallel, i.e. they are not multiples of one another.
Exercise 3.5. Show that x1 = 3, x2 = 4, x3 = −2, x4 = 0 and x5 = 1 is a solution of the
system
2x1 + x2 − 2x3 + 3x4 − x5 = 13
3x1 − 2x2 + 7x4 + 5x5 = 6
4x1 − 5x2 + 2x3 + 11x4 + 11x5 = −1
Find the general solution by solving the associated homogeneous system.
Solution.
3.3 Exercises


1. Let A = [ 2  1 ] ,  B = [ 3 −1 2 ] ,  C = [ 3 −1 ]  and  D = [  1 3 ]
           [ 0 −1 ]        [ 0  1 4 ]        [ 2  0 ]           [ −1 0 ] .
                                                                [  1 4 ]

Compute the following (where possible):

(i) 3A − 2B    (ii) 5C    (iii) 4AT − 3C
(iv) B + D    (v) (A + C)T    (vi) A − D

2. Find A if

(i) 5A − [ 1 0 ] = 3A − [ 5 2 ]
         [ 2 3 ]        [ 6 1 ]

(ii) 3A + [ 2 ] = 5A − 2 [ 1 ]
          [ 3 ]          [ 0 ]

(iii) ( 3AT + 2 [ 1 0 ] )T = [ 8 0 ]
                [ 0 2 ]      [ 3 1 ]

(iv) ( 2AT − 5 [  1 0 ] )T = 4A − 9 [  1 1 ]
               [ −1 2 ]             [ −1 0 ]
3. Compute the following matrix products (if possible):

(i) [ 1 −1 2 ] [  2 3 1 ]        (ii) [ 1 3 −3 ] [  3 ]
    [ 2  0 4 ] [  1 9 7 ]                        [ −2 ]
               [ −1 0 2 ]                        [  0 ]

(iii) [ 3 1 ] [  2 −1 ]          (iv) [ a 0 0 ] [ a′ 0  0 ]
      [ 5 2 ] [ −5  3 ]               [ 0 b 0 ] [ 0  b′ 0 ]
                                      [ 0 0 c ] [ 0  0  c′ ]

(v) [ 1 2  4 ] [ −1 ]            (vi) [  2 0 ] [ 1 6 ]
    [ 0 1 −1 ] [  1 ]                 [ −1 1 ] [ 6 0 ]
               [  0 ]                 [  1 2 ]
4. In both cases express every solution of the given system as the sum of a specific solution
plus a solution of the associated homogeneous system:

(i) x − y − 4z = −4
    x + 2y + 5z = 2
    x + y + 2z = 0

(ii) 2x1 + x2 − x3 − x4 = −1
     3x1 + x2 + x3 − 2x4 = −2
     −x1 − x2 + 2x3 + x4 = 2
     −2x1 − x2 + 2x4 = 3
5. Let A, B and C be matrices.
(a) If A2 can be formed, what can be said about the size of A?
(b) If AB and BA can both be formed, describe the sizes of A and B.
(c) If ABC can be formed, A is 3 × 3 and C is 5 × 5, what size is B?
3.4 Matrix inverses
If a and b are numbers with a ≠ 0 then it is easy to solve the equation ax = b — just divide
both sides by a to get x = b/a. Note also that this is the only solution. In the case when a = 0,
the equation ax = b can only be solved if b = 0 as well, in which case any value of x ∈ R is a
solution. If a = 0 and b ≠ 0 then there are no solutions.
Now any system of m linear equations in n unknowns can be written as AX = B, where
A is m × n, X is n × 1 and B is m × 1. By analogy with the above we would like to solve this
system by “dividing” by A; however matrix division does not make sense. In the one variable
case recall that when a ≠ 0 we define a−1 = 1/a, so that the solution of ax = b is x = a−1 b.
Definition 3.6. An n × n matrix A is invertible if there is another n × n matrix B such that
AB = BA = I.
B is called the inverse of A.
Remark. In the definition we assumed that A is a square matrix, and consequently B is a
square matrix of the same size. To see why we do this, suppose it were possible to find an
m × n matrix A and an n × m matrix B such that m < n and
AB = Im ,
BA = In .
Since A has more columns than rows, its rank is no greater than m, and hence less than n,
thus there is a nontrivial solution X to the equation AX = 0. But then
X = In X = (BA)X = B(AX) = B0 = 0,
contradicting the fact that X ≠ 0. So we must have m ≥ n. But then, by symmetry, it follows
that m ≤ n as well, so that m = n, and hence it only makes sense to talk about inverses of
square matrices.
Consider the following matrices:
A = [ 0 −1 ]         B = [  3 1 ]
    [ 1  3 ] ,           [ −1 0 ] .

We have

AB = [ 0 −1 ] [  3 1 ]   [ 1 0 ]
     [ 1  3 ] [ −1 0 ] = [ 0 1 ] = I = BA.
That is, B is the inverse of A, and A is the inverse of B. However consider the matrix

C = [ 0  0 ]
    [ 1 −2 ] .

This has no inverse since

C [ a b ] = [ 0  0 ] [ a b ] = [    0        0    ] ≠ I,
  [ c d ]   [ 1 −2 ] [ c d ]   [ a − 2c   b − 2d ]

and there is no way to make the top left entry of the product equal to 1. So, unlike numbers,
it is possible to have noninvertible matrices that are nonzero.
However, given a nonzero number a, its multiplicative inverse is a−1 = 1/a — it only
has the one inverse. The same is true for matrices:
Proposition 3.7. Any n × n matrix has at most one inverse.
Proof. Suppose that A, B and C are n × n matrices such that
AB = I   and   CA = I.        (†)
Then
C = CI = C(AB) = (CA)B = IB = B.
Since the equations in (†) must hold if B and C are inverses of A, it follows that B and C
must be the same, and so there can only be at most one inverse to A.
Since an invertible matrix A has only one inverse, we can denote it by A−1 , and talk about
the inverse of A.
Theorem 3.8. If A is an invertible n × n matrix then for any n × 1 vector B there is a
unique solution to AX = B, namely A−1 B.
Proof. Consider the column vector X1 = A−1 B. This satisfies
AX1 = A(A−1 B) = (AA−1 )B = IB = B,
and so is a solution. On the other hand, if X2 is any solution, then
AX2 = B ⇒ A−1 (AX2 ) = A−1 B ⇒ (A−1 A)X2 = A−1 B ⇒ IX2 = X2 = A−1 B,
and so A−1 B is the only solution.
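Theorem 3.8 can be seen in action with the pair A, B = A−1 from the example above. The right-hand side vector here is an arbitrary choice for the sketch:

```python
# X = A^{-1} B solves AX = B, and is the unique solution (Theorem 3.8).

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A     = [[0, -1], [1, 3]]
A_inv = [[3, 1], [-1, 0]]   # checked earlier: A A_inv = A_inv A = I
B     = [2, 5]              # an arbitrary right-hand side

X = matvec(A_inv, B)
print(X)                    # [11, -2]
assert matvec(A, X) == B    # it really does solve the system
```

Changing B changes X, but the same A_inv serves for every right-hand side, which is the practical appeal of computing an inverse once.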
Remark. If n ≥ 2, because of the sizes of the matrices involved, A−1 B makes sense but BA−1
is not defined — unlike the case with numbers, where a−1 b = ba−1 = b/a.


Exercise 3.9 (S04 6(a)). Find the inverse of the matrix

A = [ 1 2 0 ]
    [ 0 2 3 ]
    [ 1 3 1 ]

Hence, or otherwise, find the solutions of the following systems of linear equations:
(i) x + 2y = 3
    2y + 3z = −4
    x + 3y + z = 0

(ii) 7x + 2y − 6z = 2
     −3x − y + 3z = −1
     2x + y − 2z = 5
Solution.
Exercise 3.10. Solve the system
−y = a
x + 3y = b .
Solution.
Given a 2 × 2 matrix A = [ a b ]  define adj(A) := [  d −b ] . Consider their product:
                         [ c d ]                   [ −c  a ]

A adj(A) = [ a b ] [  d −b ]   [ ad − bc      0    ]
           [ c d ] [ −c  a ] = [    0     ad − bc  ] = (ad − bc)I = adj(A)A.

If det A := ad − bc ≠ 0 then we can divide through by this quantity and find that:

A−1 = (1/ det A) adj(A),   i.e.   [ a b ]−1 =     1     [  d −b ] .
                                  [ c d ]      ad − bc  [ −c  a ]
On the other hand, suppose that A is invertible. Then
adj(A) = I adj(A) = (A−1 A) adj(A) = A−1 (A adj(A)) = A−1 (det A)I = (det A)A−1 .
This shows that we cannot have det A = 0, since this would imply that adj(A) = 0, and so
A = 0, which is clearly not possible since A is assumed to be invertible.
Overall, we see that 2 × 2 matrices are invertible if and only if their determinants are
nonzero. This is also true for larger matrices (once we have defined their determinants).
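The 2 × 2 formula just derived fits in a few lines of Python. This is only a sketch (using exact `Fraction` arithmetic so no rounding occurs), and it recovers the inverse pair from earlier in this section:

```python
# 2x2 inverse via the adjugate: A^{-1} = (1/det A) adj(A), det A = ad - bc.

from fractions import Fraction

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[ f * d, -f * b],
            [-f * c,  f * a]]

print(inverse_2x2([[0, -1], [1, 3]]))   # [[3, 1], [-1, 0]], as in the text
```

The noninvertible matrix C = [0 0; 1 −2] from earlier has det C = 0, so the function raises an error for it, matching the discussion above.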
Theorem 3.11. Let A be an n × n matrix and R its reduced row-echelon form. The following
statements are equivalent:
(i) A is invertible.
(ii) The trivial solution is the only solution to AX = 0.
(iii) R = In .
(iv) The rank of A is n.
(v) There is at least one solution to AX = B for every possible choice of B.
(vi) There is an n × n matrix C such that AC = I.
Proof. (i ⇒ ii) If A is invertible then the unique solution of AX = 0 is A−1 0 = 0, by
Theorem 3.8.
(ii ⇔ iii) R ≠ In precisely if R has a row of zeros, which is equivalent to there being more
than one solution to AX = 0.
(iii ⇔ iv) Since A is square, it is clear that R = In if and only if A has rank n.
(iv ⇔ v) If the rank of A is equal to the number of rows then there can be no rows of zeros
in the reduced row-echelon form R of A, and so the system AX = B cannot be inconsistent
for any B. Hence there is always a solution.
On the other hand if the rank of A was less than n, then R would have a row of zeros,
and we could construct a vector B for which the system would be inconsistent.
(v ⇔ vi) Let

X1 = [ 1 ]       X2 = [ 0 ]                  Xn = [ 0 ]
     [ 0 ]            [ 1 ]                       [ 0 ]
     [ ⋮ ] ,          [ ⋮ ] ,   . . . ,           [ ⋮ ] .        (†)
     [ 0 ]            [ 0 ]                       [ 1 ]
If (v) holds then we can find n × 1 vectors C1 , C2 , . . . , Cn such that ACi = Xi for each i.
Putting these together to make the n × n matrix C = [C1 C2 · · · Cn ] we have AC = In .
On the other hand, if C exists then for any given vector B let X = CB, so that AX =
A(CB) = (AC)B = IB = B, i.e. there is a solution.
We postpone the proof that conditions (ii–vi) imply A is invertible until a later stage.
This alternative characterisation of invertibility leads to a method involving row operations
for calculating inverses to n × n matrices. Suppose that A is invertible and write A−1 as
A−1 = [B1 B2 · · · Bn ] for column vectors Bi . We want to find these Bi , but since AA−1 =
[AB1 · · · ABn ] = In , we have ABi = Xi for the Xi defined in (†). So if we solve the n systems
AB1 = X1 , AB2 = X2 , . . . , ABn = Xn then we can get the Bi . This can be done by row
reduction, and in fact all n systems can be solved simultaneously by applying row operations
to the n × 2n matrix [ A | I ] and reducing A to its reduced row-echelon form R. If it turns out
that R = In then A is invertible, and the matrix on the right hand side will be A−1 . That is,
by use of row operations, we transform to get
[ A | In ] → [ In | A−1 ].
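The [ A | I ] → [ I | A−1 ] procedure is entirely mechanical, so it can be sketched in code. The version below (plain Python with exact `Fraction` arithmetic, and row swaps when a pivot is zero) raises an error exactly when R ≠ In, i.e. when A is not invertible:

```python
# Gauss-Jordan inversion: row-reduce [A | I]; the right half becomes A^{-1}.

from fractions import Fraction

def inverse(A):
    n = len(A)
    # build the augmented matrix [A | I]
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # find a nonzero pivot in this column and swap it into place
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]   # scale pivot row to 1
        for r in range(n):                           # clear the rest of the column
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

print(inverse([[2, 1, 1], [1, 3, 2], [1, 0, 0]]))
```

Multiplying the result back against A returns the identity, which is the defining property of the inverse.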

Exercise 3.12. Use row operations to show that

A = [  1 −2 1 ]
    [ −2  1 3 ]
    [ −1 −4 9 ]

is not invertible.

Solution.

Exercise 3.13. By finding inverses, solve the system:

x − 2z = 0
−3x + y + 4z = 6
2x − 3y + 4z = −4

Solution.

Exercise 3.14. Find the inverse of

A = [ 1 0 0 0 0 ]
    [ 1 1 0 0 0 ]
    [ 1 1 1 0 0 ]
    [ 1 1 1 1 0 ]
    [ 1 1 1 1 1 ]
Solution.
Properties of matrix inverses include the following:
(i) If A is invertible, so is A−1 , with (A−1 )−1 = A.
(ii) If A and B are invertible, so is AB, with (AB)−1 = B −1 A−1 .
(iii) If A is invertible, so is An for all n ≥ 1, with (An )−1 = (A−1 )n .
(iv) If A is invertible and k ≠ 0 then kA is invertible with (kA)−1 = (1/k)A−1 .
(v) If A is invertible, so is AT , with (AT )−1 = (A−1 )T .
For example if A and B are invertible then
(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 IB = B −1 B = I = (AB)(B −1 A−1 ),
so that (AB)−1 = B −1 A−1 .
Exercise 3.15. Find the invertible 3 × 3 matrix A that satisfies

A−1 [ 5  7 4 ]   [  1 0 3 ]
    [ 9 −8 0 ] = [ −1 1 0 ]
    [ 1 −2 3 ]   [  0 0 2 ]
Solution.
Example 3.16. Solve the following equation for the 2 × 2 matrix A:

( AT − [ 1 0 ] )T = [ 7 2 ] [ −5 3 ]−1 + 2A
       [ 3 2 ]      [ 0 4 ] [ −2 1 ]

Solution.

( AT − [ 1 0 ] )T = [ 7 2 ] [ −5 3 ]−1 + 2A
       [ 3 2 ]      [ 0 4 ] [ −2 1 ]

⇒  (AT )T − [ 1 0 ]T = [ 7 2 ] ×    1     [ 1 −3 ] + 2A
            [ 3 2 ]    [ 0 4 ]   −5 + 6   [ 2 −5 ]

⇒  A − [ 1 3 ] = [ 11 −31 ] + 2A
       [ 0 2 ]   [  8 −20 ]

⇒  A = − [ 1 3 ] − [ 11 −31 ] = [ −12 28 ]
         [ 0 2 ]   [  8 −20 ]   [  −8 18 ]
Example 3.17. Find the matrix B, given that A is another matrix such that

A = [ 5 7 ]         AB = [ 0 −3 ]
    [ 2 3 ]   and        [ 2  4 ]

Solution. Now

A−1 =       1        [  3 −7 ]   [  3 −7 ]
       5 × 3 − 7 × 2 [ −2  5 ] = [ −2  5 ] .

So if AB is as given, then

B = IB = (A−1 A)B = A−1 (AB) = [  3 −7 ] [ 0 −3 ]   [ −14 −37 ]
                               [ −2  5 ] [ 2  4 ] = [  10  26 ] .
Example 3.18 (S05 6(b)). Find the matrix A satisfying an equation of the form M A N = P , where

M = [  1 0 1 0 ]
    [ −1 2 0 0 ]
    [  0 0 1 0 ]
    [  4 0 0 1 ]

and N and P are given invertible matrices.

Solution. Since M and N are invertible, A = M −1 P N −1 , so the main work is to find M −1 .
Row-reduce [ M | I4 ]: applying R2 + R1 and R4 − 4 × R1 , then R1 − R3 , R2 − R3 and
R4 + 4 × R3 , and finally R2 × (1/2), transforms [ M | I4 ] into [ I4 | M −1 ] with

M −1 = [  1    0    −1   0 ]
       [ 1/2  1/2  −1/2  0 ]
       [  0    0     1   0 ]
       [ −4    0     4   1 ]

and then A = M −1 P N −1 is found by two matrix multiplications.
3.4.1 Elementary matrices

An elementary matrix is any matrix that can be obtained by applying an elementary row
operation to the identity matrix. For example

E1 = [ 1 0 0 ]        E2 = [ 1  0 0 ]        E3 = [ 1 2 0 ]
     [ 0 0 1 ] ,           [ 0 −3 0 ] ,           [ 0 1 0 ]
     [ 0 1 0 ]             [ 0  0 1 ]             [ 0 0 1 ]
     R2 ↔ R3               R2 × (−3)              R1 + 2 × R2
are all elementary matrices. Note what happens when multiplying any other 3 × 3 matrix A
on the left by one of the above:

E1 A = [ 1 0 0 ] [ 1  2 −3 ]   [ 1  2 −3 ]
       [ 0 0 1 ] [ 4 −5  7 ] = [ 0  1  3 ]
       [ 0 1 0 ] [ 0  1  3 ]   [ 4 −5  7 ]

E2 A = [ 1  0 0 ] [ 1  2 −3 ]   [   1   2  −3 ]
       [ 0 −3 0 ] [ 4 −5  7 ] = [ −12  15 −21 ]
       [ 0  0 1 ] [ 0  1  3 ]   [   0   1   3 ]

E3 A = [ 1 2 0 ] [ 1  2 −3 ]   [ 9 −8 11 ]
       [ 0 1 0 ] [ 4 −5  7 ] = [ 4 −5  7 ]
       [ 0 0 1 ] [ 0  1  3 ]   [ 0  1  3 ]
That is, multiplying by each of the matrices Ei produces the same effect on A as the row
operation used to create Ei from I, and this holds true for any elementary matrix.
However any row operation can be undone (swap the rows back round, or divide the row
by the nonzero constant, or subtract so many copies of one row from another), and so every
elementary matrix is invertible. For example, with the matrices from above,






E1−1 = [ 1 0 0 ]        E2−1 = [ 1   0   0 ]        E3−1 = [ 1 −2 0 ]
       [ 0 0 1 ] ,             [ 0 −1/3  0 ] ,             [ 0  1 0 ]
       [ 0 1 0 ]               [ 0   0   1 ]               [ 0  0 1 ]
We saw earlier that if a matrix A is invertible then its reduced row-echelon form must be
In . Suppose conversely that A can be reduced by means of row operations to In . That means
there are some elementary matrices E1 , E2 , . . . , Em such that
Em Em−1 · · · E3 E2 E1 A = In .
But each Ei is invertible, hence so is the product Em Em−1 · · · E3 E2 E1 . Moreover

(Em Em−1 · · · E2 E1 )−1 (Em Em−1 · · · E2 E1 )A = (Em Em−1 · · · E2 E1 )−1 In

and so

A = (Em Em−1 · · · E2 E1 )−1 = E1−1 E2−1 · · · Em−1−1 Em−1 .
But each Ei−1 is also an elementary matrix, so A is a product of such matrices, and hence
also invertible. This observation allows us to complete the proof of Theorem 3.11 (by showing
(iii) ⇒ (i)).
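The key fact of this subsection, that left-multiplication by an elementary matrix performs the corresponding row operation, is easy to confirm numerically (a sketch using the E1 and A from the displays above):

```python
# Left-multiplying by an elementary matrix performs the row operation
# that created it from the identity.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

# E1: swap R2 and R3 of the identity
E1 = identity(3)
E1[1], E1[2] = E1[2], E1[1]

A = [[1, 2, -3], [4, -5, 7], [0, 1, 3]]
assert matmul(E1, A) == [[1, 2, -3], [0, 1, 3], [4, -5, 7]]  # rows 2, 3 swapped
print("E1 A swaps rows 2 and 3, as claimed")
```

Since each row operation is reversible, each such E is invertible, which is the point used in the argument above.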
Exercise 3.19. Find the inverses of the following matrices, and express the matrices and
their inverses as products of elementary matrices:

(i) A = [ 4 −7 ]        (ii) B = [ 2 1 0 ]
        [ 3 −5 ]                 [ 4 3 5 ]
                                 [ 0 1 2 ]

Solution.
Example 3.20 (S05 6(c)). Write the matrix B and its inverse as products of elementary
matrices, where

B = [ 2 4 ]
    [ 3 7 ]

Solution.

B = [ 2 4 ]  R2 −R1   [ 2 4 ]  R1 −2×R2   [ 0 −2 ]  R1 ×(−1/2)   [ 0 1 ]  R2 −3×R1   [ 0 1 ]  R1 ↔R2   [ 1 0 ]
    [ 3 7 ]  −−−−→    [ 1 3 ]  −−−−−−→    [ 1  3 ]  −−−−−−−→     [ 1 3 ]  −−−−−−→    [ 1 0 ]  −−−−→    [ 0 1 ]

and so

B = [ 1 0 ] [ 1 2 ] [ −2 0 ] [ 1 0 ] [ 0 1 ]
    [ 1 1 ] [ 0 1 ] [  0 1 ] [ 3 1 ] [ 1 0 ]

B −1 = [ 0 1 ] [  1 0 ] [ −1/2 0 ] [ 1 −2 ] [  1 0 ]
       [ 1 0 ] [ −3 1 ] [  0   1 ] [ 0  1 ] [ −1 1 ]
3.5 Exercises
1. In each case solve the system of equations by finding the inverse of the coefficient matrix:

(i) 2x − 3y = 0
    x − 4y = 1

(ii) x + 4y + 2z = 1
     2x + 3y + 3z = −1
     4x + y + 4z = 0

(iii) x + y + z + w = 1
      x + y = −1
      y + w = −1
      x + w = 2

2. Given A = [  1 −1 3 ]
             [  2  0 5 ]
             [ −1  1 0 ]

find matrices B and C such that

AB = [ −1 2 ]              [  2 −1 4 ]
     [  1 1 ]  and  CA−1 = [  0  1 0 ] .
     [  0 0 ]              [ −1 −2 5 ]
3. In each case find A−1 in terms of c:

(i) A = [ 2 −c ]        (ii) A = [ 1 0 1 ]
        [ c  3 ]                 [ c 1 c ]
                                 [ 3 c 2 ]

4. (a) Let A be an invertible matrix. Show that if AX = AY then X = Y . Show that if
P A = QA then P = Q.

(b) Let A = [ 1 1 ] ,  B = [ 0 0 ] ,  C = [ 1 1 ] . Verify that AB = CA and A is invertible,
            [ 0 1 ]        [ 1 2 ]        [ 1 1 ]
but that B ≠ C.
5. Find the inverses of the following matrices, and express the matrices and their inverses as
products of elementary matrices:

(i) A = [ 5 6 ]     (ii) B = [ 0 3 ]     (iii) C = [ 2 1 0 ]
        [ 4 5 ]              [ 2 1 ]               [ 4 3 5 ]
                                                   [ 0 1 2 ]
3.6 Determinants
We saw that the 2 × 2 matrix A = [ a b; c d ] is invertible if and only if its determinant
det A = ad − bc is nonzero. We want to extend this to all n × n matrices — and to do so we shall
define determinants through an iterative procedure.
Let A be an n × n matrix, and for any 1 ≤ i, j ≤ n let Aij denote the (n − 1) × (n − 1)
matrix obtained by deleting row i and column j from A. The (i, j)-minor of A is the quantity
mij = det Aij ; the (i, j)-cofactor of A is
cij = (−1)i+j mij = (−1)i+j det Aij .
The factor (−1)i+j is either +1 or −1 depending on whether i + j is even or odd —
and varies according to the following sign diagram:


[ + − + − · · · ]
[ − + − + · · · ]
[ + − + − · · · ]
[ − + − + · · · ]
[ ⋮ ⋮ ⋮ ⋮  ⋱  ]
The signs alternate along each row and each column, with +1 in the top-left hand corner.
The definition of the minors and cofactors is based on the assumption that we have defined
the determinant of an (n − 1) × (n − 1) matrix. This is the case for n = 2 and n = 3, since
det[a] = a, and the determinant of a 2 × 2 matrix was given earlier. Thus we can calculate
mij and cij for matrices up to size 3 × 3. These numbers are then used to define/calculate the
determinant of a matrix of the next size:
Definition 3.21. The determinant of an n × n matrix A is defined to be
det A := a11 c11 + a12 c12 + · · · + a1n c1n .
That is, the entries of the top row of A are multiplied by their respective cofactors, and the
results added together.
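Definition 3.21 is recursive, so it translates directly into a short program. This sketch (plain Python, fine for small matrices but much slower than row reduction for large ones) expands along the top row exactly as in the definition:

```python
# Determinant by cofactor expansion along the top row (Definition 3.21).

def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]          # det [a] = a
    total = 0
    for j in range(n):
        # A_1j: delete row 1 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # with 0-indexed j, (-1)**j is the sign (-1)^{1+j} of the cofactor
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 0, 4, -7],
           [2, 3, 9, 0],
           [-1, 0, 6, 8],
           [5, 0, 1, 0]]))   # 1107
```

The 4 × 4 matrix here is the one worked through later in this section, and the recursion reproduces its determinant, 1107.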
Alternative notation: if A = [aij ], then

det A = | a11 a12 · · · a1n |
        | a21 a22 · · · a2n |
        |  ⋮   ⋮    ⋱   ⋮  |
        | an1 an2 · · · ann |

Exercise 3.22. Calculate

| 1  2 3 |
| 0 −1 7 |
| 6  5 4 |

Solution.
Theorem 3.23. For any n × n matrix A and any choice of i or j
det A = ai1 ci1 + ai2 ci2 + · · · + ain cin
= a1j c1j + a2j c2j + · · · + anj cnj .
That is, we can expand along any row or column, multiplying each entry by its respective
cofactor; on adding, the result will always be the same.
This freedom to choose which row or column to expand along can be exploited to simplify
greatly any calculation. For example consider the matrix


A = [  1 0 4 −7 ]
    [  2 3 9  0 ]
    [ −1 0 6  8 ]
    [  5 0 1  0 ]

When calculating det A, expanding along the first row or along the second column gives
    | 3 9 0 |       |  2 9 0 |       |  2 3 0 |          |  2 3 9 |
1 × | 0 6 8 | − 0 × | −1 6 8 | + 4 × | −1 0 8 | − (−7) × | −1 0 6 | = · · · = 1107
    | 0 1 0 |       |  5 1 0 |       |  5 0 0 |          |  5 0 1 |

and

      |  2 9 0 |       |  1 4 −7 |       | 1 4 −7 |       |  1 4 −7 |
− 0 × | −1 6 8 | + 3 × | −1 6  8 | − 0 × | 2 9  0 | + 0 × |  2 9  0 |
      |  5 1 0 |       |  5 1  0 |       | 5 1  0 |       | −1 6  8 |

= 3 × ( 5 × | 4 −7 | − 1 × |  1 −7 | ) = 3 × (5 × 74 − 1) = 1107.
            | 6  8 |       | −1  8 |
Expanding along the second column meant that we only had to calculate one 3×3 determinant,
rather than three when expanding along the first row.
Exercise 3.24. Calculate

| 1  2 3 |
| 0 −1 7 |
| 6  5 4 |
Solution.
Definition 3.25. The main diagonal of a (square) matrix A = [aij ] consists of the entries
running from the top-left corner to the bottom-right corner, i.e. the numbers a11 , a22 , . . . , ann .
(i) A is upper-triangular if the only nonzero entries are on or above the main diagonal (i.e.
aij = 0 if i > j).
(ii) A is lower-triangular if the only nonzero entries are on or below the main diagonal (i.e.
aij = 0 if i < j).
(iii) A is diagonal if the only nonzero entries are on the main diagonal. That is, if it is both
upper-triangular and lower-triangular.
If a matrix A is either upper-triangular or lower-triangular then its determinant is just
the product of the entries on the main diagonal. For example
| 6 2  0 |
| 0 1 −3 | = 6 × | 1 −3 | = 6 × 1 × (−5) = −30
| 0 0 −5 |       | 0 −5 |

by repeatedly expanding along the first column; repeatedly expanding along the first row gives

| 7  0  0 0 |       | −2  0 0 |
| 3 −2  0 0 | = 7 × |  0 −4 0 | = 7 × (−2) × | −4 0 | = 7 × (−2) × (−4) × 6 = 336.
| 9  0 −4 0 |       |  1  4 6 |              |  4 6 |
| 1  1  4 6 |
Thus it is easy to calculate determinants for triangular matrices, but not every matrix
is of this form. However, any matrix that is in row-echelon form is upper-triangular, and
we can transform any matrix to this form by row-operations. Alternatively when calculating
determinants one can also make use of the analogous column operations.
Theorem 3.26. Let A and B be n × n matrices.
(i) If A has a row or column of zeros then det A = 0.
(ii) If two rows (or columns) are interchanged, the determinant of the new matrix is − det A.
(iii) If a row (or column) is multiplied by a constant k, the determinant of the new matrix
is k det A.
(iv) If a multiple of one row is added to another (or a multiple of a column added to another ),
the determinant of the resulting matrix is det A, i.e. unchanged.
(v) det(AB) = det A det B.
(vi) A is invertible if and only if det A ≠ 0.
(vii) det AT = det A; if A is invertible then det(A−1 ) = 1/ det A.
The second part of part (vii) follows from part (v) since if A is invertible then det(A−1 ) det A =
det(A−1 A) = det I = 1.


Exercise 3.27 (S04 6(b)). Compute the rank and determinant of

A = [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 9 ]
Solution.


Exercise 3.28. Compute the determinant of

A = [  4 2 −1 ]
    [  3 7  5 ]
    [ −2 6  9 ]
Solution.
Exercise 3.29 (A04 6(c)). Compute

| 0 1 −1 0 |
| 3 0  0 2 |
| 0 1  2 1 |
| 5 0  0 7 |

Solution.

Exercise 3.30. For which values of x is the matrix

[ 1 x x ]
[ x 1 x ]
[ x x 1 ]

invertible?

Solution.
That invertibility of A is equivalent to det A ≠ 0 is proved in the general n × n case
by the same argument as in the 2 × 2 case, as soon as we have defined adj(A) for an n × n
matrix A. This is given in terms of the cofactors of the matrix A defined earlier. Indeed, the
cofactor matrix of a matrix A is the matrix whose (i, j)-entry is cij = (−1)i+j det Aij . The
transpose of this matrix is the adjugate matrix (sometimes called the classical adjoint), that
is

 1
c1 c21 · · · cn1
 c12 c22 · · · cn2 


adj(A) =  .
.. . .
.
 ..
. .. 
.
c1n c2n · · · cnn
For example in the 2 × 2 case, if A = [ a b ]  then A11 = [d], A12 = [c], A21 = [b] and A22 = [a],
                                      [ c d ]
so that c11 = d, c12 = −c, c21 = −b and c22 = a, giving

adj(A) = [  d −c ]T   [  d −b ]
         [ −b  a ]  = [ −c  a ]
Once we define the general n × n cofactor matrix this way it is possible to show that
A adj(A) = adj(A)A = (det A)I, and so A is invertible if and only if det A ≠ 0 with
A−1 = (1/ det A) adj(A),
as before. Unfortunately as n grows large the number of operations involved in calculating A−1
this way grows far faster than if we were to use our earlier method based on row operations,
so that method is usually better for computations.
Example 3.31. Use determinants to find which values of c make the following matrix invertible:

[  c 1 0 ]
[  0 2 c ]
[ −1 c 1 ]

Solution.

|  c 1 0 |   |  0 1 + c2 c |
|  0 2 c | = |  0   2    c | = (−1) | 1 + c2 c | = −[(1 + c2 )c − 2c] = c(1 − c2 )
| −1 c 1 |   | −1   c    1 |        |   2    c |

where we added c × R3 to R1 , and then expanded along C1 .

Thus the determinant is 0 if c = 0 or c = ±1, and the matrix is invertible if c is not
equal to one of these three values.
However, the cofactor method does lead to another method for solving the linear system
AX = B, when A is invertible. This method, known as Cramer’s Rule, says that the solution
in this case is given by
x1 = det A1 / det A ,   x2 = det A2 / det A ,   . . . ,   xn = det An / det A ,
where Ai is the matrix obtained from A by replacing column i with the column vector B.
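Cramer's rule as just stated can be sketched in a few lines, reusing a cofactor determinant. The little system solved at the end is an arbitrary example chosen for the sketch:

```python
# Cramer's rule: x_i = det(A_i) / det(A), where A_i is A with column i
# replaced by the right-hand side B.

from fractions import Fraction

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def cramer(A, B):
    d = det(A)
    if d == 0:
        raise ValueError("det A = 0: Cramer's rule does not apply")
    xs = []
    for i in range(len(A)):
        Ai = [row[:i] + [b] + row[i + 1:] for row, b in zip(A, B)]  # column i -> B
        xs.append(Fraction(det(Ai), d))
    return xs

# x + y = 3, x - y = 1  has solution x = 2, y = 1
print(cramer([[1, 1], [1, -1]], [3, 1]))
```

As with the adjugate formula, this is attractive on paper but expensive computationally; row reduction remains the practical method for larger systems.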


Exercise 3.32 (A03 8(b)). If

A = [ 1 −1  2 ]
    [ 2  1 −3 ]
    [ 4  1  1 ]

compute the adjugate matrix adj(A) and the inverse A−1 . Solve the system of equations
x − y + 2z = −1
2x + y − 3z = 2
4x + y + z = 0
Solution.
Exercise 3.33. Find x3 where

x1 − 2x2 + x3 = 3
3x1 − x3 = −1
4x1 + x2 + 2x3 = 0
Solution.
3.7 Eigenvalues and eigenvectors
Definition 3.34. Let A be an n×n matrix. A number λ is an eigenvalue of A if there is some
nonzero column vector X 6= 0 such that AX = λX. The vector X is called an eigenvector
corresponding to the eigenvalue λ.
Remark. X = 0 is always a solution to the equation AX = λX, for any choice of λ. It is for
this reason that we insist that X 6= 0 in the definition above.
On the other hand if X 6= 0 and AX = λX, i.e. if λ is an eigenvalue and X an eigenvector
corresponding to this value, then for any t 6= 0 we have tX 6= 0 and A(tX) = tAX = t(λX) =
λ(tX). That is, any nonzero multiple of X is again an eigenvector. Geometrically this says
that the line through the origin in the direction of X is mapped back onto itself by the
transformation Y 7→ AY .
As an example, consider A and X where
A = [ 3  5 ]        X = [ 5 ]
    [ 1 −1 ] ,          [ 1 ]

⇒  AX = [ 3  5 ] [ 5 ]   [ 20 ]     [ 5 ]
        [ 1 −1 ] [ 1 ] = [  4 ] = 4 [ 1 ] = 4X.
So 4 is an eigenvalue of A, and X is an eigenvector corresponding to the eigenvalue 4. On
the other hand consider another vector with the same matrix A:
Y = [  1 ]          AY = [ 3  5 ] [  1 ]   [ −2 ]      [  1 ]
    [ −1 ]    ⇒          [ 1 −1 ] [ −1 ] = [  2 ] = −2 [ −1 ] = −2Y.
So −2 is another eigenvalue, and Y is an eigenvector associated to this eigenvalue. But now
note that the 2 × 1 vectors X and Y are not parallel — i.e. they are linearly independent, and
it follows that any 2 × 1 vector Z can be written as sX + tY for a unique choice of s, t ∈ R.
For example take
Z = [ 2 ]   [ 5 ]     [  1 ]
    [ 4 ] = [ 1 ] − 3 [ −1 ] = X − 3Y.
This seemingly arbitrary fact allows us to easily calculate powers of A applied to the vector
Z. For example
A10 Z = A10 (X − 3Y ) = A10 X + A10 (−3Y ) = A10 X − 3A10 Y,
and now note that
A10 X = A9 (AX) = 4A9 X = 4A8 (AX) = 42 A8 X = · · · = 410 X.
Similarly A10 Y = (−2)10 Y , hence
A10 Z = 410 [ 5 ] − 3 × (−2)10 [  1 ] .
            [ 1 ]              [ −1 ]
This is far preferable to calculating A10 , which, for the record, is
A10 = [ 873984 872960 ]
      [ 174592 175616 ] .
Moreover, decomposing the vector Z into a linear combination of eigenvectors can help when
analysing the behaviour of the sequence of vectors AZ, A2 Z, A3 Z, . . .
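The power computation above can be checked directly: writing Z = X − 3Y with AX = 4X and AY = −2Y gives A10 Z = 410 X − 3(−2)10 Y, and ten successive multiplications by A agree exactly (a sketch with integer arithmetic, so there is no rounding):

```python
# A^10 Z computed two ways: repeated multiplication, and via eigenvalues.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[3, 5], [1, -1]]
Z = [2, 4]                       # Z = X - 3Y for X = (5, 1), Y = (1, -1)

v = Z
for _ in range(10):              # ten successive multiplications by A
    v = matvec(A, v)

via_eigen = [4 ** 10 * x - 3 * (-2) ** 10 * y
             for x, y in zip([5, 1], [1, -1])]
assert v == via_eigen
print(v)                         # [5239808, 1051648]
```

The eigenvalue route needs only two scalar powers, while the direct route needs ten matrix-vector products; for large powers the saving is dramatic.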
This prompts the question: given a matrix A, how can we find its eigenvalues and corresponding eigenvectors? If λ is an eigenvalue then there is some X ≠ 0 such that
AX = λX
⇔ AX − λX = (A − λI)X = 0.
That is, there is a nontrivial solution to the homogeneous system whose coefficient matrix is
A−λI, which happens precisely when A−λI is not invertible, i.e. precisely when det(A−λI) =
0. The quantity det(A − λI) is a polynomial in λ of degree n, called the characteristic
polynomial of A. In particular it has the form
\[ \det(A - λI) = (-1)^n λ^n + b_{n-1} λ^{n-1} + \cdots + b_2 λ^2 + b_1 λ + b_0 \]
for some numbers b_0, b_1, . . . , b_{n−1}. The eigenvalues of A are the roots of the characteristic
polynomial, and the associated eigenvectors are the nontrivial solutions of the corresponding
system (A − λI)X = 0 which can be found by row-reduction.
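For a 2 × 2 matrix this recipe can be carried out in closed form: det(A − λI) = λ² − (a + d)λ + (ad − bc), so the eigenvalues come from the quadratic formula. A small sketch (the function name `eigenvalues_2x2` is our own):

```python
# The 2x2 case of the recipe above: det(A - lambda*I) is the quadratic
# lambda^2 - (a+d)*lambda + (ad - bc), solved by the quadratic formula.
import math

def eigenvalues_2x2(A):
    (a, b), (c, d) = A
    tr = a + d                      # trace
    det = a * d - b * c             # determinant
    disc = tr * tr - 4 * det        # discriminant of the characteristic polynomial
    if disc < 0:
        return []                   # no real eigenvalues
    r = math.sqrt(disc)
    return [(tr + r) / 2, (tr - r) / 2]

print(eigenvalues_2x2([[3, 5], [1, -1]]))   # [4.0, -2.0], as in the example
print(eigenvalues_2x2([[0, 1], [-1, 0]]))   # [], this matrix has no real eigenvalues
```

The second call previews a point made later in this section: some real matrices have no real eigenvalues at all.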
Example 3.35. By finding the corresponding eigenvectors, show that 1, 2 and 3 are eigenvalues of the matrix
\[ \begin{pmatrix} 3 & 1 & 1 \\ -4 & -2 & -5 \\ 2 & 2 & 5 \end{pmatrix} \]
Solution. Let A = \begin{pmatrix} 3 & 1 & 1 \\ -4 & -2 & -5 \\ 2 & 2 & 5 \end{pmatrix}; then X is an eigenvector corresponding to the eigenvalue 1 if AX = X, which holds if and only if (A − I)X = 0. The relevant row operations on A − I are
\[ \begin{pmatrix} 2 & 1 & 1 \\ -4 & -3 & -5 \\ 2 & 2 & 4 \end{pmatrix} \xrightarrow{R_3\times\frac12} \begin{pmatrix} 2 & 1 & 1 \\ -4 & -3 & -5 \\ 1 & 1 & 2 \end{pmatrix} \xrightarrow[R_2+4\times R_3]{R_1-2\times R_3} \begin{pmatrix} 0 & -1 & -3 \\ 0 & 1 & 3 \\ 1 & 1 & 2 \end{pmatrix} \xrightarrow[R_3-R_2]{R_1+R_2} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 3 \\ 1 & 0 & -1 \end{pmatrix} \]
from which we get that the eigenvectors for eigenvalue 1 are X = \begin{pmatrix} t \\ -3t \\ t \end{pmatrix}.
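The row-reduction can also be mechanised. Here is a sketch using exact fractions, so there is no rounding; the `rref` helper is our own, and the input is A − I from Example 3.35.

```python
# Reduced row echelon form with exact rational arithmetic, applied to A - I.
from fractions import Fraction

def rref(M):
    """Reduce a list-of-rows matrix of Fractions to reduced row echelon form."""
    M = [row[:] for row in M]
    nrows, ncols = len(M), len(M[0])
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, nrows) if M[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]   # swap the pivot row up
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(nrows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

A_minus_I = [[Fraction(v) for v in row]
             for row in [[2, 1, 1], [-4, -3, -5], [2, 2, 4]]]
R = rref(A_minus_I)
print([[int(x) for x in row] for row in R])   # [[1, 0, -1], [0, 1, 3], [0, 0, 0]]
```

Reading off the free variable z = t gives x = t and y = −3t, the same eigenvector as above.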
For eigenvalue 2 the calculation is (starting with A − 2I):
\[ \begin{pmatrix} 1 & 1 & 1 \\ -4 & -4 & -5 \\ 2 & 2 & 3 \end{pmatrix} \xrightarrow[R_3-2\times R_1]{R_2+4\times R_1} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{pmatrix} \xrightarrow[R_2\times(-1)]{R_1+R_2;\ R_3+R_2} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \]
and so the eigenvectors have the form X = \begin{pmatrix} t \\ -t \\ 0 \end{pmatrix}.
Finally, for the eigenvalue 3 the calculation is (starting with A − 3I):
\[ \begin{pmatrix} 0 & 1 & 1 \\ -4 & -5 & -5 \\ 2 & 2 & 2 \end{pmatrix} \xrightarrow[R_1\leftrightarrow R_3]{R_3\times\frac12} \begin{pmatrix} 1 & 1 & 1 \\ -4 & -5 & -5 \\ 0 & 1 & 1 \end{pmatrix} \xrightarrow[R_1-R_3]{R_2+4\times R_1} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & -1 \\ 0 & 1 & 1 \end{pmatrix} \xrightarrow[R_2\times(-1)]{R_3+R_2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \]
and so the eigenvectors have the form X = \begin{pmatrix} 0 \\ t \\ -t \end{pmatrix}.
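A quick numerical check of Example 3.35: each claimed eigenvector (taking t = 1) should satisfy AX = λX. A plain-Python sketch, with `mat_vec` as our own helper:

```python
# Verifying the three eigenpairs of Example 3.35 by direct multiplication.

def mat_vec(A, v):
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[3, 1, 1], [-4, -2, -5], [2, 2, 5]]
pairs = {1: [1, -3, 1], 2: [1, -1, 0], 3: [0, 1, -1]}   # eigenvalue -> eigenvector

for lam, X in pairs.items():
    assert mat_vec(A, X) == [lam * x for x in X]
print("all three eigenpairs check out")
```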
Exercise 3.36 (S04 6(c)). Find all of the eigenvalues and eigenvectors of A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}.
Solution.
Exercise 3.37 (S03 8(c)). Let A = \begin{pmatrix} 1 & 1 & -2 \\ -1 & 3 & 1 \\ 0 & 1 & -1 \end{pmatrix}. Find all the eigenvalues of A. Find all of the eigenvectors associated to one of the eigenvalues.
Solution.
Example 3.38 (S05 6(d)). Find the eigenvalues of the matrix
\[ C = \begin{pmatrix} 4 & 0 & 1 \\ -2 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix} \]
Find the eigenvectors associated to one of the eigenvalues.
Solution. Expanding along the second column,
\[ \det(C - λI) = \begin{vmatrix} 4-λ & 0 & 1 \\ -2 & 1-λ & 0 \\ -2 & 0 & 1-λ \end{vmatrix} = (1-λ)\begin{vmatrix} 4-λ & 1 \\ -2 & 1-λ \end{vmatrix} = (1-λ)\big[(4-λ)(1-λ)+2\big] = (1-λ)(λ^2-5λ+6) = (1-λ)(λ-2)(λ-3), \]
and so the eigenvalues are 1, 2 and 3.
For λ = 1,
\[ C - I = \begin{pmatrix} 3 & 0 & 1 \\ -2 & 0 & 0 \\ -2 & 0 & 0 \end{pmatrix} \xrightarrow[R_2\times(-\frac12)]{R_3-R_2} \begin{pmatrix} 3 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]
and so the components of the eigenvector satisfy x = 0, z = −3x = 0, and y is arbitrary. That is, the eigenvectors for eigenvalue 1 are \begin{pmatrix} 0 \\ t \\ 0 \end{pmatrix}.
 
 
The eigenvectors for eigenvalue 2 are t\begin{pmatrix} 1 \\ -2 \\ -2 \end{pmatrix}; those for eigenvalue 3 are t\begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}.
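As a check on Example 3.38, det(C − λI) should vanish exactly at λ = 1, 2, 3. A short sketch using the explicit 3 × 3 determinant formula (the helpers `det3` and `char_poly` are our own names):

```python
# Checking Example 3.38: the characteristic polynomial vanishes at 1, 2, 3.

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

C = [[4, 0, 1], [-2, 1, 0], [-2, 0, 1]]

def char_poly(lam):
    """Evaluate det(C - lam*I) at an integer lam, exactly."""
    M = [[C[r][c] - (lam if r == c else 0) for c in range(3)] for r in range(3)]
    return det3(M)

print([char_poly(lam) for lam in (1, 2, 3)])   # [0, 0, 0]
```

At any other value the polynomial is nonzero; for instance char_poly(4) gives (1−4)(4−2)(4−3) = −6.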
Theorem 3.39. Suppose that an n × n matrix has n distinct eigenvalues λ1, λ2, . . . , λn, and let X1, X2, . . . , Xn be corresponding eigenvectors. This set of n eigenvectors is linearly independent, and hence any n × 1 vector can be written uniquely as X = t1X1 + t2X2 + · · · + tnXn.
Unfortunately it is not true that every matrix has n distinct eigenvalues, but sometimes it is possible to find several linearly independent eigenvectors associated to the same eigenvalue, and so one can still hope to build a set of n vectors as in the theorem above.
Another potential problem is that it is possible that A has no real eigenvalues. One way
to get round this limitation is to do linear algebra with complex numbers instead of real
numbers. This will ensure that there is always at least one solution to det(A − λI) = 0,
and hence at least one eigenvalue and associated eigenvector. But it may still turn out that
it is impossible to find a set of n linearly independent eigenvectors. One assumption that guarantees this cannot happen is that A is symmetric:
Theorem 3.40. If A is an n × n symmetric matrix (A = AT ) with real number entries, then
A has only real eigenvalues. Moreover it is possible to find a set of n linearly independent
eigenvectors for A, and hence write any n × 1 vector as a unique linear combination of these
vectors.
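Theorem 3.40 can be seen directly in the 2 × 2 case: for symmetric A = \begin{pmatrix} a & b \\ b & d \end{pmatrix} the discriminant of the characteristic polynomial is (a − d)² + 4b² ≥ 0, so the roots are always real. A sketch that samples random symmetric matrices to illustrate this:

```python
# The 2x2 case of Theorem 3.40: tr^2 - 4*det = (a-d)^2 + 4b^2 >= 0 for a
# symmetric matrix [[a, b], [b, d]], so its eigenvalues are always real.
import random

random.seed(0)
for _ in range(1000):
    a, b, d = (random.uniform(-10, 10) for _ in range(3))
    tr = a + d
    det = a * d - b * b
    disc = tr * tr - 4 * det        # algebraically equals (a-d)**2 + 4*b*b
    assert disc >= -1e-9            # real eigenvalues for every sample
print("every sampled symmetric 2x2 matrix had real eigenvalues")
```

The small tolerance only guards against floating-point rounding; the exact discriminant is a sum of squares.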
Exercise 3.41. Find the eigenvalues and eigenvectors of the following matrices:
\[ \text{(i) } A = \begin{pmatrix} 3 & 1 & -2 \\ 0 & 5 & -1 \\ 0 & 0 & 5 \end{pmatrix} \qquad \text{(ii) } B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]
Solution.
3.7.1 Dynamical systems: predator-prey models
Knowledge of eigenvalues and eigenvectors gives an insight into the behaviour of dynamical systems, such as predator-prey models which take place in discrete time. For example, denote the owl and rat populations at time k by X_k = \begin{pmatrix} O_k \\ R_k \end{pmatrix}, where k is the time in months, O_k the number of owls, and R_k the number of rats (measured in thousands). Suppose that
\[ O_{k+1} = (0.5)O_k + (0.4)R_k, \qquad R_{k+1} = -pO_k + (1.1)R_k. \]
The term (0.5)O_k represents the mortality rate of the owls; if the supply of rats is plentiful then the term (0.4)R_k will be large, causing the owl population to increase. The term (1.1)R_k says that the population of rats would grow by 10% each month if there were no owls, but in fact there is a factor causing it to decrease in proportion to the population of owls, namely −pO_k, where p is some positive parameter to be specified.
So we see that the change from month to month is given by
\[ \begin{pmatrix} O_{k+1} \\ R_{k+1} \end{pmatrix} = A \begin{pmatrix} O_k \\ R_k \end{pmatrix} \quad\text{where}\quad A = \begin{pmatrix} 0.5 & 0.4 \\ -p & 1.1 \end{pmatrix} \]
Now the eigenvalues of A are \frac{1}{10}\big(8 \pm \sqrt{9 - 40p}\big). In particular if p = 0.104 then the eigenvalues are λ1 = 1.02 and λ2 = 0.58, with corresponding eigenvectors
\[ Y_1 = \begin{pmatrix} 10 \\ 13 \end{pmatrix}, \quad Y_2 = \begin{pmatrix} 5 \\ 1 \end{pmatrix}. \]
The eigenvalues are distinct, so any starting population X_0 can be written uniquely as X_0 = t_1 Y_1 + t_2 Y_2, and then
\[ X_k = AX_{k-1} = A^2 X_{k-2} = \cdots = A^k X_0 = t_1 A^k Y_1 + t_2 A^k Y_2 = t_1 (1.02)^k Y_1 + t_2 (0.58)^k Y_2. \]
Now as k → ∞, (0.58)^k → 0, and so X_k ≈ t_1(1.02)^k Y_1. It follows that for large k,
\[ X_{k+1} ≈ t_1 (1.02)^{k+1} Y_1 = (1.02)\times t_1 (1.02)^k Y_1 ≈ (1.02)X_k, \]
that is, the populations of both the owls and the rats will increase at a rate of 2% each month.
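The 2% growth claim is easy to simulate. A plain-Python sketch with p = 0.104; the starting population of 10 owls and 10 thousand rats is an arbitrary assumption, not from the text:

```python
# Iterating the owl-rat system with p = 0.104: the month-on-month growth
# factor approaches the dominant eigenvalue 1.02.

def step(X):
    """One month of the model: X_{k+1} = A X_k with A = [[0.5, 0.4], [-0.104, 1.1]]."""
    O, R = X
    return (0.5 * O + 0.4 * R, -0.104 * O + 1.1 * R)

X = (10.0, 10.0)          # assumed starting population (owls, thousands of rats)
for _ in range(100):
    X = step(X)

growth = step(X)[0] / X[0]   # owl growth factor after many months
print(round(growth, 4))      # 1.02
```

By month 100 the (0.58)^k component is negligible, so the ratio matches the dominant eigenvalue to machine precision.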
Exercise 3.42. Find the eigenvalues and eigenvectors of P = \begin{pmatrix} 0.8 & 0.6 \\ 0.2 & 0.4 \end{pmatrix}. Express \begin{pmatrix} 1 \\ 0 \end{pmatrix} and \begin{pmatrix} 0 \\ 1 \end{pmatrix} as sums of eigenvectors. Suppose that r, s > 0 and r + s = 1. Determine the limit of P^n \begin{pmatrix} r \\ s \end{pmatrix} as n → ∞.
Solution.
3.8 Exercises
1. Compute the determinants of the following matrices:
\[ \text{(i) } \begin{pmatrix} 6 & 9 \\ 8 & 12 \end{pmatrix} \quad \text{(ii) } \begin{pmatrix} a+1 & a \\ a & a-1 \end{pmatrix} \quad \text{(iii) } \begin{pmatrix} 1 & 0 & 3 & 1 \\ 2 & 2 & 6 & 0 \\ -1 & 0 & -3 & 1 \\ 4 & 1 & 12 & 0 \end{pmatrix} \quad \text{(iv) } \begin{pmatrix} 2 & 0 & -3 \\ 1 & 2 & 5 \\ 0 & 3 & 0 \end{pmatrix} \]
\[ \text{(v) } \begin{pmatrix} 0 & a & b \\ a & 0 & c \\ b & c & 0 \end{pmatrix} \quad \text{(vi) } \begin{pmatrix} 0 & a & 0 \\ b & c & d \\ 0 & e & 0 \end{pmatrix} \quad \text{(vii) } \begin{pmatrix} 4 & -1 & 3 & -1 \\ 3 & 1 & 0 & 2 \\ 0 & 1 & 2 & 2 \\ 1 & 2 & -1 & 1 \end{pmatrix} \quad \text{(viii) } \begin{pmatrix} 0 & 0 & 0 & a \\ 0 & 0 & b & p \\ 0 & c & q & k \\ d & s & t & u \end{pmatrix} \]
2. Evaluate \begin{vmatrix} x-1 & -3 & 1 \\ 2 & x+2 & -2 \\ -3 & -1 & x-1 \end{vmatrix} by first adding all other rows to the first row.
3. Find the adjugate of the following matrices:
\[ \text{(i) } \begin{pmatrix} 1 & -1 & 2 \\ 3 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \quad \text{(ii) } \begin{pmatrix} -1 & 2 & 2 \\ 2 & -1 & 2 \\ 2 & 2 & -1 \end{pmatrix} \]
4. Use determinants to find which real values c make each of the following matrices invertible:
\[ \text{(i) } \begin{pmatrix} 0 & c & -c \\ -1 & 2 & -1 \\ c & -c & c \end{pmatrix} \quad \text{(ii) } \begin{pmatrix} 4 & c & 3 \\ c & 2 & c \\ 5 & c & 4 \end{pmatrix} \quad \text{(iii) } \begin{pmatrix} 1 & c & -1 \\ c & 1 & 1 \\ 0 & 1 & c \end{pmatrix} \]
5. In each case find the characteristic polynomial, eigenvalues and eigenvectors:
\[ \text{(i) } \begin{pmatrix} 2 & -4 \\ -1 & -1 \end{pmatrix} \quad \text{(ii) } \begin{pmatrix} 1 & 1 & -3 \\ 2 & 0 & 6 \\ 1 & -1 & 5 \end{pmatrix} \quad \text{(iii) } \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \quad \text{(iv) } \begin{pmatrix} 2 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & -1 & 2 \end{pmatrix} \]
6. Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} -5 & -2 \\ 24 & 9 \end{pmatrix}. Write X = \begin{pmatrix} 4 \\ 1 \end{pmatrix} as a linear combination of eigenvectors, and hence calculate A^5 X.
7. Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} 33 & 8 \\ -136 & -33 \end{pmatrix}. Write X = \begin{pmatrix} 0 \\ 3 \end{pmatrix} as a linear combination of eigenvectors, and hence calculate A^{100} X and A^{201} X.