Determinants and Permutations
Writing out the determinants for 2 × 2 and 3 × 3 matrices, we get
| a11 a12 |
| a21 a22 | = a11 a22 − a12 a21
and
| a11 a12 a13 |
| a21 a22 a23 | = a11 a22 a33 − a11 a23 a32 + a12 a23 a31 − a12 a21 a33
| a31 a32 a33 |       + a13 a21 a32 − a13 a22 a31 .
It is not hard to see that there is a pattern at work. For an n × n determinant there
should be a sum of terms where
• each term is a product of n entries from the matrix
• these products involve exactly one element from each row and one element
from each column of the matrix
• each such product appears exactly once
• each product has a coefficient of ±1
The only difficulty with this formula is in specifying which terms get a factor of +1
and which get a factor of −1.
Permutations
To describe a single term in the determinants above, we make a normalization: we
order the factors by their first subscript. Since we want one entry from each row of
the matrix, each integer 1, . . . , n should appear exactly once as a first index. So,
each term will have the form a1b1 a2b2 · · · anbn for some integers bi . Since we want
to hit each column exactly once, the bi should run through the elements of the set
Nn = {1, 2, . . . , n} hitting each element once in some order. So, if we consider the
function σ : Nn → Nn , given by σ(i) = bi , then the requirement for σ becomes that
σ has to be a bijective function.
In general, a permutation of a set A is a bijective function σ : A → A. So, we are
interested in functions σ which give permutations of Nn . Traditionally, the set of
permutations on the set Nn is denoted by Sn . In this notation, we can write down
our proposed formula for determinants with only one piece yet to define:
(1)        det(aij ) = Σσ∈Sn sgn(σ) a1σ(1) a2σ(2) · · · anσ(n) .
The notation means that we sum over all permutations on Nn . The factor sgn(σ)
is called the sign of the permutation σ, and it is either +1 or −1.
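Formula (1) can be transcribed directly into code as a sanity check. The sketch below is illustrative only (the names `sign` and `perm_det` are our own); it computes sgn(σ) by counting inversions, a standard equivalent characterization of the sign that these notes develop later via transpositions.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation (as a 0-indexed tuple), via counting inversions.
    The parity of the inversion count equals the parity of the number of
    transpositions, so this agrees with sgn."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def perm_det(a):
    """Determinant of a square matrix (list of rows) by summing over S_n."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):  # sigma runs over all of S_n
        term = sign(sigma)
        for i in range(n):
            term *= a[i][sigma[i]]  # one entry from each row and each column
        total += term
    return total

# Matches the 2 x 2 expansion: a11 a22 - a12 a21
assert perm_det([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
```

For n = 2 and n = 3 this reproduces the expansions written out above; for large n the n! terms make it impractical, exactly as discussed below.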
Note that the number of permutations on n things is n!, so |Sn | = n!. This matches
the formulas above: a 2 × 2 determinant is a sum of 2! = 2 terms, and a 3 × 3
determinant is a sum of 3! = 6 terms. Extrapolating, we see that the direct formulas
are not useful for determinants of large matrices. For example, in just a 5 × 5
determinant, there would be 5! = 120 summands, and a 20 × 20 matrix would give
20! = 2,432,902,008,176,640,000 terms to compute and add up. As a result, this
formula for determinants is useful computationally only for small matrices, but it
can be useful in general for theoretical purposes.
Transpositions and the Sign of a Permutation
A transposition is a permutation which switches two values and sends every other
element of Nn to itself. As a function, it could be written down as follows. Suppose
i, j ∈ Nn with i ≠ j. We can write the transposition which switches i and j, which
we will denote by ti,j , as follows:
           { x   if x ≠ i and x ≠ j
ti,j (x) = { j   if x = i
           { i   if x = j
Note, all permutations have inverse functions because permutations are bijective.
These inverse functions are then also bijective from Nn to Nn , so they are permutations also. A transposition is its own inverse.
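The three-case definition of ti,j can be sketched as a small function (the name `transposition` is our own):

```python
def transposition(i, j):
    """Return t_{i,j} as a function: swap i and j, fix every other element."""
    def t(x):
        if x == i:
            return j
        if x == j:
            return i
        return x
    return t

t = transposition(2, 5)
assert t(2) == 5 and t(5) == 2 and t(3) == 3   # swaps 2 and 5, fixes the rest
assert all(t(t(x)) == x for x in range(1, 8))  # a transposition is its own inverse
```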
We will need a few facts about permutations and transpositions in Sn for n ≥ 1.
(1) Every permutation σ ∈ Sn can be written as the composition of k transpositions for some k ≥ 0. By convention, we consider a composition of 0
transpositions to be the identity function.
(2) If σ is the composition of k transpositions and also of k′ transpositions,
then either both k and k′ are even or they are both odd. We can then
define
sgn(σ) = (−1)k
since the sign of σ does not depend on how it is written as a composition
of transpositions.
(3) If σ, τ ∈ Sn , sgn(σ ◦ τ ) = sgn(σ) · sgn(τ ).
(4) For all σ ∈ Sn , sgn(σ) = sgn(σ −1 ).
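Facts (3) and (4) can be spot-checked by brute force over S4. A minimal sketch under the inversion-count characterization of sgn (all helper names are our own); permutations are represented as 0-indexed tuples:

```python
from itertools import permutations

def sgn(p):
    """Sign via inversion count (equivalent to the transposition definition)."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def compose(s, t):
    """(s ∘ t)(x) = s(t(x)) for permutations given as 0-indexed tuples."""
    return tuple(s[t[x]] for x in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for x, y in enumerate(s):
        inv[y] = x
    return tuple(inv)

for s in permutations(range(4)):
    assert sgn(s) == sgn(inverse(s))                  # fact (4)
    for t in permutations(range(4)):
        assert sgn(compose(s, t)) == sgn(s) * sgn(t)  # fact (3)
```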
We sketch the proofs of most of these statements.
(1) The idea is to write an arbitrary permutation as a sequence of swaps (a composition of transpositions). One way to do this is to swap 1 with σ(1), then
swap the image of 2 with σ(2), then swap the image of 3 with σ(3), and
so on. Each step is either a transposition or the identity (which can be
dropped from the composition).
(2) This is the most difficult part. A proof is deferred until the end of this
writeup.
(3) If σ can be written as the composition of k transpositions, and τ is the
composition of ℓ transpositions, then the composition σ ◦ τ entails doing
ℓ transpositions (for τ ) followed by k transpositions (for σ). The result is
that the composition can be accomplished with k + ℓ transpositions. Thus
sgn(σ ◦ τ ) = (−1)k+ℓ = (−1)k (−1)ℓ = sgn(σ) · sgn(τ )
(4) Noting that the identity function i has sgn(i) = 1, we have
1 = sgn(i) = sgn(σ ◦ σ −1 ) = sgn(σ) · sgn(σ −1 ) .
So, since sgn(σ) and sgn(σ −1 ) are each ±1, we must have sgn(σ) = sgn(σ −1 ).
Note that in addition to sgn(i) = 1 for the identity map i, we have that for a
transposition ti,j , sgn(ti,j ) = (−1)1 = −1.
Properties of Determinants
Many properties of determinants are fairly easy to prove in terms of formula (1).
Here we will sketch some of those proofs. The most difficult property to prove turns
out to be the connection to cofactor expansion, and the problem there has more to
do with complicated notation than anything else.
Throughout, A is an n × n matrix, and we assume formula (1) gives det(A).
Proposition 1. If A has a row with all zeros, then det(A) = 0.
Proof. If row i is all zero, then aiσ(i) = 0 for all σ ∈ Sn . Thus, every term in
formula (1) is 0.
Proposition 2 (Diagonal matrices). If A is a diagonal matrix, then det(A) =
a11 a22 · · · ann .
Proof. The fact that A is a diagonal matrix implies that aij = 0 if i ≠ j. So in
formula (1), a term has a factor of 0 unless every factor is of the form aii . This
restricts us to terms where σ(i) = i for all i. But, this makes σ = i the identity
function. So, formula (1) reduces to one term. Since sgn(i) = 1, we get
det(A) = sgn(i)a1i(1) a2i(2) · · · ani(n) = a11 a22 · · · ann .
Corollary (Identity matrix).
det(In ) = 1
This follows since the identity matrix is a special case of a diagonal matrix.
Proposition 3 (Linearity in the rows). If B is the matrix where row i of A is
replaced with the entries aij + kcj , and C is the matrix where row i of A is replaced
with the entries cj , then det(B) = det(A) + k det(C).
Proof. The terms in the sum for det(B) take the form
a1σ(1) · · · a(i−1)σ(i−1) (aiσ(i) + kcσ(i) )a(i+1)σ(i+1) · · · anσ(n)
= a1σ(1) · · · a(i−1)σ(i−1) aiσ(i) a(i+1)σ(i+1) · · · anσ(n)
+ a1σ(1) · · · a(i−1)σ(i−1) (kcσ(i) )a(i+1)σ(i+1) · · · anσ(n)
= a1σ(1) · · · a(i−1)σ(i−1) aiσ(i) a(i+1)σ(i+1) · · · anσ(n)
+ k · a1σ(1) · · · a(i−1)σ(i−1) cσ(i) a(i+1)σ(i+1) · · · anσ(n)
After distributing, we have terms which match those of det(A) + k det(C).
Proposition 4 (Behavior under transpose).
det(A) = det(At )
Proof. We note that if σ ∈ Sn and σ(i) = j, then σ −1 (j) = i. So then a term
aij = aiσ(i) = aσ−1 (j)j . Applying this to formula (1), we have
det(A) = Σσ∈Sn sgn(σ) a1σ(1) a2σ(2) · · · anσ(n)
       = Σσ∈Sn sgn(σ) aσ−1 (1)1 aσ−1 (2)2 · · · aσ−1 (n)n        rearranging terms in the products
       = Σσ∈Sn sgn(σ −1 ) aσ−1 (1)1 aσ−1 (2)2 · · · aσ−1 (n)n    using sgn(σ) = sgn(σ −1 )
       = Σσ∈Sn sgn(σ) aσ(1)1 aσ(2)2 · · · aσ(n)n
For the last line, we replace σ −1 by σ and note that each permutation σ ∈ Sn
appears as the inverse of exactly one function in Sn . The last line is det(At ).
Having proved det(A) = det(At ), we can deal with properties of either row or
column operations. Some properties will be easier to prove in terms of column
operations. So, statements about row operations apply to columns, and statements
about column operations apply to rows. We will not explicitly state these corollaries.
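The transpose identity just proved, together with the sign flip under a column swap (Proposition 5 below), can be spot-checked numerically against formula (1). A sketch with our own helper names (`sign` again computed by counting inversions, an equivalent characterization):

```python
from itertools import permutations
from math import prod

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def perm_det(a):
    """Determinant by the sum-over-permutations formula (1)."""
    n = len(a)
    return sum(sign(s) * prod(a[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

A = [[2, 1, 5], [0, 3, 4], [7, 6, 1]]
At = [list(col) for col in zip(*A)]          # transpose of A
B = [[row[1], row[0], row[2]] for row in A]  # A with its first two columns swapped
assert perm_det(At) == perm_det(A)           # det(A) = det(A^t)
assert perm_det(B) == -perm_det(A)           # a column swap flips the sign
```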
Proposition 5 (Behavior under type I operations). If B is a matrix gotten from
A by swapping two columns, then det(B) = − det(A).
Proof. If det(A) is given by formula (1), let τ be the transposition which gives the
column swap taking A to B. Note, sgn(τ ) = −1 because τ is a transposition. Then
the formula for det(B) is
det(B) = Σσ∈Sn sgn(σ) a1τ ◦σ(1) · · · anτ ◦σ(n)
       = − Σσ∈Sn sgn(τ ) · sgn(σ) a1τ ◦σ(1) · · · anτ ◦σ(n)
       = − Σσ∈Sn sgn(τ ◦ σ) a1τ ◦σ(1) · · · anτ ◦σ(n)
But, τ is fixed, and as σ runs through the elements of Sn , τ ◦ σ also runs through
the elements of Sn . Replacing τ ◦ σ with σ gives us − det(A).
Proposition 6 (Behavior under type II operations). If B is a matrix gotten from
A by multiplying a row by a constant α, then det(B) = α · det(A).
Proof. This follows from linearity.
Or, if row i is the one which is multiplied by α, we just note that each term in the
formula for det(B) has
a1σ(1) · · ·a(i−1)σ(i−1) (αaiσ(i) )a(i+1)σ(i+1) · · · anσ(n)
= αa1σ(1) · · · a(i−1)σ(i−1) aiσ(i) a(i+1)σ(i+1) · · · anσ(n) .
So, each term of the determinant is multiplied by α. Factoring out α gives the
result.
Proposition 7 (Alternating for columns). If two columns of B are the same, then
det(B) = 0.
Proof. We will give two proofs. The first one is simpler, but only works if the
characteristic of the field F of entries is not 2.
In general, if B ′ is the matrix gotten from B by swapping the two equal columns,
then we already know det(B) = − det(B ′ ). But, we are now in a case where B ′ = B, so
det(B) = − det(B). This implies 2 det(B) = 0, so if the characteristic of F is not
2, det(B) = 0.
Now for the second proof, which works in all characteristics. The idea is to pair
the terms in formula (1) so that each pair has the same absolute value, but with
opposite signs. So, these pairs will cancel out.
If j and k are the columns of B which are the same, then bij = bik for all i. Consider
the transposition which swaps j and k, tj,k . Since tj,k fixes all other elements of
Nn , bi tj,k (ℓ) = biℓ for all ℓ ∈ Nn . So, for each term from formula (1),
b1σ(1) · · · bnσ(n) = b1tj,k (σ(1)) · · · bntj,k (σ(n))
We want to pair off the elements of Sn as σ and tj,k ◦ σ. For this to work,
we need to know that starting with tj,k ◦ σ we get back to σ. But tj,k ◦ (tj,k ◦ σ) =
(tj,k ◦ tj,k ) ◦ σ = σ because the transposition tj,k is its own inverse. We also need
to know that tj,k ◦ σ ≠ σ (so that our pair really has 2 elements in it), but they
can’t be equal because they have opposite signs.
Finally, since sgn(tj,k ) = −1 we have
sgn(σ)b1σ(1) · · ·bnσ(n) + sgn(tj,k ◦ σ)b1tj,k (σ(1)) · · · bntj,k (σ(n))
= sgn(σ)b1σ(1) · · · bnσ(n) + sgn(tj,k )sgn(σ)b1σ(1) · · · bnσ(n)
= sgn(σ)b1σ(1) · · · bnσ(n) − sgn(σ)b1σ(1) · · · bnσ(n)
=0
Since all of the terms are paired off like this, det(B) = 0.
Proposition 8 (Behavior under type III operations). If a type III row operation
is applied to a matrix A to get matrix B, then det(A) = det(B).
Proof. This can now be proven as in the text using linearity in the rows, and the
fact that if two rows are the same, then the determinant is 0.
Even/Oddness of Permutations
Recall, every permutation σ ∈ Sn can be written as a composition σ = τ1 ◦ · · · ◦ τk
where each τj is a transposition. We want to define sgn(σ) = (−1)k , but need to
prove that this is independent of the number of transpositions used.
One way to construct a permutation in Sn is to pick k distinct elements
a1 , . . . , ak ∈ {1, . . . , n} and consider the map which does the following:
a1 ↦ a2 ↦ a3 ↦ · · · ↦ ak ↦ a1
and leaves all other elements in place. Such an element is called a k-cycle. We
use the shorthand notation (a1 , . . . , ak ) for this permutation.
Example. In S4 ,
( 1 2 3 4 )
( 3 1 4 2 ) = (1, 3, 4, 2)   is a 4-cycle

( 1 2 3 4 )
( 3 2 4 1 ) = (1, 3, 4)      is a 3-cycle

( 1 2 3 4 )
( 3 2 1 4 ) = (1, 3)         is a 2-cycle
Any 1-cycle is the identity element. Note, since a cycle is moving elements around
in a circle, we can start at any point. So,
(1, 3, 4, 2) = (3, 4, 2, 1) = (4, 2, 1, 3) = (2, 1, 3, 4).
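This rotation invariance is easy to check mechanically. A small sketch; the helper `cycle_to_map` is our own, representing a permutation of {1, . . . , n} as a dict:

```python
def cycle_to_map(cycle, n):
    """The permutation of {1,...,n} defined by the given cycle, as a dict."""
    m = {x: x for x in range(1, n + 1)}
    for idx, x in enumerate(cycle):
        m[x] = cycle[(idx + 1) % len(cycle)]  # each entry maps to the next one
    return m

base = cycle_to_map((1, 3, 4, 2), 4)
for rotation in [(3, 4, 2, 1), (4, 2, 1, 3), (2, 1, 3, 4)]:
    assert cycle_to_map(rotation, 4) == base  # same permutation
```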
Now not every permutation is a cycle. For example,
( 1 2 3 4 )
( 2 1 4 3 )
swaps 1 and 2 and also swaps 3 and 4. It is not just one cycle, but it is a composition
of two cycles1.
( 1 2 3 4 )
( 2 1 4 3 ) = (1, 2)(3, 4)
These cycles are said to be disjoint because no element is moved by both.
Example. A product of cycles which are not disjoint would be
1 2 3
(1, 2)(2, 3) =
= (1, 2, 3)
2 3 1
Note, permutations are functions, so we compute products (compositions) just as
for functions. In this example, if σ = (1, 2) and τ = (2, 3), then
στ (3) = σ(τ (3)) = σ(2) = 1
Every permutation can be written as a product of disjoint cycles, and the number of these cycles (and the elements in them) is completely determined by the
permutation. Here is how to do it.
Given a permutation σ ∈ Sn , we can pick an element a1 ∈ {1, . . . , n}, and repeatedly apply σ:
a2 = σ(a1 ), a3 = σ(a2 ), . . .
The values come from the finite set {1, . . . , n}, so after at most n + 1 steps, we must
have a repeat ai = aj with i < j. If i > 1, note σ(ai−1 ) = ai = aj = σ(aj−1 ). Since
σ is bijective, it is 1-1, so ai−1 = aj−1 . Repeating this process, we find that the first
repetition is when a1 = ak for some k. Thus we get a (k − 1)-cycle: (a1 , a2 , . . . , ak−1 ).
1Just like with linear transformations, we write the composition of permutations as a product.
We can start this process with any value from {1, . . . , n} and get a cycle, and the
different cycles are either exactly the same, or are disjoint. Doing this until we have
used all elements of {1, . . . , n}, we can write σ as a composition of cycles.
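The procedure just described translates directly into code. A sketch under our own naming, with a permutation represented as a dict on {1, . . . , n}:

```python
def cycle_decomposition(sigma):
    """Disjoint cycles (including 1-cycles) of a permutation given as a dict
    mapping {1,...,n} bijectively to itself."""
    seen = set()
    cycles = []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle = [start]          # begin a new cycle at the smallest unused value
        seen.add(start)
        x = sigma[start]
        while x != start:        # follow start -> sigma(start) -> ... back around
            cycle.append(x)
            seen.add(x)
            x = sigma[x]
        cycles.append(tuple(cycle))
    return cycles

# The permutation (1, 2, 3)(4, 5)(6)(7)(8) in S_8:
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4, 6: 6, 7: 7, 8: 8}
assert cycle_decomposition(sigma) == [(1, 2, 3), (4, 5), (6,), (7,), (8,)]
```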
Note, transpositions are themselves cycles. If we swap i and j, it is the 2-cycle
(i, j).
Now, for a permutation σ ∈ Sn , we write it as a composition of disjoint cycles as
above, σ = c1 c2 · · · cℓ(σ) , including the 1-cycles so that every element from {1, . . . , n}
is used. Then the number of cycles used is denoted ℓ(σ), called the length of σ.
Example. For
    ( 1 2 3 4 5 6 7 8 )
σ = ( 2 3 1 5 4 6 7 8 ) = (1, 2, 3)(4, 5)(6)(7)(8) ∈ S8 ,
ℓ(σ) = 5 because we have 5 cycles.
Example. If i ∈ Sn is the identity function, every value maps to itself, so they are
all 1-cycles:
i = (1)(2)(3)(4)(5) · · · (n)
so ℓ(i) = n.
Example. If τ ∈ Sn is a transposition, there is one pair swapped, and all other
elements are fixed. In other words, there is one 2-cycle, and n − 2 one-cycles. Thus,
ℓ(τ ) = 1 + (n − 2) = n − 1.
Now define
sgn(σ) = (−1)n+ℓ(σ) .
The good news is that this value is well-defined. The bad news is that we still need
to connect it to the earlier idea of the sign of a permutation.
Example. For the example above in S8 , sgn(σ) = (−1)8+5 = −1. For the
identity in Sn , sgn(i) = (−1)n+n = (−1)2n = 1, and for a transposition,
sgn(τ ) = (−1)n+n−1 = (−1)2n−1 = −1.
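That this cycle-count definition matches the inversion-count characterization of sgn can be checked on all of S5 (helper names are ours; permutations here are 0-indexed tuples):

```python
from itertools import permutations

def num_cycles(p):
    """Number of cycles, including fixed points, of a 0-indexed permutation."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start in seen:
            continue
        count += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = p[x]
    return count

def sgn_by_inversions(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

n = 5
for p in permutations(range(n)):
    # sgn(sigma) = (-1)^(n + number of cycles of sigma)
    assert (-1) ** (n + num_cycles(p)) == sgn_by_inversions(p)
```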
To connect this definition to the one we had originally, we need to prove
Proposition 9. If σ ∈ Sn equals a product of the identity times k transpositions,
then (−1)n+ℓ(σ) = (−1)k .
Proof. We do this by induction on k. The base case is when k = 0, so (−1)k = 1.
But then σ = i is the identity, and as we have already seen, (−1)n+ℓ(i) = (−1)2n = 1,
so the conclusion holds. Note, we could also start with k = 1, the case of one
transposition, which was also addressed in an example above.
For the inductive step, suppose
σ = τ1 τ2 · · · τk+1 = τ1 µ
where µ is the remaining product. By the inductive hypothesis, since µ is written
as a product of k transpositions, (−1)n+ℓ(µ) = (−1)k . So, it is enough to show that
ℓ(σ) = ℓ(τ1 µ) = ℓ(µ) ± 1 .
Then, both sides of the equation flip signs, and stay equal.
Since τ1 is a transposition, τ1 = (a, b) for some a, b ∈ {1, . . . , n}, and a ≠ b. There
are two cases. First, suppose a and b are both in the same cycle of µ. Then
that cycle has the form (a1 , . . . , ar ) where a = ai and b = aj for some i, j. Since
(a, b) = (b, a), we can assume i < j. But then
(ai , aj )(a1 , . . . , ar ) = (a1 , . . . , ai−1 , aj , . . . , ar )(ai , . . . , aj−1 )
So, in this case, multiplying by a transposition broke the one cycle into two,
increasing the number of cycles by one (no other cycle is affected because they do
not involve a or b).
The other case is when a and b are in different disjoint cycles for µ. WLOG, those
cycles can be written (a1 , . . . , ar ) and (b1 , . . . , bs ) with a = a1 and b = b1 . Then,
(a1 , b1 )(a1 , . . . , ar )(b1 , . . . , bs ) = (a1 , . . . , ar , b1 , . . . , bs ).
So, the number of disjoint cycles goes down by one, completing the proof.
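The two cycle identities used in the proof can be verified mechanically for sample choices of the parameters. A sketch with our own helpers, with permutations as dicts on {1, . . . , n} composed right to left:

```python
def cycle_to_map(cycle, n):
    """The permutation of {1,...,n} defined by the given cycle, as a dict."""
    m = {x: x for x in range(1, n + 1)}
    for idx, x in enumerate(cycle):
        m[x] = cycle[(idx + 1) % len(cycle)]
    return m

def compose(f, g):
    """(f ∘ g)(x) = f(g(x))."""
    return {x: f[g[x]] for x in g}

n = 7
# Case 1 with r = 5, i = 2, j = 4 and a_1,...,a_5 = 1,...,5:
# (a_i, a_j)(a_1,...,a_r) = (a_1,...,a_{i-1}, a_j,...,a_r)(a_i,...,a_{j-1})
lhs = compose(cycle_to_map((2, 4), n), cycle_to_map((1, 2, 3, 4, 5), n))
rhs = compose(cycle_to_map((1, 4, 5), n), cycle_to_map((2, 3), n))
assert lhs == rhs  # one cycle splits into two

# Case 2 with r = 3, s = 3:
# (a_1, b_1)(a_1,...,a_r)(b_1,...,b_s) = (a_1,...,a_r, b_1,...,b_s)
lhs = compose(cycle_to_map((1, 4), n),
              compose(cycle_to_map((1, 2, 3), n), cycle_to_map((4, 5, 6), n)))
assert lhs == cycle_to_map((1, 2, 3, 4, 5, 6), n)  # two cycles merge into one
```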