Roundoff Analysis of Gaussian Elimination

Jim Lambers
MAT 610
Summer Session 2009-10
Lecture 5 Notes
These notes correspond to Sections 3.3 and 3.4 in the text.
Roundoff Analysis of Gaussian Elimination
In this section, we will perform a detailed error analysis of Gaussian elimination, illustrating the
analysis originally carried out by J. H. Wilkinson. The process of solving 𝐴x = b consists of three
stages:
¯π‘ˆ
¯.
1. Factoring 𝐴 = πΏπ‘ˆ , resulting in an approximate πΏπ‘ˆ decomposition 𝐴 + 𝐸 = 𝐿
2. Solving 𝐿y = b, or, numerically, computing y such that
¯ + 𝛿 𝐿)(y
¯
(𝐿
+ 𝛿y) = b
3. Solving π‘ˆ x = y, or, numerically, computing x such that
¯ + π›Ώπ‘ˆ
¯ )(x + 𝛿x) = y + 𝛿y.
(π‘ˆ
Combining these stages, we see that
$$\begin{aligned}
\mathbf{b} &= (\bar{L} + \delta\bar{L})(\bar{U} + \delta\bar{U})(\mathbf{x} + \delta\mathbf{x}) \\
&= (\bar{L}\bar{U} + \delta\bar{L}\,\bar{U} + \bar{L}\,\delta\bar{U} + \delta\bar{L}\,\delta\bar{U})(\mathbf{x} + \delta\mathbf{x}) \\
&= (A + E + \delta\bar{L}\,\bar{U} + \bar{L}\,\delta\bar{U} + \delta\bar{L}\,\delta\bar{U})(\mathbf{x} + \delta\mathbf{x}) \\
&= (A + \delta A)(\mathbf{x} + \delta\mathbf{x}),
\end{aligned}$$
where $\delta A = E + \delta\bar{L}\,\bar{U} + \bar{L}\,\delta\bar{U} + \delta\bar{L}\,\delta\bar{U}$.
In this analysis, we will view the computed solution $\bar{\mathbf{x}} = \mathbf{x} + \delta\mathbf{x}$ as the exact solution to the perturbed problem $(A + \delta A)\mathbf{x} = \mathbf{b}$. This perspective is the idea behind backward error analysis, which we will use to determine the size of the perturbation $\delta A$ and, eventually, arrive at a bound for the error in the computed solution $\bar{\mathbf{x}}$.
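To make the backward-error perspective concrete, here is a small numerical experiment in Python with NumPy and SciPy (the notes reference no software, so the choice of library is an assumption). It factors a random matrix, measures the factorization error $E$, and computes a relative residual for the computed solution:

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
n = 100
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# SciPy's lu pivots internally, so it factors PA rather than A,
# but the backward-error idea is the same: P L U = A + E.
P, L, U = lu(A)
E = P @ L @ U - A
print("||E||_inf / ||A||_inf =",
      np.linalg.norm(E, np.inf) / np.linalg.norm(A, np.inf))

# The computed solution x_bar exactly solves (A + dA) x_bar = b for some
# small dA; the relative residual below is a computable witness of that.
x_bar = np.linalg.solve(A, b)
r = b - A @ x_bar
print("relative residual =",
      np.linalg.norm(r, np.inf)
      / (np.linalg.norm(A, np.inf) * np.linalg.norm(x_bar, np.inf)))
```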
Error in the $LU$ Factorization
Let 𝐴(π‘˜) denote the matrix 𝐴 after π‘˜ βˆ’ 1 steps of Gaussian elimination have been performed, where
a step denotes the process of making all elements below the diagonal within a particular column
1
equal to zero. Then, in exact arithmetic, the elements of 𝐴(π‘˜+1) are given by
(π‘˜)
(π‘˜+1)
π‘Žπ‘–π‘—
=
(π‘˜)
π‘Žπ‘–π‘—
βˆ’
(π‘˜)
π‘šπ‘–π‘˜ π‘Žπ‘˜π‘— ,
π‘šπ‘–π‘˜ =
π‘Žπ‘–π‘˜
(π‘˜)
.
(1)
π‘Žπ‘˜π‘˜
Let 𝐡 (π‘˜) denote the matrix 𝐴 after π‘˜ βˆ’ 1 steps of Gaussian elimination have been performed using
floating-point arithmetic. Then the elements of 𝐡 (π‘˜+1) are
( (π‘˜) )
π‘π‘–π‘˜
(π‘˜+1)
(π‘˜)
(π‘˜)
(π‘˜+1)
𝑏𝑖𝑗
= 𝑏𝑖𝑗 βˆ’ π‘ π‘–π‘˜ π‘π‘˜π‘— + πœ–π‘–π‘— , π‘ π‘–π‘˜ = 𝑓 𝑙
.
(2)
(π‘˜)
π‘π‘˜π‘˜
For $j \ge i$, we have
$$\begin{aligned}
b_{ij}^{(2)} &= b_{ij}^{(1)} - s_{i1} b_{1j}^{(1)} + \epsilon_{ij}^{(2)} \\
b_{ij}^{(3)} &= b_{ij}^{(2)} - s_{i2} b_{2j}^{(2)} + \epsilon_{ij}^{(3)} \\
&\ \,\vdots \\
b_{ij}^{(i)} &= b_{ij}^{(i-1)} - s_{i,i-1} b_{i-1,j}^{(i-1)} + \epsilon_{ij}^{(i)}.
\end{aligned}$$
Combining these equations yields
$$\sum_{k=2}^{i} b_{ij}^{(k)} = \sum_{k=1}^{i-1} b_{ij}^{(k)} - \sum_{k=1}^{i-1} s_{ik} b_{kj}^{(k)} + \sum_{k=2}^{i} \epsilon_{ij}^{(k)}.$$
Cancelling terms, we obtain
$$b_{ij}^{(1)} = b_{ij}^{(i)} + \sum_{k=1}^{i-1} s_{ik} b_{kj}^{(k)} - e_{ij}, \qquad j \ge i, \tag{3}$$
where $e_{ij} = \sum_{k=2}^{i} \epsilon_{ij}^{(k)}$.
For $i > j$,
$$\begin{aligned}
b_{ij}^{(2)} &= b_{ij}^{(1)} - s_{i1} b_{1j}^{(1)} + \epsilon_{ij}^{(2)} \\
&\ \,\vdots \\
b_{ij}^{(j)} &= b_{ij}^{(j-1)} - s_{i,j-1} b_{j-1,j}^{(j-1)} + \epsilon_{ij}^{(j)},
\end{aligned}$$
where $s_{ij} = fl(b_{ij}^{(j)}/b_{jj}^{(j)}) = (b_{ij}^{(j)}/b_{jj}^{(j)})(1 + \eta_{ij})$, and therefore
$$\begin{aligned}
0 &= b_{ij}^{(j)} - s_{ij} b_{jj}^{(j)} + b_{ij}^{(j)} \eta_{ij} \\
&= b_{ij}^{(j)} - s_{ij} b_{jj}^{(j)} + \epsilon_{ij}^{(j+1)} \\
&= b_{ij}^{(1)} - \sum_{k=1}^{j} s_{ik} b_{kj}^{(k)} + e_{ij},
\end{aligned} \tag{4}$$
where, in this case, $e_{ij} = \sum_{k=2}^{j+1} \epsilon_{ij}^{(k)}$.
From (3) and (4), we obtain
$$\bar{L}\bar{U} = \begin{bmatrix} 1 & & & \\ s_{21} & 1 & & \\ \vdots & & \ddots & \\ s_{n1} & \cdots & \cdots & 1 \end{bmatrix}
\begin{bmatrix} b_{11}^{(1)} & b_{12}^{(1)} & \cdots & b_{1n}^{(1)} \\ & b_{22}^{(2)} & \cdots & b_{2n}^{(2)} \\ & & \ddots & \vdots \\ & & & b_{nn}^{(n)} \end{bmatrix} = A + E,$$
where
$$s_{ik} = fl(b_{ik}^{(k)}/b_{kk}^{(k)}) = \frac{b_{ik}^{(k)}}{b_{kk}^{(k)}}(1 + \eta_{ik}), \qquad |\eta_{ik}| \le \mathbf{u}.$$
Then,
$$fl(s_{ik} b_{kj}^{(k)}) = s_{ik} b_{kj}^{(k)}(1 + \theta_{ij}^{(k)}), \qquad |\theta_{ij}^{(k)}| \le \mathbf{u},$$
and so,
$$\begin{aligned}
b_{ij}^{(k+1)} &= fl\big(b_{ij}^{(k)} - s_{ik} b_{kj}^{(k)}(1 + \theta_{ij}^{(k)})\big) \\
&= \big(b_{ij}^{(k)} - s_{ik} b_{kj}^{(k)}(1 + \theta_{ij}^{(k)})\big)(1 + \varphi_{ij}^{(k)}), \qquad |\varphi_{ij}^{(k)}| \le \mathbf{u}.
\end{aligned}$$
After applying (2), and performing some manipulations, we obtain
$$\epsilon_{ij}^{(k+1)} = b_{ij}^{(k+1)} \left( \frac{\varphi_{ij}^{(k)}}{1 + \varphi_{ij}^{(k)}} \right) - s_{ik} b_{kj}^{(k)} \theta_{ij}^{(k)}.$$
We then have the following bound for the entries of $E$:
$$|E| \le (1 + \ell) G a \mathbf{u}
\begin{bmatrix}
0 & 0 & 0 & \cdots & \cdots & 0 \\
1 & 1 & 1 & \cdots & \cdots & 1 \\
1 & 2 & 2 & \cdots & \cdots & 2 \\
1 & 2 & 3 & \cdots & \cdots & 3 \\
\vdots & \vdots & \vdots & \ddots & & \vdots \\
1 & 2 & 3 & \cdots & n-1 & n-1
\end{bmatrix} + O(\mathbf{u}^2),$$
where $\ell = \max_{i,j} |s_{ij}|$, $a = \max_{i,j} |a_{ij}|$, and $G$ is the growth factor defined by
$$G = \frac{\max_{i,j,k} |b_{ij}^{(k)}|}{\max_{i,j} |a_{ij}|}.$$
The entries of the above matrix indicate how many of the $\epsilon_{ij}^{(k)}$ are included in each entry $e_{ij}$ of $E$.
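The growth factor is easy to measure experimentally. Here is an illustrative Python sketch (not part of the notes) that tracks the largest intermediate entry during elimination without pivoting, checked against the classic worst-case matrix:

```python
import numpy as np

def growth_factor(A):
    """Growth factor G = max_{i,j,k} |b_ij^(k)| / max_{i,j} |a_ij| for
    Gaussian elimination without pivoting."""
    B = np.array(A, dtype=float)
    n = B.shape[0]
    max_b = np.abs(B).max()                      # k = 1: B^(1) = A
    for k in range(n - 1):
        for i in range(k + 1, n):
            B[i, k:] -= (B[i, k] / B[k, k]) * B[k, k:]
        max_b = max(max_b, np.abs(B).max())      # track entries of B^(k+1)
    return max_b / np.abs(A).max()

# The classic worst case: 1 on the diagonal, -1 below, 1 in the last column.
# (Here no pivoting is needed, so this also exhibits worst-case growth for
# partial pivoting.)
n = 8
W = np.eye(n) - np.tril(np.ones((n, n)), -1)
W[:, -1] = 1.0
print(growth_factor(W))                          # prints 2**(n-1) = 128.0
```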
Bounding the perturbation in $A$
From a roundoff analysis of forward and back substitution, which we do not reproduce here, we have the bounds
$$\max_{i,j} |\delta \bar{L}_{ij}| \le n \mathbf{u} \ell + O(\mathbf{u}^2), \qquad \max_{i,j} |\delta \bar{U}_{ij}| \le n \mathbf{u} G a + O(\mathbf{u}^2),$$
where $a = \max_{i,j} |a_{ij}|$, $\ell = \max_{i,j} |\bar{L}_{ij}|$, and $G$ is the growth factor. Putting our bounds together, we have
$$\begin{aligned}
\max_{i,j} |\delta A_{ij}| &\le \max_{i,j} |e_{ij}| + \max_{i,j} |(\bar{L}\,\delta\bar{U})_{ij}| + \max_{i,j} |(\delta\bar{L}\,\bar{U})_{ij}| + \max_{i,j} |(\delta\bar{L}\,\delta\bar{U})_{ij}| \\
&\le n(1 + \ell) G a \mathbf{u} + n^2 \ell G a \mathbf{u} + n^2 \ell G a \mathbf{u} + O(\mathbf{u}^2),
\end{aligned}$$
from which it follows that
$$\|\delta A\|_\infty \le n^2 (2n\ell + \ell + 1) G a \mathbf{u} + O(\mathbf{u}^2).$$
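One can check numerically how pessimistic this bound tends to be. A sketch, reusing gaussian_elimination and growth_factor from the snippets above (here $\mathbf{u}$ is the unit roundoff of IEEE double precision):

```python
import numpy as np

u = np.finfo(np.float64).eps / 2             # unit roundoff for IEEE double

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))

B = gaussian_elimination(A)                  # from the earlier sketch
L = np.tril(B, -1) + np.eye(n)
U = np.triu(B)
ell = np.abs(L).max()                        # ell = max_{i,j} |L_ij|
G = growth_factor(A)                         # from the earlier sketch
a = np.abs(A).max()

bound = n**2 * (2 * n * ell + ell + 1) * G * a * u
actual = np.linalg.norm(L @ U - A, np.inf)   # ||E||_inf, one piece of ||dA||_inf
print(f"bound = {bound:.2e}   actual ||E||_inf = {actual:.2e}")
```

Typically the observed error is orders of magnitude below the bound; worst-case roundoff bounds of this kind are rarely attained.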
Bounding the error in the solution
Let $\bar{\mathbf{x}} = \mathbf{x} + \delta\mathbf{x}$ be the computed solution. Since the exact solution to $A\mathbf{x} = \mathbf{b}$ is given by $\mathbf{x} = A^{-1}\mathbf{b}$, we are also interested in examining $\bar{\mathbf{x}} = (A + \delta A)^{-1}\mathbf{b}$. Can we say something about $\|(A + \delta A)^{-1}\mathbf{b} - A^{-1}\mathbf{b}\|$?
We assume that βˆ₯π΄βˆ’1 𝛿𝐴βˆ₯ = π‘Ÿ < 1. We have
𝐴 + 𝛿𝐴 = 𝐴(𝐼 + π΄βˆ’1 𝛿𝐴) = 𝐴(𝐼 βˆ’ 𝐹 ),
𝐹 = βˆ’π΄βˆ’1 𝛿𝐴.
From the manipulations
$$\begin{aligned}
(A + \delta A)^{-1}\mathbf{b} - A^{-1}\mathbf{b} &= (I + A^{-1}\delta A)^{-1} A^{-1}\mathbf{b} - A^{-1}\mathbf{b} \\
&= (I + A^{-1}\delta A)^{-1} \big(A^{-1} - (I + A^{-1}\delta A) A^{-1}\big)\mathbf{b} \\
&= (I + A^{-1}\delta A)^{-1} \big({-A^{-1}} (\delta A) A^{-1}\big)\mathbf{b},
\end{aligned}$$
and this result from Lecture 2,
$$\|(I - F)^{-1}\| \le \frac{1}{1 - r},$$
we obtain
$$\begin{aligned}
\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} &= \frac{\|(\mathbf{x} + \delta\mathbf{x}) - \mathbf{x}\|}{\|\mathbf{x}\|} \\
&= \frac{\|(A + \delta A)^{-1}\mathbf{b} - A^{-1}\mathbf{b}\|}{\|A^{-1}\mathbf{b}\|} \\
&\le \frac{1}{1 - r} \|A^{-1}\| \|\delta A\| \\
&\le \frac{\kappa(A)}{1 - \|A^{-1}\delta A\|} \frac{\|\delta A\|}{\|A\|} \\
&\le \frac{\kappa(A)}{1 - \kappa(A)\dfrac{\|\delta A\|}{\|A\|}} \frac{\|\delta A\|}{\|A\|}.
\end{aligned}$$
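The final bound can be observed directly: perturb $A$ by a small random $\delta A$ and compare the actual relative change in the solution with the bound. A sketch (the perturbation size $10^{-10}$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
dA = 1e-10 * rng.standard_normal((n, n))

x = np.linalg.solve(A, b)
x_pert = np.linalg.solve(A + dA, b)          # exact solution of the perturbed problem

kappa = np.linalg.cond(A, np.inf)
rel = np.linalg.norm(dA, np.inf) / np.linalg.norm(A, np.inf)
bound = kappa * rel / (1 - kappa * rel)      # valid when kappa * rel < 1
actual = np.linalg.norm(x_pert - x, np.inf) / np.linalg.norm(x, np.inf)
print(f"kappa(A) = {kappa:.2e}   bound = {bound:.2e}   actual = {actual:.2e}")
```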
Note that a similar conclusion is reached if we assume that the computed solution $\bar{\mathbf{x}}$ solves a nearby problem in which both $A$ and $\mathbf{b}$ are perturbed, rather than just $A$.
We see that the important factors in the accuracy of the computed solution are
∙ The growth factor $G$
∙ The size of the multipliers $m_{ik}$, bounded by $\ell$
∙ The condition number $\kappa(A)$
∙ The precision $\mathbf{u}$
In particular, πœ…(𝐴) must be large with respect to the accuracy in order to be troublesome. For
example, consider the scenario where πœ…(𝐴) = 102 and u = 10βˆ’3 , as opposed to the case where
πœ…(𝐴) = 102 and u = 10βˆ’50 . However, it is important to note that even if 𝐴 is well-conditioned, the
error in the solution can still be very large, if 𝐺 and β„“ are large.
Pivoting
During Gaussian elimination, it is necessary to interchange rows of the augmented matrix whenever
the diagonal element of the column currently being processed, known as the pivot element, is equal
to zero.
However, if we examine the main step in Gaussian elimination,
$$a_{ik}^{(j+1)} = a_{ik}^{(j)} - m_{ij} a_{jk}^{(j)},$$
we can see that any roundoff error in the computation of $a_{jk}^{(j)}$ is amplified by $m_{ij}$. Because the multipliers can be arbitrarily large, it follows from the previous analysis that the error in the computed solution can be arbitrarily large, meaning that Gaussian elimination is numerically unstable.
It is therefore helpful to ensure that the multipliers are small. This can be accomplished by performing row interchanges, or pivoting, even when it is not absolutely necessary to do so for elimination to proceed.
Partial Pivoting
One approach is called partial pivoting. When eliminating elements in column $j$, we seek the largest element in column $j$, on or below the main diagonal, and then interchange that element's row with row $j$. That is, we find an integer $p$, $j \le p \le n$, such that
$$|a_{pj}| = \max_{j \le i \le n} |a_{ij}|.$$
Then, we interchange rows $p$ and $j$.
In view of the definition of the multiplier, $m_{ij} = a_{ij}^{(j)}/a_{jj}^{(j)}$, it follows that $|m_{ij}| \le 1$ for $j = 1, \ldots, n-1$ and $i = j+1, \ldots, n$. Furthermore, while pivoting in this manner requires $O(n^2)$ comparisons to determine the appropriate row interchanges, that extra expense is negligible compared to the overall cost of Gaussian elimination, and is outweighed by the potential reduction in roundoff error. We note that when partial pivoting is used, the growth factor $G$ satisfies $G \le 2^{n-1}$, where $A$ is $n \times n$, and this bound is attainable.
Complete Pivoting
While partial pivoting helps to control the propagation of roundoff error, loss of significant digits can still result if, in the abovementioned main step of Gaussian elimination, $m_{ij} a_{jk}^{(j)}$ is much larger in magnitude than $a_{ij}^{(j)}$. Even though $m_{ij}$ is not large, this can still occur if $a_{jk}^{(j)}$ is particularly large.
Complete pivoting entails finding integers $p$ and $q$ such that
$$|a_{pq}| = \max_{j \le i \le n,\ j \le k \le n} |a_{ik}|,$$
and then using both row and column interchanges to move $a_{pq}$ into the pivot position in row $j$ and column $j$. It has been proven that this is an effective strategy for ensuring that Gaussian elimination is backward stable: it prevents the entries of the matrix from growing exponentially as they are updated by elementary row operations, growth that would otherwise cause undue amplification of roundoff error.
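Relative to partial pivoting, only the pivot search changes. A sketch of the selection step over the trailing submatrix:

```python
import numpy as np

def complete_pivot(B, j):
    """Return (p, q) with |b_pq| maximal over the trailing submatrix B[j:, j:];
    complete pivoting then swaps rows j, p and columns j, q."""
    sub = np.abs(B[j:, j:])
    p, q = np.unravel_index(np.argmax(sub), sub.shape)
    return j + p, j + q
```

Note that the column interchanges permute the unknowns, so they must be undone when the solution is recovered; this bookkeeping is one reason complete pivoting is used less often than partial pivoting.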
The πΏπ‘ˆ Decomposition with Pivoting
Suppose that pivoting is performed during Gaussian elimination. Then, if row 𝑗 is interchanged
with row 𝑝, for 𝑝 > 𝑗, before entries in column 𝑗 are eliminated, the matrix 𝐴(𝑗) is effectively
multiplied by a permutation matrix 𝑃 (𝑗) . A permutation matrix is a matrix obtained by permuting
the rows (or columns) of the identity matrix 𝐼. In 𝑃 (𝑗) , rows 𝑗 and 𝑝 of 𝐼 are interchanged, so that
multiplying 𝐴(𝑗) on the left by 𝑃 (𝑗) interchanges these rows of 𝐴(𝑗) . It follows that the process of
Gaussian elimination with pivoting can be described in terms of the matrix multiplications
𝑀 (π‘›βˆ’1) 𝑃 (π‘›βˆ’1) 𝑀 (π‘›βˆ’2) 𝑃 (π‘›βˆ’2) β‹… β‹… β‹… 𝑀 (1) 𝑃 (1) 𝐴 = π‘ˆ,
where 𝑃 (π‘˜) = 𝐼 if no interchange is performed before eliminating entries in column π‘˜.
However, because each permutation matrix 𝑃 (π‘˜) at most interchanges row π‘˜ with row 𝑝, where
𝑝 > π‘˜, there is no difference between applying all of the row interchanges β€œup front”, instead of
applying 𝑃 (π‘˜) immediately before applying 𝑀 (π‘˜) for each π‘˜. It follows that
[𝑀 (π‘›βˆ’1) 𝑀 (π‘›βˆ’2) β‹… β‹… β‹… 𝑀 (1) ][𝑃 (π‘›βˆ’1) 𝑃 (π‘›βˆ’2) β‹… β‹… β‹… 𝑃 (1) ]𝐴 = π‘ˆ,
and because a product of permutation matrices is a permutation matrix, we have
𝑃 𝐴 = πΏπ‘ˆ,
where 𝐿 is defined as before, and 𝑃 = 𝑃 (π‘›βˆ’1) 𝑃 (π‘›βˆ’2) β‹… β‹… β‹… 𝑃 (1) . This decomposition exists for any
nonsingular matrix 𝐴.
Once the πΏπ‘ˆ decomposition 𝑃 𝐴 = πΏπ‘ˆ has been computed, we can solve the system 𝐴x = b
by first noting that if x is the solution, then
𝑃 𝐴x = πΏπ‘ˆ x = 𝑃 b.
Therefore, we can obtain x by first solving the system 𝐿y = 𝑃 b, and then solving π‘ˆ x = y. Then,
if b should change, then only these last two systems need to be solved in order to obtain the new
solution; as in the case of Gaussian elimination without pivoting, the πΏπ‘ˆ decomposition does not
need to be recomputed.
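In code this corresponds to factoring once and reusing the factors. A sketch with SciPy's packed-factorization routines:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(5)
n = 500
A = rng.standard_normal((n, n))

# Factor once: lu_factor returns the combined L and U factors of PA
# plus the pivot indices representing P.
factors = lu_factor(A)

# Each new right-hand side costs only two triangular solves.
for _ in range(3):
    b = rng.standard_normal(n)
    x = lu_solve(factors, b)                 # solves L y = P b, then U x = y
    assert np.allclose(A @ x, b)
```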
Practical Computation of Determinants, Revisited
When Gaussian elimination is used without pivoting to obtain the factorization $A = LU$, we have $\det(A) = \det(U)$, because $\det(L) = 1$ due to $L$ being unit lower triangular. When pivoting is used, we have the factorization $PA = LU$, where $P$ is a permutation matrix. Because a permutation matrix is orthogonal, that is, $P^T P = I$, and $\det(A) = \det(A^T)$ for any square matrix, it follows that $\det(P)^2 = 1$, or $\det(P) = \pm 1$. Therefore, $\det(A) = \pm \det(U)$, where the sign depends on the sign of $\det(P)$.
To determine this sign, we note that when two rows (or columns) of $A$ are interchanged, the sign of the determinant changes. Therefore, $\det(P) = (-1)^p$, where $p$ is the number of row interchanges performed during Gaussian elimination; $(-1)^p$ is the sign of the permutation represented by $P$ that determines the final ordering of the rows. We conclude that
$$\det(A) = (-1)^p \det(U).$$