Solving Linear Systems Using Gaussian Elimination

How can we solve Ax = b?
Linear algebra

A typical linear system of equations:

\[
\begin{aligned}
5x_1 - x_2 + 2x_3 &= 7 \\
-2x_1 + 6x_2 + 9x_3 &= 0 \\
-7x_1 + 5x_2 - 3x_3 &= 5
\end{aligned}
\]

The variables x_1, x_2, and x_3 appear only as linear terms (no powers or products).
Linear algebra

Where do linear systems come from?

• Fitting curves to data
• Polynomial approximation to functions
• Computational fluid dynamics
• Network flow
• Computer graphics
• Difference equations
• Differential equations
• Dynamical systems theory
• ...
Typical linear system

How does Matlab solve linear systems such as:

\[
\begin{aligned}
5x_1 - x_2 + 2x_3 &= 7 \\
-2x_1 + 6x_2 + 9x_3 &= 0 \\
-7x_1 + 5x_2 - 3x_3 &= 5
\end{aligned}
\]

• Does such a system always have a solution?
• Can such a system be solved efficiently for millions of equations?
• What does Matlab do if we have more equations than unknowns? More unknowns than equations?
Solving linear systems
We are already familiar with at least one type of linear
system
3x1 + 5x2 = 3
2x1 4x2 = 1
The solution is the intersection of the two lines represented by each equation. This solution is a point
(x1 , x2 ) that satisfies both equations simultaneously.
We could also view the solution as providing the correct
linear combination of vectors (3, 2) and (5, 4) that
give us (3, 1).



3
5
3
x1
+ x2
=
2
4
1
5
Linear algebra - a 2x2 system

We can row-reduce an augmented matrix to find the solution. Use an elementary row operation to produce a "0" in the lower left corner:

\[
\left[\begin{array}{rr|r} 3 & 5 & 3 \\ 2 & -4 & 1 \end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{rr|r} 3 & 5 & 3 \\ 0 & -\frac{22}{3} & -1 \end{array}\right]
\qquad \text{(eqn 2)} - \left(\frac{2}{3}\right)\text{(eqn 1)}
\]

Use back-substitution to solve first for x_2 and then for x_1:

\[
x_2 = \left(-\frac{3}{22}\right)(-1) = \frac{3}{22},
\qquad
x_1 = \frac{1}{3}\left(3 - 5x_2\right) = \frac{1}{3}\left(3 - 5\left(\frac{3}{22}\right)\right) = \frac{17}{22}
\]

Solution: x_1 = 17/22, x_2 = 3/22
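As a sanity check, here is a minimal sketch of the same row reduction and back-substitution in Python with NumPy (my own illustration; the slides use Matlab, where the backslash operator `A\b` does the equivalent job):

```python
import numpy as np

# Augmented matrix [A | b] for  3x1 + 5x2 = 3,  2x1 - 4x2 = 1
M = np.array([[3.0,  5.0, 3.0],
              [2.0, -4.0, 1.0]])

# Elementary row operation: (eqn 2) - (2/3)(eqn 1) zeros the lower-left entry
m21 = M[1, 0] / M[0, 0]           # multiplier 2/3
M[1, :] -= m21 * M[0, :]          # row 2 becomes [0, -22/3, -1]

# Back-substitution: solve for x2 first, then x1
x2 = M[1, 2] / M[1, 1]            # (-1)/(-22/3) = 3/22
x1 = (M[0, 2] - M[0, 1] * x2) / M[0, 0]   # (3 - 5*x2)/3 = 17/22

print(x1, x2)                     # 17/22 = 0.7727..., 3/22 = 0.1363...
```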
Linear algebra - a 2x2 system

[Figure: two views of the solution (17/22, 3/22). Left panel: the solution as the intersection of the two lines. Right panel: the solution as the linear combination x_1(3, 2) + x_2(5, -4) of vectors reaching (3, 1).]
Gaussian Elimination

\[
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
-2 & 6 & 9 & 0 \\
-7 & 5 & -3 & 5
\end{array}\right]
\]

• Idea: Reduce the system to upper triangular form.
• Technique: Apply elementary row operations to the augmented matrix to zero out entries below the diagonal.
Gaussian Elimination

First, zero out the entries below the first pivot:

\[
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
-2 & 6 & 9 & 0 \\
-7 & 5 & -3 & 5
\end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
0 & \frac{28}{5} & \frac{49}{5} & \frac{14}{5} \\
0 & \frac{18}{5} & -\frac{1}{5} & \frac{74}{5}
\end{array}\right]
\qquad
\begin{aligned}
&\text{(eqn 2)} - \left(-\frac{2}{5}\right)\text{(eqn 1)} \\
&\text{(eqn 3)} - \left(-\frac{7}{5}\right)\text{(eqn 1)}
\end{aligned}
\]

Then zero out the entry below the second pivot:

\[
\longrightarrow\;
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
0 & \frac{28}{5} & \frac{49}{5} & \frac{14}{5} \\
0 & 0 & -\frac{65}{10} & \frac{65}{5}
\end{array}\right]
\qquad \text{(eqn 3)} - \left(\frac{9}{14}\right)\text{(eqn 2)}
\]

• Idea: Reduce the system to upper triangular form.
• Technique: Apply elementary row operations to the augmented matrix to zero out entries below the diagonal.
Solve upper triangular system for x

\[
\left[\begin{array}{rrr}
5 & -1 & 2 \\
0 & \frac{28}{5} & \frac{49}{5} \\
0 & 0 & -\frac{65}{10}
\end{array}\right]
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 7 \\ \frac{14}{5} \\ \frac{65}{5} \end{bmatrix}
\]

Notice that the right hand side is not the right hand side of the original system.

Use back-substitution to solve for x, where each step uses values known from the previous steps:

\[
\begin{aligned}
\text{Step 1:}\quad x_3 &= \left(-\frac{10}{65}\right)\left(\frac{65}{5}\right) = -2 \\
\text{Step 2:}\quad x_2 &= \left(\frac{5}{28}\right)\left(\frac{14}{5} - \frac{49}{5}\,x_3\right) = 4 \\
\text{Step 3:}\quad x_1 &= \left(\frac{1}{5}\right)\bigl(7 - (-1)x_2 - (2)x_3\bigr) = 3
\end{aligned}
\]
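The whole procedure is short enough to write out. Below is a minimal Python/NumPy sketch of Gaussian elimination without pivoting followed by back-substitution (the function name `gauss_solve` is mine, not from the slides); it reproduces the solution (3, 4, -2) for the example system:

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination (no pivoting) + back-substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination: zero out the entries below each pivot
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # multiplier
            A[i, k:] -= m * A[k, k:]     # row operation on the matrix...
            b[i] -= m * b[k]             # ...and on the right hand side
    # Back-substitution on the upper triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[5, -1, 2], [-2, 6, 9], [-7, 5, -3]])
b = np.array([7, 0, 5])
print(gauss_solve(A, b))   # [ 3.  4. -2.]
```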
Some notation - pivots and multipliers

Reduce the system to upper triangular form:

\[
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
-2 & 6 & 9 & 0 \\
-7 & 5 & -3 & 5
\end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
0 & \frac{28}{5} & \frac{49}{5} & \frac{14}{5} \\
0 & \frac{18}{5} & -\frac{1}{5} & \frac{74}{5}
\end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{rrr|r}
5 & -1 & 2 & 7 \\
0 & \frac{28}{5} & \frac{49}{5} & \frac{14}{5} \\
0 & 0 & -\frac{65}{10} & \frac{65}{5}
\end{array}\right]
\]

The "pivots" are the diagonal entries of the final upper triangular matrix: 5, 28/5, and -65/10. The "multipliers" are the numbers used in the row operations: -2/5 for (eqn 2) - (-2/5)(eqn 1), -7/5 for (eqn 3) - (-7/5)(eqn 1), and 9/14 for (eqn 3) - (9/14)(eqn 2).
Cost of Gaussian elimination

The cost of eliminating entries below the 1st pivot (the (1,1) entry):

\[
\begin{bmatrix}
X & X & X & X & X \\
X & X & X & X & X \\
X & X & X & X & X \\
X & X & X & X & X \\
X & X & X & X & X
\end{bmatrix}
\]

Only consider the matrix, not the augmented right hand side.

Each row operation costs n multiplications (including the cost of computing the multiplier) and n - 1 subtractions.

Total for one row: n + (n - 1) = 2n - 1
Total for n - 1 rows: (n - 1)(2n - 1) = 2n^2 - 3n + 1
Cost of elimination

The cost of eliminating entries below the 2nd pivot:

\[
\begin{bmatrix}
X & X & X & X & X \\
0 & X & X & X & X \\
0 & X & X & X & X \\
0 & X & X & X & X \\
0 & X & X & X & X
\end{bmatrix}
\]

To "zero out" the column below the second pivot, we must do approximately 2(n-1)^2 = 2(4)^2 multiplications and subtractions.
Cost of elimination

The cost of eliminating entries below the 3rd pivot:

\[
\begin{bmatrix}
X & X & X & X & X \\
0 & X & X & X & X \\
0 & 0 & X & X & X \\
0 & 0 & X & X & X \\
0 & 0 & X & X & X
\end{bmatrix}
\]

To "zero out" the column below the third pivot, we must do approximately 2(n-2)^2 = 2(3)^2 multiplications and subtractions.
Cost of elimination

The cost of eliminating entries below the 4th pivot:

\[
\begin{bmatrix}
X & X & X & X & X \\
0 & X & X & X & X \\
0 & 0 & X & X & X \\
0 & 0 & 0 & X & X \\
0 & 0 & 0 & X & X
\end{bmatrix}
\]

To "zero out" the column below the fourth pivot, we must do approximately 2(n-3)^2 = 2(2)^2 multiplications and subtractions.
Cost of elimination

The total number of multiplications is then:

\[
2 \sum_{k=1}^{n-1} k^2 = 2\,\frac{(n-1)(n)(2n-1)}{6} \approx \frac{2}{3}n^3
\]

where we ignore lower order powers of n.

• We say that "Gaussian elimination is an O(n^3) process".
• This is considered computationally expensive.
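To see the cubic growth empirically, here is a rough timing experiment (my own illustration, not from the slides; absolute times are machine-dependent) using NumPy's dense solver, which is itself based on an LU factorization via LAPACK:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
for n in (500, 1000, 2000):
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    np.linalg.solve(A, b)               # LU-based dense solve, O(n^3)
    print(n, time.perf_counter() - t0)  # doubling n costs roughly 8x the time
```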
What about the back solve?

\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
0 & a_{22} & a_{23} & a_{24} & a_{25} \\
0 & 0 & a_{33} & a_{34} & a_{35} \\
0 & 0 & 0 & a_{44} & a_{45} \\
0 & 0 & 0 & 0 & a_{55}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \end{bmatrix}
\]

One "op" is a multiplication or divide; ignore subtractions for now; we will add them in momentarily.

\[
\begin{aligned}
x_5 &= b_5 / a_{55} &&\rightarrow 1 \text{ op} \\
x_4 &= (b_4 - a_{45}x_5)/a_{44} &&\rightarrow 2 \text{ ops} \\
x_3 &= (b_3 - a_{34}x_4 - a_{35}x_5)/a_{33} &&\rightarrow 3 \text{ ops} \\
x_2 &= (b_2 - a_{23}x_3 - a_{24}x_4 - a_{25}x_5)/a_{22} &&\rightarrow 4 \text{ ops} \\
x_1 &= (b_1 - a_{12}x_2 - a_{13}x_3 - a_{14}x_4 - a_{15}x_5)/a_{11} &&\rightarrow 5 \text{ ops}
\end{aligned}
\]
What about the cost of the back solve?

Summing the ops row by row gives 1 + 2 + ... + n = n(n+1)/2 ≈ n^2/2 multiplications and divides (and about the same number of subtractions), so the back solve is an O(n^2) process, much cheaper than the O(n^3) elimination.
The LU decomposition

If we have more than one right hand side (as is often the case),

\[
Ax = b_i, \qquad i = 1, 2, \ldots, M
\]

we can store the work involved in carrying out the elimination by saving the multipliers used in the row operations.
LU Decomposition

Recall the elimination above: the pivots are 5, 28/5, and -65/10, and the multipliers are -2/5, -7/5, and 9/14.
LU Decomposition

\[
U = \begin{bmatrix}
5 & -1 & 2 \\
0 & \frac{28}{5} & \frac{49}{5} \\
0 & 0 & -\frac{65}{10}
\end{bmatrix}
\]

Store the multipliers in a lower triangular matrix:

\[
L = \begin{bmatrix}
1 & 0 & 0 \\
-\frac{2}{5} & 1 & 0 \\
-\frac{7}{5} & \frac{9}{14} & 1
\end{bmatrix}
\]
LU Decomposition

The product LU is equal to A:

\[
\underbrace{\begin{bmatrix}
1 & 0 & 0 \\
-\frac{2}{5} & 1 & 0 \\
-\frac{7}{5} & \frac{9}{14} & 1
\end{bmatrix}}_{L}
\underbrace{\begin{bmatrix}
5 & -1 & 2 \\
0 & \frac{28}{5} & \frac{49}{5} \\
0 & 0 & -\frac{65}{10}
\end{bmatrix}}_{U}
=
\underbrace{\begin{bmatrix}
5 & -1 & 2 \\
-2 & 6 & 9 \\
-7 & 5 & -3
\end{bmatrix}}_{A}
\]

It does not cost us anything to store the multipliers. But by doing so, we can now solve many systems involving the matrix A.
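A quick NumPy check of this product (my own verification, not part of the slides):

```python
import numpy as np

L = np.array([[1,    0,    0],
              [-2/5, 1,    0],
              [-7/5, 9/14, 1]])
U = np.array([[5, -1,   2],
              [0, 28/5, 49/5],
              [0, 0,   -65/10]])
A = np.array([[5, -1, 2],
              [-2, 6, 9],
              [-7, 5, -3]])

print(np.allclose(L @ U, A))   # True: the stored multipliers reproduce A
```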
Solution procedure given LU = A

How can we solve a system Ax = b using the LU factorization?

Step 0: Factor A into LU    (row reduction)
Step 1: Solve Ly = b        (forward substitution)
Step 2: Solve Ux = y        (back substitution)

For each right hand side, we only need to do n^2 operations. The expensive part is forming the original LU decomposition.
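In SciPy (my choice of library here; in Matlab, which the slides reference, `lu` plays the same role) the factor-once, solve-many pattern looks like this sketch:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[5., -1., 2.],
              [-2., 6., 9.],
              [-7., 5., -3.]])

lu, piv = lu_factor(A)           # Step 0: O(n^3) factorization, done once

# Steps 1-2: O(n^2) forward/back substitution per right hand side
for b in (np.array([7., 0., 5.]), np.array([1., 1., 1.])):
    x = lu_solve((lu, piv), b)
    print(x)                     # the first b gives [ 3.  4. -2.]
```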
Cost of a matrix inverse

What if we solve using the matrix inverse A^{-1} to get x = A^{-1}b? To get a column c_j of the matrix A^{-1}, we solve

\[
A c_j = e_j
\]

for each column e_j of an identity matrix.

\[
\text{total cost} \approx \frac{2}{3}n^3 + 2n^3 = \frac{8}{3}n^3 \ \text{operations}
\]

It costs about 4 times as much to multiply by the inverse as it does to solve the linear system using Gaussian elimination. The cost of the matrix-vector multiply A^{-1}b is n^2.
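A rough empirical comparison (my own illustration; the exact ratio varies with machine and BLAS library):

```python
import time
import numpy as np

rng = np.random.default_rng(1)
n = 2000
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

t0 = time.perf_counter()
x1 = np.linalg.solve(A, b)      # Gaussian elimination (LU) solve
t1 = time.perf_counter()
x2 = np.linalg.inv(A) @ b       # form A^{-1} explicitly, then multiply
t2 = time.perf_counter()

print(t1 - t0, t2 - t1)         # the inverse route is several times slower
print(np.allclose(x1, x2))      # same answer, up to rounding
```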
Row exchanges

What if we start with a system whose matrix looks like:

\[
A = \begin{bmatrix}
0 & 0 & -\frac{65}{10} \\
0 & \frac{28}{5} & \frac{49}{5} \\
5 & -1 & 2
\end{bmatrix}
\]

The leading entry is zero, so it cannot serve as a pivot. All we need to do is exchange the rows of A, and do the decomposition on

\[
LU = PA
\]

where P is a permutation matrix, i.e.

\[
P = \begin{bmatrix}
0 & 0 & 1 \\
0 & 1 & 0 \\
1 & 0 & 0
\end{bmatrix}
\]
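SciPy's `lu` (my stand-in here for Matlab's `lu`) returns exactly these three factors; note that SciPy's convention is A = P L U, so its P is the transpose (inverse) of the P in LU = PA above:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 0.,  -6.5],   # -65/10
              [0., 5.6,  9.8],   #  28/5, 49/5
              [5., -1.,  2.]])

P, L, U = lu(A)                  # SciPy convention: A = P @ L @ U
print(np.allclose(P @ L @ U, A)) # True
print(U)                         # upper triangular; the rows were reordered
```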
Partial pivoting

We can also do row exchanges not just to avoid a zero pivot, but also to make the pivot as large as possible. This is called "partial pivoting": at each elimination step, find the largest pivot candidate in the entire column, and do a row exchange.

One can also do "full pivoting" by looking for the largest pivot in the entire matrix. But this is rarely done.
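Adding partial pivoting to the earlier `gauss_solve` sketch takes one extra step per column (again my own sketch, not the slides' code):

```python
import numpy as np

def gauss_solve_pp(A, b):
    """Gaussian elimination with partial pivoting + back-substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        # Partial pivoting: bring the largest |entry| in column k up to row k
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

# Works even when the leading entry of A is zero
A = np.array([[0., 0., -6.5], [0., 5.6, 9.8], [5., -1., 2.]])
print(gauss_solve_pp(A, np.array([1., 2., 3.])))
```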
Top 10 algorithms

The matrix decompositions are listed as one of the top 10 algorithms of the 20th century, and, in terms of widespread use, George Dantzig's simplex method is among the most successful algorithms of all time. From Barry A. Cipra, "The Best of the 20th Century: Editors Name Top 10 Algorithms," SIAM News, Volume 33, Number 4 (guest editors Jack Dongarra and Francis Sullivan compiled the list for the January/February 2000 issue of Computing in Science & Engineering):

1946: John von Neumann, Stan Ulam, and Nick Metropolis, all at the Los Alamos Scientific Laboratory, cook up the Metropolis algorithm, also known as the Monte Carlo method.

1947: George Dantzig, at the RAND Corporation, creates the simplex method for linear programming. In terms of widespread application, Dantzig's algorithm is one of the most successful of all time: linear programming dominates the world of industry, where economic survival depends on the ability to optimize within budgetary and other constraints. Although theoretically susceptible to exponential delays, the algorithm in practice is highly efficient.

1950: Magnus Hestenes, Eduard Stiefel, and Cornelius Lanczos, all from the Institute for Numerical Analysis at the National Bureau of Standards, initiate the development of Krylov subspace iteration methods. These algorithms address the seemingly simple task of solving equations of the form Ax = b; the hitch is that A is a huge n x n matrix, so that the algebraic answer x = b/A is not so easy to compute. Iterative methods lead to the study of Krylov subspaces, which are spanned by powers of a matrix applied to an initial "remainder" vector r_0 = b - Ax_0. Hestenes and Stiefel proposed the conjugate gradient method for systems that are both symmetric and positive definite; the current suite includes techniques for non-symmetric systems, with acronyms like GMRES and Bi-CGSTAB.

1951: Alston Householder of Oak Ridge National Laboratory formalizes the decompositional approach to matrix computations. The ability to factor matrices into triangular, diagonal, orthogonal, and other special forms has turned out to be extremely useful: it has enabled software developers to produce flexible and efficient matrix packages, and it facilitates the analysis of rounding errors, one of the big bugbears of numerical linear algebra. (In 1961, James Wilkinson of the National Physical Laboratory in London published a seminal paper in the Journal of the ACM, titled "Error Analysis of Direct Methods of Matrix Inversion," based on the LU decomposition of a matrix as a product of lower and upper triangular factors.)

1957: John Backus leads a team at IBM in developing the Fortran optimizing compiler, whose creation may rank as the single most important event in the history of computer programming.