Vector and Matrix Norms
Tom Lyche
Centre of Mathematics for Applications,
Department of Informatics,
University of Oslo
October 4, 2010
Vector Norms
Definition (Norm)
A norm on R^n (C^n) is a function ‖·‖ : R^n (C^n) → R that satisfies, for all x, y in R^n (C^n) and all a in R (C):

1. ‖x‖ ≥ 0, with equality if and only if x = 0. (positivity)
2. ‖ax‖ = |a| ‖x‖. (homogeneity)
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖. (subadditivity)

The triples (R^n, R, ‖·‖) and (C^n, C, ‖·‖) are examples of normed vector spaces, and inequality 3. is called the triangle inequality.
Inverse Triangle Inequality
  ‖x − y‖ ≥ | ‖x‖ − ‖y‖ |,   x, y ∈ C^n.

- We need to show ‖x − y‖ ≥ ‖x‖ − ‖y‖ and ‖x − y‖ ≥ ‖y‖ − ‖x‖.
- Since ‖x − y‖ = ‖y − x‖, the second inequality follows from the first.
- The first: ‖x‖ = ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖.
- The inverse triangle inequality shows that a norm is a continuous function C^n → R.
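The inequality is easy to check numerically. A minimal sketch with NumPy (the vectors x and y are arbitrary choices, and the Euclidean norm stands in for a general norm):

```python
import numpy as np

# Two arbitrary vectors in R^3
x = np.array([3.0, -1.0, 2.0])
y = np.array([1.0, 4.0, -2.0])

# np.linalg.norm is the Euclidean (2-)norm by default;
# the inverse triangle inequality holds for every vector norm
lhs = np.linalg.norm(x - y)                       # left side:  ‖x − y‖
rhs = abs(np.linalg.norm(x) - np.linalg.norm(y))  # right side: |‖x‖ − ‖y‖|
ok = bool(lhs >= rhs)
```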
p Norms
Define for p ≥ 1

  ‖x‖_p := ( Σ_{j=1}^n |x_j|^p )^{1/p},     (1)

  ‖x‖_∞ := max_{1≤j≤n} |x_j|.               (2)
The most important cases are:

1. ‖x‖_1 = Σ_{j=1}^n |x_j|, (the one-norm or l1-norm)
2. ‖x‖_2 = ( Σ_{j=1}^n |x_j|^2 )^{1/2}, (the two-norm, l2-norm, or Euclidean norm)
3. ‖x‖_∞ = max_{1≤j≤n} |x_j|, (the infinity-norm, l∞-norm, or max norm).
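All three norms are available directly through NumPy's `np.linalg.norm` (a small illustration; the vector is an arbitrary choice):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

one_norm = np.linalg.norm(x, 1)       # |3| + |-4| + |0| = 7
two_norm = np.linalg.norm(x, 2)       # sqrt(9 + 16) = 5
inf_norm = np.linalg.norm(x, np.inf)  # max(3, 4, 0) = 4
```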
Why index ∞?
  lim_{p→∞} ‖x‖_p = ‖x‖_∞   for all x ∈ C^n.

- For x ≠ 0 we write

    ‖x‖_p = ‖x‖_∞ ( Σ_{j=1}^n ( |x_j| / ‖x‖_∞ )^p )^{1/p}.

- Now each term in the sum is not greater than one, and at least one term is equal to one, so

    ‖x‖_∞ ≤ ‖x‖_p ≤ n^{1/p} ‖x‖_∞,   p ≥ 1.     (3)

- Since lim_{p→∞} n^{1/p} = 1 for any n ∈ N, we see that the claim follows from (3).
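The bounds in (3) can be checked numerically; the sketch below evaluates ‖x‖_p for increasing p (the test vector is an arbitrary choice) and verifies that it stays between ‖x‖_∞ and n^{1/p} ‖x‖_∞ while approaching ‖x‖_∞:

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0, -4.0])
n = x.size
inf_norm = np.linalg.norm(x, np.inf)   # = 4

ok = True
for p in [1, 2, 4, 8, 16, 32, 64]:
    p_norm = np.linalg.norm(x, p)
    # inequality (3): ‖x‖_∞ ≤ ‖x‖_p ≤ n^{1/p} ‖x‖_∞
    ok &= bool(inf_norm <= p_norm <= n ** (1.0 / p) * inf_norm)

# ‖x‖_p is already very close to ‖x‖_∞ at p = 64
gap = np.linalg.norm(x, 64) - inf_norm
```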
Hölder and Minkowski
- It is shown in an Appendix that the p-norms are norms on R^n and on C^n for any p with 1 ≤ p ≤ ∞.
- The triangle inequality ‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p is called Minkowski's inequality.
- To prove it one first establishes Hölder's inequality

    Σ_{j=1}^n |x_j y_j| ≤ ‖x‖_p ‖y‖_q,   1/p + 1/q = 1,   x, y ∈ C^n.     (4)

- The relation 1/p + 1/q = 1 means, for example, that if p = 1 then q = ∞, and if p = 2 then q = 2.
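Hölder's inequality (4) is likewise easy to test numerically, here for the pair p = 3, q = 3/2 (an arbitrary choice satisfying 1/p + 1/q = 1, with arbitrary test vectors):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([0.5, 4.0, -1.0])

p = 3.0
q = p / (p - 1.0)                 # conjugate exponent: 1/p + 1/q = 1

lhs = np.sum(np.abs(x * y))       # Σ |x_j y_j|
rhs = np.linalg.norm(x, p) * np.linalg.norm(y, q)  # ‖x‖_p ‖y‖_q
ok = bool(lhs <= rhs)
```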
Equivalent Norms
Definition
Two norms ‖·‖ and ‖·‖' on C^n are equivalent if there are positive constants m and M (depending only on n) such that for all vectors x ∈ C^n we have

  m ‖x‖ ≤ ‖x‖' ≤ M ‖x‖.

Example: ‖x‖_∞ ≤ ‖x‖_p ≤ n^{1/p} ‖x‖_∞,   p ≥ 1.     (5)
Equivalent Norms
The following result is proved in an Appendix.
Theorem
All vector norms on C^n are equivalent.
Matrix Norms
We consider matrix norms on (C^{m,n}, C). All results hold for (R^{m,n}, R).

Definition (Matrix Norms)
A function ‖·‖ : C^{m,n} → R is called a matrix norm on C^{m,n} if for all A, B ∈ C^{m,n} and all α ∈ C:

1. ‖A‖ ≥ 0, with equality if and only if A = 0. (positivity)
2. ‖αA‖ = |α| ‖A‖. (homogeneity)
3. ‖A + B‖ ≤ ‖A‖ + ‖B‖. (subadditivity)

A matrix norm is simply a vector norm on the finite-dimensional vector space (C^{m,n}, C) of m × n matrices.
Equivalent norms
Adapting some general results on vector norms to matrix norms gives

Theorem
1. All matrix norms are equivalent. Thus, if ‖·‖ and ‖·‖' are two matrix norms on C^{m,n}, then there are positive constants µ and M such that µ ‖A‖ ≤ ‖A‖' ≤ M ‖A‖ holds for all A ∈ C^{m,n}.
2. A matrix norm is a continuous function ‖·‖ : C^{m,n} → R.
The Frobenius Matrix Norm
For A ∈ C^{m,n} and unitary U, V:

- ‖A‖_F := ( Σ_{i=1}^m Σ_{j=1}^n |a_{ij}|^2 )^{1/2}.
- ‖AB‖_F ≤ ‖A‖_F ‖B‖_F. (submultiplicativity)
- ‖Ax‖_2 ≤ ‖A‖_F ‖x‖_2. (subordinance)
- ‖UAV‖_F = ‖A‖_F. (unitary invariance)
- ‖A‖_F = (σ_1^2 + · · · + σ_n^2)^{1/2}. (singular values)
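The listed properties can be spot-checked with NumPy. The sketch below uses random test matrices and unitary (orthogonal) factors obtained from QR factorizations; all data are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
x = rng.standard_normal(4)

def fro(M):
    return np.linalg.norm(M, 'fro')

submult = fro(A @ B) <= fro(A) * fro(B)            # ‖AB‖_F ≤ ‖A‖_F ‖B‖_F
subord = np.linalg.norm(A @ x) <= fro(A) * np.linalg.norm(x)  # ‖Ax‖_2 ≤ ‖A‖_F ‖x‖_2

U, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthogonal U
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthogonal V
unit_inv = np.isclose(fro(U @ A @ V), fro(A))      # unitary invariance

sigma = np.linalg.svd(A, compute_uv=False)
via_sv = np.isclose(fro(A), np.sqrt(np.sum(sigma ** 2)))  # ‖A‖_F from singular values
```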
The p matrix norm, 1 ≤ p ≤ ∞, A ∈ C^{m,n}

  ‖A‖_p := max_{x≠0} ‖Ax‖_p / ‖x‖_p.

- ‖A‖_p = max_{‖y‖_p=1} ‖Ay‖_p.
- The p-norms are matrix norms.
- ‖AB‖_p ≤ ‖A‖_p ‖B‖_p. (submultiplicativity)
- ‖Ax‖_p ≤ ‖A‖_p ‖x‖_p. (subordinance)
- ‖UAV‖_2 = ‖A‖_2. (unitary invariance of the 2-norm)
Explicit expressions
Theorem
For A ∈ C^{m,n} we have

  ‖A‖_1 = max_{1≤j≤n} Σ_{k=1}^m |a_{k,j}|,   (max column sum)
  ‖A‖_2 = σ_1,                               (largest singular value of A)     (6)
  ‖A‖_∞ = max_{1≤k≤m} Σ_{j=1}^n |a_{k,j}|.   (max row sum)

The quantity ‖A‖_2 is called the two-norm or the spectral norm of A. The explicit expression for the spectral norm follows from the minmax theorem for singular values.
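The explicit expressions in (6) agree with NumPy's built-in matrix norms (a sketch; the matrix is an arbitrary choice):

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [4.0, 5.0, -6.0]])

col_sum = np.max(np.sum(np.abs(A), axis=0))     # max column sum = ‖A‖_1
row_sum = np.max(np.sum(np.abs(A), axis=1))     # max row sum    = ‖A‖_∞
sigma1 = np.linalg.svd(A, compute_uv=False)[0]  # largest singular value = ‖A‖_2

ok1 = np.isclose(col_sum, np.linalg.norm(A, 1))
ok2 = np.isclose(row_sum, np.linalg.norm(A, np.inf))
ok3 = np.isclose(sigma1, np.linalg.norm(A, 2))
```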
Examples
For A := (1/15) [ 14 4 16 ; 2 22 13 ] we find
- ‖A‖_1 = 29/15,
- ‖A‖_2 = 2,
- ‖A‖_∞ = 37/15,
- ‖A‖_F = √5.

For A := [ 1 2 ; 3 4 ] we find
- ‖A‖_1 = 6,
- ‖A‖_2 ≈ 5.465,
- ‖A‖_∞ = 7,
- ‖A‖_F ≈ 5.4772.
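The first example can be verified numerically; the sketch below compares NumPy's results against the stated closed-form values:

```python
import numpy as np

A = np.array([[14.0, 4.0, 16.0],
              [2.0, 22.0, 13.0]]) / 15.0

one_n = np.linalg.norm(A, 1)        # expected 29/15 (max column sum)
two_n = np.linalg.norm(A, 2)        # expected 2 (largest singular value)
inf_n = np.linalg.norm(A, np.inf)   # expected 37/15 (max row sum)
fro_n = np.linalg.norm(A, 'fro')    # expected sqrt(5)
```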
The spectral norm in special cases
Theorem
Suppose A ∈ C^{n,n} has singular values σ_1 ≥ σ_2 ≥ · · · ≥ σ_n and eigenvalues |λ_1| ≥ |λ_2| ≥ · · · ≥ |λ_n|. Then

  ‖A‖_2 = σ_1 and ‖A^{-1}‖_2 = 1/σ_n,                       (7)
  ‖A‖_2 = |λ_1| and ‖A^{-1}‖_2 = 1/|λ_n|, if A is normal.   (8)

If A ∈ R^{n,n} is symmetric positive definite, then ‖A‖_2 = λ_1 and ‖A^{-1}‖_2 = 1/λ_n. For the norms of A^{-1} we assume, of course, that A is nonsingular.
Consistency
- A matrix norm is called consistent on C^{n,n} if

    ‖AB‖ ≤ ‖A‖ ‖B‖   (submultiplicativity)

  holds for all A, B ∈ C^{n,n}.
- A matrix norm is consistent if it is defined on C^{m,n} for all m, n ∈ N, and submultiplicativity holds for all matrices A, B for which the product AB is defined.
- The Frobenius norm is consistent.
- The p norm is consistent for 1 ≤ p ≤ ∞.
Subordinate Matrix Norm
Definition
- Suppose m, n ∈ N are given.
- Let ‖·‖_α on C^m and ‖·‖_β on C^n be vector norms, and let ‖·‖ be a matrix norm on C^{m,n}.
- We say that the matrix norm ‖·‖ is subordinate to the vector norms ‖·‖_α and ‖·‖_β if ‖Ax‖_α ≤ ‖A‖ ‖x‖_β for all A ∈ C^{m,n} and all x ∈ C^n.
- The Frobenius norm is subordinate to the Euclidean vector norm.
- The p matrix norm is subordinate to the p vector norm for 1 ≤ p ≤ ∞.
Operator Norm
Definition
Suppose m, n ∈ N are given and let ‖·‖_α be a vector norm on C^n and ‖·‖_β a vector norm on C^m. For A ∈ C^{m,n} we define

  ‖A‖ := ‖A‖_{α,β} := max_{x≠0} ‖Ax‖_β / ‖x‖_α.     (9)

We call this the (α, β) operator norm, the (α, β)-norm, or simply the α-norm if α = β.

- The p norms are operator norms.
- The Frobenius norm is not an operator norm.
Observations on Operator norms
- The operator norm is a matrix norm on C^{m,n}.
- It is consistent on C^{n,n}, and consistent if α = β and the α norm is defined for all m, n.
- ‖Ax‖_β ≤ ‖A‖ ‖x‖_α. (subordinance)
- ‖A‖_{α,β} = max_{x∉ker(A)} ‖Ax‖_β / ‖x‖_α = max_{‖x‖_α=1} ‖Ax‖_β.
- ‖A‖_{α,β} = ‖Ax*‖_β for some x* ∈ C^n with ‖x*‖_α = 1.
Perturbation of linear systems
- Consider the linear system

    x_1 + x_2 = 20
    x_1 + (1 − 10^{-16}) x_2 = 20 − 10^{-15}.

- The exact solution is x_1 = x_2 = 10.
- Suppose we replace the second equation by

    x_1 + (1 + 10^{-16}) x_2 = 20 − 10^{-15};

- the exact solution changes to x_1 = 30, x_2 = −10.
- A small change in one of the coefficients, from 1 − 10^{-16} to 1 + 10^{-16}, changed the exact solution by a large amount.
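The effect can be reproduced in exact rational arithmetic with Python's `fractions` module, solving each 2×2 system by elimination (a sketch; ordinary floating point would blur the 10^{-16} coefficient, and the helper `solve2` is introduced here only for illustration):

```python
from fractions import Fraction

d = Fraction(1, 10**16)

def solve2(a21, a22, b2):
    # Solve x1 + x2 = 20 together with a21*x1 + a22*x2 = b2 exactly.
    # Eliminating x1 gives (a22 - a21) * x2 = b2 - 20 * a21.
    x2 = (b2 - 20 * a21) / (a22 - a21)
    x1 = 20 - x2
    return x1, x2

x1, x2 = solve2(1, 1 - d, 20 - 10 * d)   # original system
y1, y2 = solve2(1, 1 + d, 20 - 10 * d)   # perturbed system
```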
Ill Conditioning
- A mathematical problem in which the solution is very sensitive to changes in the data is called ill-conditioned or sometimes ill-posed.
- Such problems are difficult to solve on a computer.
- If at all possible, the mathematical model should be changed to obtain a more well-conditioned or properly-posed problem.
Perturbations
- We consider what effect a small change (perturbation) in the data A, b has on the solution x of a linear system Ax = b.
- Suppose y solves (A + E)y = b + e, where E is a (small) n × n matrix and e a (small) vector.
- How large can y − x be?
- To measure this we use vector and matrix norms.
Conditions on the norms
- ‖·‖ will denote a vector norm on C^n and also a consistent matrix norm on C^{n,n} which in addition is subordinate to the vector norm.
- Thus for any A, B ∈ C^{n,n} and any x ∈ C^n we have ‖AB‖ ≤ ‖A‖ ‖B‖ and ‖Ax‖ ≤ ‖A‖ ‖x‖.
- This is satisfied if the matrix norm is the p norm or the Frobenius norm.
Absolute and relative error
- The difference ‖y − x‖ measures the absolute error in y as an approximation to x.
- ‖y − x‖/‖x‖ or ‖y − x‖/‖y‖ is a measure of the relative error.
Perturbation in the right hand side
Theorem
Suppose A ∈ C^{n,n} is invertible, b, e ∈ C^n, b ≠ 0, and Ax = b, Ay = b + e. Then

  (1/K(A)) ‖e‖/‖b‖ ≤ ‖y − x‖/‖x‖ ≤ K(A) ‖e‖/‖b‖,   K(A) = ‖A‖ ‖A^{-1}‖.     (10)

- Consider (10). ‖e‖/‖b‖ is a measure of the size of the perturbation e relative to the size of b. In the worst case, ‖y − x‖/‖x‖ can be K(A) = ‖A‖ ‖A^{-1}‖ times as large as ‖e‖/‖b‖.
Condition number
- K(A) is called the condition number with respect to inversion of a matrix, or just the condition number, if it is clear from the context that we are talking about solving linear systems.
- The condition number depends on the matrix A and on the norm used. If K(A) is large, A is called ill-conditioned (with respect to inversion).
- If K(A) is small, A is called well-conditioned (with respect to inversion).
Condition number properties
- Since ‖A‖ ‖A^{-1}‖ ≥ ‖AA^{-1}‖ = ‖I‖ ≥ 1 we always have K(A) ≥ 1.
- Since all matrix norms are equivalent, the dependence of K(A) on the norm chosen is less important than the dependence on A.
- Usually one chooses the spectral norm when discussing properties of the condition number, and the l1 and l∞ norms when one wishes to compute or estimate it.
The spectral norm condition number
- Suppose A has singular values σ_1 ≥ σ_2 ≥ · · · ≥ σ_n > 0 and eigenvalues |λ_1| ≥ |λ_2| ≥ · · · ≥ |λ_n| if A is square.
- K_2(A) = ‖A‖_2 ‖A^{-1}‖_2 = σ_1/σ_n.
- K_2(A) = ‖A‖_2 ‖A^{-1}‖_2 = |λ_1|/|λ_n|, if A is normal.
- K_2(A) = ‖A‖_2 ‖A^{-1}‖_2 = λ_1/λ_n, if A is real, symmetric, and positive definite.
- It follows that A is ill-conditioned with respect to inversion if and only if σ_1/σ_n is large, or |λ_1|/|λ_n| is large when A is normal.
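K_2(A) = σ_1/σ_n can be computed from the singular values. The sketch below uses a 4×4 Hilbert matrix, a standard textbook example of an ill-conditioned symmetric positive definite matrix (the choice of matrix is ours, not from the slides):

```python
import numpy as np

n = 4
# Hilbert matrix H[i, j] = 1/(i + j + 1): symmetric positive definite
i, j = np.indices((n, n))
H = 1.0 / (i + j + 1)

sigma = np.linalg.svd(H, compute_uv=False)  # sorted in decreasing order
K2 = sigma[0] / sigma[-1]                   # K_2(H) = σ_1/σ_n

# np.linalg.cond with p=2 computes the same quantity
same = np.isclose(K2, np.linalg.cond(H, 2))
```

Even at size 4 the condition number is of order 10^4, so roughly four digits of accuracy can be lost when solving Hx = b.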
The residual
Suppose we have computed an approximate solution y to Ax = b. The vector r(y) := Ay − b is called the residual vector, or just the residual. We can bound x − y in terms of r(y).

Theorem
Suppose A ∈ C^{n,n}, b ∈ C^n, A is nonsingular and b ≠ 0. Let r(y) = Ay − b for each y ∈ C^n. If Ax = b then

  (1/K(A)) ‖r(y)‖/‖b‖ ≤ ‖y − x‖/‖x‖ ≤ K(A) ‖r(y)‖/‖b‖.     (11)
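The two-sided bound (11) can be checked numerically (a sketch: the matrix A, the solution x, and the perturbed y are arbitrary choices, and K(A) is taken in the 2-norm):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
x = np.array([1.0, -1.0])
b = A @ x

y = x + np.array([1e-3, -2e-3])   # some approximate solution
r = A @ y - b                     # residual r(y) = Ay − b

K = np.linalg.cond(A, 2)          # K(A) = ‖A‖_2 ‖A^{-1}‖_2
rel_err = np.linalg.norm(y - x) / np.linalg.norm(x)
rel_res = np.linalg.norm(r) / np.linalg.norm(b)

lower = rel_res / K               # left side of (11)
upper = K * rel_res               # right side of (11)
ok = bool(lower <= rel_err <= upper)
```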
Discussion
- If A is well-conditioned, (11) says that ‖y − x‖/‖x‖ ≈ ‖r(y)‖/‖b‖.
- In other words, the accuracy in y is about the same order of magnitude as the residual as long as ‖b‖ ≈ 1.
- If A is ill-conditioned, anything can happen:
- the solution can be inaccurate even if the residual is small;
- we can have an accurate solution even if the residual is large.
The inverse of A + E
Theorem
Suppose A ∈ C^{n,n} is nonsingular and let ‖·‖ be a consistent matrix norm on C^{n,n}. If E ∈ C^{n,n} is so small that r := ‖A^{-1}E‖ < 1, then A + E is nonsingular and

  ‖(A + E)^{-1}‖ ≤ ‖A^{-1}‖ / (1 − r).     (12)

If r < 1/2, then

  ‖(A + E)^{-1} − A^{-1}‖ / ‖A^{-1}‖ ≤ 2 K(A) ‖E‖/‖A‖.     (13)
Proof
- We use that if B ∈ C^{n,n} and ‖B‖ < 1, then I − B is nonsingular and ‖(I − B)^{-1}‖ ≤ 1/(1 − ‖B‖).
- Since r < 1, the matrix I − B := I + A^{-1}E is nonsingular.
- Since (I − B)^{-1} A^{-1} (A + E) = I, we see that A + E is nonsingular with inverse (I − B)^{-1} A^{-1}.
- Hence ‖(A + E)^{-1}‖ ≤ ‖(I − B)^{-1}‖ ‖A^{-1}‖, and (12) follows.
- From the identity (A + E)^{-1} − A^{-1} = −A^{-1} E (A + E)^{-1} we obtain

    ‖(A + E)^{-1} − A^{-1}‖ ≤ ‖A^{-1}‖ ‖E‖ ‖(A + E)^{-1}‖ ≤ K(A) (‖E‖/‖A‖) ‖A^{-1}‖/(1 − r).

- Dividing by ‖A^{-1}‖ and setting r = 1/2 proves (13).
Perturbation in A
Theorem
Suppose A, E ∈ C^{n,n}, b ∈ C^n with A invertible and b ≠ 0. If r := ‖A^{-1}E‖ < 1/2 for some operator norm, then A + E is invertible. If Ax = b and (A + E)y = b, then

  ‖y − x‖/‖y‖ ≤ ‖A^{-1}E‖ ≤ K(A) ‖E‖/‖A‖,     (14)
  ‖y − x‖/‖x‖ ≤ 2 K(A) ‖E‖/‖A‖.               (15)
Proof
- A + E is invertible.
- (14) follows easily by taking norms in the equation x − y = A^{-1}Ey and dividing by ‖y‖.
- From the identity y − x = ((A + E)^{-1} − A^{-1}) A x we obtain ‖y − x‖ ≤ ‖(A + E)^{-1} − A^{-1}‖ ‖A‖ ‖x‖, and (15) follows.
Convergence in R^{m,n} and C^{m,n}
- Consider an infinite sequence of matrices {A_k} = A_0, A_1, A_2, . . . in C^{m,n}.
- {A_k} is said to converge to the limit A in C^{m,n} if each element sequence {A_k(ij)}_k converges to the corresponding element A(ij) for i = 1, . . . , m and j = 1, . . . , n.
- {A_k} is a Cauchy sequence if for each ε > 0 there is an integer N ∈ N such that for all k, l ≥ N and all i, j we have |A_k(ij) − A_l(ij)| ≤ ε.
- {A_k} is bounded if there is a constant M such that |A_k(ij)| ≤ M for all i, j, k.
More on Convergence
By stacking the columns of A into a vector in C^{mn} we obtain the analogues of the corresponding results for vector sequences:

- A sequence {A_k} in C^{m,n} converges to a matrix A ∈ C^{m,n} if and only if lim_{k→∞} ‖A_k − A‖ = 0 for any matrix norm ‖·‖.
- A sequence {A_k} in C^{m,n} is convergent if and only if it is a Cauchy sequence.
- Every bounded sequence {A_k} in C^{m,n} has a convergent subsequence.
The Spectral Radius
- ρ(A) := max_{λ∈σ(A)} |λ|.
- For any consistent matrix norm ‖·‖ on C^{n,n} and any A ∈ C^{n,n} we have ρ(A) ≤ ‖A‖.
- Proof: Let (λ, x) be an eigenpair for A and set X := [x, . . . , x] ∈ C^{n,n}.
- Then λX = AX, which implies |λ| ‖X‖ = ‖λX‖ = ‖AX‖ ≤ ‖A‖ ‖X‖.
- Since ‖X‖ ≠ 0 we obtain |λ| ≤ ‖A‖.
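A quick numerical check of ρ(A) ≤ ‖A‖ for several standard consistent norms (a sketch; the matrix is an arbitrary choice with complex eigenvalues):

```python
import numpy as np

A = np.array([[0.0, 2.0], [-3.0, 1.0]])

# spectral radius: largest eigenvalue modulus
rho = np.max(np.abs(np.linalg.eigvals(A)))

bounds = [np.linalg.norm(A, 1),       # max column sum
          np.linalg.norm(A, 2),       # spectral norm
          np.linalg.norm(A, np.inf),  # max row sum
          np.linalg.norm(A, 'fro')]   # Frobenius norm
ok = all(rho <= b for b in bounds)    # ρ(A) ≤ ‖A‖ for each consistent norm
```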
A Special Norm
Theorem
Let A ∈ C^{n,n} and ε > 0 be given. There is a consistent matrix norm ‖·‖' on C^{n,n} such that ρ(A) ≤ ‖A‖' ≤ ρ(A) + ε.
A Very Important Result
Theorem
For any A ∈ C^{n,n} we have

  lim_{k→∞} A^k = 0 ⇐⇒ ρ(A) < 1.

Proof:
- Suppose ρ(A) < 1.
- There is a consistent matrix norm ‖·‖ on C^{n,n} such that ‖A‖ < 1.
- But then ‖A^k‖ ≤ ‖A‖^k → 0 as k → ∞.
- Hence A^k → 0.
- The converse is easier.
Convergence can be slow
  A = [ 0.99 1 0 ; 0 0.99 1 ; 0 0 0.99 ],

  A^100 ≈ [ 0.4 37 1849 ; 0 0.4 37 ; 0 0 0.4 ],

  A^2000 ≈ [ 10^{-9} 4·10^{-6} 0.004 ; 0 10^{-9} 4·10^{-6} ; 0 0 10^{-9} ].
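The transient growth before the eventual decay can be observed directly with `np.linalg.matrix_power` (a sketch reproducing the example above):

```python
import numpy as np

# ρ(A) = 0.99 < 1, so A^k → 0, but not monotonically
A = np.array([[0.99, 1.0, 0.0],
              [0.0, 0.99, 1.0],
              [0.0, 0.0, 0.99]])

A100 = np.linalg.matrix_power(A, 100)
A2000 = np.linalg.matrix_power(A, 2000)

peak = np.max(np.abs(A100))    # entries grow to order 10^3 first ...
tail = np.max(np.abs(A2000))   # ... before the decay sets in
```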
More limits
For any submultiplicative matrix norm ‖·‖ on C^{n,n} and any A ∈ C^{n,n} we have

  lim_{k→∞} ‖A^k‖^{1/k} = ρ(A).     (16)
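Relation (16) can be illustrated with the same 3×3 matrix as in the previous example; the convergence of ‖A^k‖^{1/k} to ρ(A) = 0.99 is slow but visible (a sketch, using the ∞-norm):

```python
import numpy as np

A = np.array([[0.99, 1.0, 0.0],
              [0.0, 0.99, 1.0],
              [0.0, 0.0, 0.99]])
rho = np.max(np.abs(np.linalg.eigvals(A)))   # spectral radius = 0.99

def gelfand(k):
    # ‖A^k‖_∞^{1/k}, which tends to ρ(A) as k → ∞
    return np.linalg.norm(np.linalg.matrix_power(A, k), np.inf) ** (1.0 / k)

approx = [gelfand(k) for k in (10, 100, 2000)]   # decreasing toward 0.99
```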