Inclusion Theorems for Pseudospectra of Block Triangular Matrices

Michael Karow
Matheon, TU-Berlin
Outline.
• Pseudospectra
• The three definitions of the separation of two matrices
• Inclusion theorems for pseudospectra of block triangular matrices
Pseudospectra
[Figure: example pseudospectra in the complex plane; both axes run from −10 to 10.]
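The pseudospectrum of $A \in \mathbb{C}^{n\times n}$ to the perturbation level $\epsilon > 0$ is
$$\begin{aligned}
\Lambda_\epsilon(A) &:= \text{set of all eigenvalues of all matrices of the form } A+E, \text{ where } E \in \mathbb{C}^{n\times n},\ \|E\| \le \epsilon \\
&= \text{union of the spectra } \Lambda(A+E) \text{ where } E \in \mathbb{C}^{n\times n},\ \|E\| \le \epsilon \\
&= \Lambda(A) \cup \{\, z \in \mathbb{C}\setminus\Lambda(A) \mid \|(zI-A)^{-1}\|^{-1} \le \epsilon \,\}.
\end{aligned}$$
In this talk $\|\cdot\|$ denotes the spectral norm. Then
$$\Lambda_\epsilon(A) = \{\, z \in \mathbb{C} \mid \sigma_{\min}(zI-A) \le \epsilon \,\}.$$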
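The singular-value characterisation gives a direct membership test. The following is a minimal sketch, assuming NumPy; the function names and the Jordan-block example are illustrative and not taken from the talk.

```python
import numpy as np

def sigma_min(z, A):
    """Smallest singular value of zI - A (spectral norm setting)."""
    n = A.shape[0]
    return np.linalg.svd(z * np.eye(n) - A, compute_uv=False)[-1]

def in_pseudospectrum(z, A, eps):
    """True if z lies in the eps-pseudospectrum of A."""
    return sigma_min(z, A) <= eps

# Hypothetical example: a 2x2 Jordan block with eigenvalue 0.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
eps = 0.01
near = in_pseudospectrum(0.05, A, eps)   # |z| > eps, yet z is inside
far = in_pseudospectrum(0.5, A, eps)
```

For a non-normal matrix such as this one, points at distance larger than $\epsilon$ from the spectrum can still belong to $\Lambda_\epsilon(A)$.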
The definitions of separation
Separation of two matrices: Demmel’s definition
Pseudospectra of $L \in \mathbb{C}^{\ell\times\ell}$ (blue) and $M \in \mathbb{C}^{m\times m}$ (red):
[Figure, four frames: the $\epsilon$-pseudospectra of $L$ and $M$ for $\epsilon = 0.50$, $0.80$ and $1.19$; both axes run from −6 to 6. In the last frame $\epsilon = 1.19 = \operatorname{sep}^D_\lambda(L, M)$.]
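$$\operatorname{sep}^D_\lambda(L, M) = \min\{\, \epsilon \mid \Lambda_\epsilon(L) \cap \Lambda_\epsilon(M) \neq \emptyset \,\}
= \min_{z\in\mathbb{C}} \max\{ \sigma_{\min}(zI-L),\ \sigma_{\min}(zI-M) \}$$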
Separation of two matrices: Varah’s definition
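Pseudospectra of $L \in \mathbb{C}^{\ell\times\ell}$ (blue) and $M \in \mathbb{C}^{m\times m}$ (red):
[Figure: $\Lambda_{\epsilon_1}(L)$ with $\epsilon_1 = 1.5$ and $\Lambda_{\epsilon_2}(M)$ with $\epsilon_2 = 0.85$; both axes run from −6 to 6.]
$$\operatorname{sep}^V_\lambda(L, M) = \min\{\, \epsilon_1 + \epsilon_2 \mid \Lambda_{\epsilon_1}(L) \cap \Lambda_{\epsilon_2}(M) \neq \emptyset \,\}
= \min_{z\in\mathbb{C}} \left[ \sigma_{\min}(zI-L) + \sigma_{\min}(zI-M) \right]$$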
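Both min-formulas (Demmel's and Varah's) can be estimated by brute force on a grid. This is only a rough sketch, assuming NumPy; it is not the algorithm of Gu and Overton, the helper names are hypothetical, and a grid minimum can only overestimate the true minimum.

```python
import numpy as np

def smin_on_grid(A, Z):
    """sigma_min(zI - A) for every z in the complex array Z."""
    n = A.shape[0]
    shifted = Z[..., None, None] * np.eye(n) - A   # stack of zI - A
    return np.linalg.svd(shifted, compute_uv=False)[..., -1]

def separations_on_grid(L, M, re, im):
    """Grid estimates (upper bounds) of sep^D and sep^V."""
    Z = re[None, :] + 1j * im[:, None]
    sL, sM = smin_on_grid(L, Z), smin_on_grid(M, Z)
    sep_D = np.maximum(sL, sM).min()   # min_z max{...}
    sep_V = (sL + sM).min()            # min_z [... + ...]
    return sep_D, sep_V

# Hypothetical 1x1 example: L = 1, M = -1, so sep^D = 1 and sep^V = 2.
L = np.array([[1.0]])
M = np.array([[-1.0]])
axis = np.linspace(-2.0, 2.0, 81)
sep_D, sep_V = separations_on_grid(L, M, axis, axis)
```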
Separation of two matrices: Stewart’s definition
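The definition uses the Sylvester operator $Z \mapsto T(Z) = MZ - ZL$:
$$\operatorname{sep}(L, M) = \min_{|||Z|||=1} |||MZ - ZL|||.$$
Facts:
• $\operatorname{sep}(L, M) \neq 0$ iff $T$ nonsingular iff $\Lambda(L) \cap \Lambda(M) = \emptyset$
• $\operatorname{sep}(L, M) \le \operatorname{sep}^V_\lambda(L, M)$ if $|||\cdot|||$ is unitarily invariant.
Proof:
$$\begin{aligned}
\Lambda(L+E_1) \cap \Lambda(M+E_2) \neq \emptyset
\ \Rightarrow\ 0 &= \operatorname{sep}(L+E_1, M+E_2) \\
&= \min_{|||Z|||=1} |||(M+E_2)Z - Z(L+E_1)||| \\
&\ge \operatorname{sep}(L, M) - \|E_1\| - \|E_2\| \\
\Rightarrow\ \|E_1\| + \|E_2\| &\ge \operatorname{sep}(L, M)
\end{aligned}$$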
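For the Frobenius norm, Stewart's separation is the smallest singular value of the matrix that represents $T$ on the column-stacked $\operatorname{vec}(Z)$. A minimal sketch, assuming NumPy and the Frobenius norm (other unitarily invariant norms are not covered); the function name is hypothetical.

```python
import numpy as np

def sep_frobenius(L, M):
    """sep(L, M) w.r.t. the Frobenius norm.

    vec(M Z - Z L) = (I kron M - L^T kron I) vec(Z) for Z of size m x l.
    """
    l, m = L.shape[0], M.shape[0]
    K = np.kron(np.eye(l), M) - np.kron(L.T, np.eye(m))
    return np.linalg.svd(K, compute_uv=False)[-1]

# Hypothetical normal (diagonal) example: sep equals the eigenvalue gap.
L = np.diag([1.0, 2.0])
M = np.diag([5.0, 6.0])
s = sep_frobenius(L, M)

# A shared eigenvalue makes the Sylvester operator singular.
s_singular = sep_frobenius(np.diag([1.0, 2.0]), np.diag([2.0, 3.0]))
```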
Comparison of the separations
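Stewart’s definition: $\operatorname{sep}(L, M) = \min_{|||Z|||=1} |||MZ - ZL|||$
Varah’s definition: $\operatorname{sep}^V_\lambda(L, M) = \min\{\, \epsilon_1 + \epsilon_2 \mid \Lambda_{\epsilon_1}(L) \cap \Lambda_{\epsilon_2}(M) \neq \emptyset \,\}$
Demmel’s definition: $\operatorname{sep}^D_\lambda(L, M) = \min\{\, \epsilon \mid \Lambda_\epsilon(L) \cap \Lambda_\epsilon(M) \neq \emptyset \,\}$
Computation of $\operatorname{sep}^D_\lambda$ in [Gu, Overton, 2006]. We have
$$\operatorname{sep}(L, M) \le \operatorname{sep}^V_\lambda(L, M) \le 2 \operatorname{sep}^D_\lambda(L, M) \le \operatorname{dist}(\Lambda(L), \Lambda(M)).$$
Equality holds if $L$ and $M$ are both normal and $|||\cdot|||$ is the Frobenius norm.
Remark: For (scaled) Jordan blocks $L$, $M$:
$$\operatorname{sep}(L, M) \ll \operatorname{sep}^D_\lambda(L, M) \ll \operatorname{dist}(\Lambda(L), \Lambda(M)).$$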
Application:
Inclusion theorems for pseudospectra of
block triangular matrices
The Problem
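Let $A \in \mathbb{C}^{n\times n}$ be given in block Schur form:
$$A = U \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} U^*, \qquad U \text{ unitary}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
We always have
$$\Lambda_\epsilon(L) \cup \Lambda_\epsilon(M) \subseteq \Lambda_\epsilon(A).$$
Problem: Find a tight function $g$ of $\epsilon$ such that
$$\Lambda_\epsilon(A) \subseteq \Lambda_{g(\epsilon)\epsilon}(L) \cup \Lambda_{g(\epsilon)\epsilon}(M). \qquad (*)$$
Relevance:
If $\|E\| = \epsilon$ and the union in $(*)$ is disjoint then precisely $\dim L$ eigenvalues of $A + E$ are contained in $\Lambda_{g(\epsilon)\epsilon}(L)$. The others are contained in $\Lambda_{g(\epsilon)\epsilon}(M)$.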
Visualisation of the Problem
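Problem again: Find a tight function $g$ of $\epsilon$ such that
$$\Lambda_\epsilon\!\left( \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} \right) \subseteq \Lambda_{g(\epsilon)\epsilon}(L) \cup \Lambda_{g(\epsilon)\epsilon}(M).$$
[Figure: the following sets in the complex plane; both axes run from −6 to 6.]
grey region: $\Lambda_\epsilon\!\left( \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} \right)$
blue region: $\Lambda_\epsilon(L)$
red region: $\Lambda_\epsilon(M)$
blue curve: boundary of $\Lambda_{g(\epsilon)\epsilon}(L)$
red curve: boundary of $\Lambda_{g(\epsilon)\epsilon}(M)$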
Upper bounds in terms of C
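Let $A \in \mathbb{C}^{n\times n}$ be given in block Schur form:
$$A = U \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} U^*, \qquad U \text{ unitary}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
Then
$$\Lambda_\epsilon(A) \subseteq \Lambda_{g(\epsilon)\epsilon}(L) \cup \Lambda_{g(\epsilon)\epsilon}(M)$$
for
$$g(\epsilon) = \sqrt{1 + \frac{\|C\|}{\epsilon}} \qquad \text{(Grammont, Largillier, 2002)}$$
and for
$$g(\epsilon) = \frac{1}{2} + \sqrt{\frac{1}{4} + \frac{\|C\|}{\epsilon}} \qquad \text{(Bora, 2001)}$$
Good: Simple bounds which show that $\Lambda_\epsilon(A) \approx \Lambda_\epsilon(L) \cup \Lambda_\epsilon(M)$ for large $\epsilon$.
Bad: $g(\epsilon) \to \infty$ as $\epsilon \to 0$.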
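The "good" and "bad" behaviour of the two factors is easy to see numerically. A minimal sketch using only the Python standard library; the function names are hypothetical and `norm_C` stands for $\|C\|$.

```python
import math

def g_grammont_largillier(norm_C, eps):
    """g(eps) = sqrt(1 + ||C|| / eps)."""
    return math.sqrt(1.0 + norm_C / eps)

def g_bora(norm_C, eps):
    """g(eps) = 1/2 + sqrt(1/4 + ||C|| / eps)."""
    return 0.5 + math.sqrt(0.25 + norm_C / eps)

# Both factors tend to 1 for large eps and blow up as eps -> 0.
for eps in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    print(eps, g_grammont_largillier(1.0, eps), g_bora(1.0, eps))
```

Squaring both expressions shows that the first factor is never larger than the second.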
Proof of the Grammont-Largillier-bound
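Let
$$a_z := \max\{ \|(zI-L)^{-1}\|,\ \|(zI-M)^{-1}\| \}.$$
Then we have the following chain of inclusions and inequalities.
$$\begin{aligned}
z \in \Lambda_\epsilon(A) \ \Rightarrow\ \epsilon^{-1} &\le \|(zI-A)^{-1}\|
= \left\| \begin{bmatrix} (zI-L)^{-1} & (zI-L)^{-1} C (zI-M)^{-1} \\ 0 & (zI-M)^{-1} \end{bmatrix} \right\| \\
&\le \left\| \begin{bmatrix} a_z & a_z^2 \|C\| \\ 0 & a_z \end{bmatrix} \right\|
= a_z\, \frac{a_z\|C\| + \sqrt{(a_z\|C\|)^2 + 4}}{2}
\end{aligned}$$
$$\Rightarrow\ 2(\epsilon a_z)^{-1} - a_z\|C\| \le \sqrt{(a_z\|C\|)^2 + 4}$$
$$\Rightarrow\ \left(\epsilon \sqrt{1 + \|C\|/\epsilon}\right)^{-1} \le a_z$$
$$\Rightarrow\ z \in \Lambda_{\epsilon\sqrt{1+\|C\|/\epsilon}}(L) \cup \Lambda_{\epsilon\sqrt{1+\|C\|/\epsilon}}(M).$$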
Demmel’s bound (1983)
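Let $T$ be such that
$$T^{-1} \begin{bmatrix} L & 0 \\ 0 & M \end{bmatrix} T = \begin{bmatrix} L & C \\ 0 & M \end{bmatrix}.$$
Then the Bauer-Fike theorem yields
$$\Lambda_\epsilon\!\left( \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} \right) \subseteq \Lambda_{\|T\|\,\|T^{-1}\|\,\epsilon}(L) \cup \Lambda_{\|T\|\,\|T^{-1}\|\,\epsilon}(M).$$
Problem: Find such $T$ with smallest condition number $\|T\|\,\|T^{-1}\|$.
Solution: Let $R$ be such that $RM - LR = C$. Then
$$T = \begin{bmatrix} I & R/p \\ 0 & I/p \end{bmatrix}, \qquad p = \sqrt{1 + \|R\|^2},$$
has smallest possible condition number
$$\kappa := \|T\|\,\|T^{-1}\| = p + \|R\| = p + \sqrt{p^2 - 1} \le 2p.$$
Note: $\begin{bmatrix} L & C \\ 0 & M \end{bmatrix}$ has invariant subspaces
$$\operatorname{range} \begin{bmatrix} I \\ 0 \end{bmatrix}, \qquad \operatorname{range} \begin{bmatrix} R \\ I \end{bmatrix},$$
and $p$ is the norm of the associated spectral projector.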
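The construction can be checked numerically. A minimal sketch, assuming NumPy and SciPy; the matrices are made-up test data. Direct multiplication shows that the $T$ above satisfies $A\,T = T\operatorname{diag}(L, M)$, so that is the form the sketch verifies; the condition number is the same for $T$ and $T^{-1}$.

```python
import numpy as np
from scipy.linalg import solve_sylvester, block_diag

# Hypothetical data; any L, M with disjoint spectra would do.
L = np.array([[1.0, 2.0], [0.0, 2.0]])
M = np.array([[-1.0, 1.0], [0.0, -3.0]])
C = np.array([[1.0, 0.5], [-2.0, 1.0]])
l, m = L.shape[0], M.shape[0]

# R M - L R = C is (-L) R + R M = C in SciPy's A X + X B = Q form.
R = solve_sylvester(-L, M, C)
nR = np.linalg.norm(R, 2)
p = np.sqrt(1.0 + nR**2)

T = np.block([[np.eye(l), R / p],
              [np.zeros((m, l)), np.eye(m) / p]])
A = np.block([[L, C],
              [np.zeros((m, l)), M]])
D = block_diag(L, M)

kappa = np.linalg.cond(T, 2)   # spectral condition number ||T|| ||T^-1||
```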
Illustration: invariant subspaces of
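$$A = \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} = \begin{bmatrix} L & RM - LR \\ 0 & M \end{bmatrix}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
[Figure: the two invariant subspaces, spanned by $\begin{bmatrix} I \\ 0 \end{bmatrix}$ and $\begin{bmatrix} R \\ I \end{bmatrix}$, with a vector $x$ and its spectral projection $Px$.]
Invariant subspaces:
$$\operatorname{range} \begin{bmatrix} I \\ 0 \end{bmatrix}, \qquad \operatorname{range} \begin{bmatrix} R \\ I \end{bmatrix}.$$
Spectral projector:
$$P = \begin{bmatrix} I & -R \\ 0 & 0 \end{bmatrix}, \qquad p := \|P\| = \sqrt{1 + \|R\|^2}.$$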
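The stated properties of $P$ (idempotent, commuting with $A$, norm $\sqrt{1+\|R\|^2}$) can be confirmed on an example. A minimal sketch, assuming NumPy and SciPy, with hypothetical test matrices.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Hypothetical data with disjoint spectra {1, 2} and {-1, -3}.
L = np.array([[1.0, 2.0], [0.0, 2.0]])
M = np.array([[-1.0, 1.0], [0.0, -3.0]])
C = np.array([[1.0, 0.5], [-2.0, 1.0]])
l, m = L.shape[0], M.shape[0]

R = solve_sylvester(-L, M, C)          # R M - L R = C
A = np.block([[L, C], [np.zeros((m, l)), M]])
P = np.block([[np.eye(l), -R],
              [np.zeros((m, l)), np.zeros((m, m))]])

p = np.linalg.norm(P, 2)               # norm of the spectral projector
basis_M = np.vstack([R, np.eye(m)])    # spans the second invariant subspace
```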
Demmel’s result and the separation.
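Let $A \in \mathbb{C}^{n\times n}$ be given in block Schur form:
$$A = U \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} U^* = U \begin{bmatrix} L & RM - LR \\ 0 & M \end{bmatrix} U^*, \qquad U \text{ unitary}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
Let
$$\kappa = \|R\| + \sqrt{\|R\|^2 + 1} = \sqrt{p^2 - 1} + p.$$
Then for all $\epsilon \ge 0$,
$$\Lambda_\epsilon(A) \subseteq \Lambda_{\kappa\epsilon}(L) \cup \Lambda_{\kappa\epsilon}(M).$$
Moreover, if $\epsilon < \operatorname{sep}^D_\lambda(L, M)/\kappa$ then
$$\Lambda_{\kappa\epsilon}(L) \cap \Lambda_{\kappa\epsilon}(M) = \emptyset.$$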
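The inclusion is equivalent to the pointwise inequality $\min\{\sigma_{\min}(zI-L), \sigma_{\min}(zI-M)\} \le \kappa\,\sigma_{\min}(zI-A)$ for all $z$, which can be sampled on a grid. A minimal sketch, assuming NumPy and SciPy, with $U = I$ and made-up test matrices.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def smin_on_grid(A, Z):
    """sigma_min(zI - A) for every z in the complex array Z."""
    n = A.shape[0]
    return np.linalg.svd(Z[..., None, None] * np.eye(n) - A,
                         compute_uv=False)[..., -1]

# Hypothetical data (U = I).
L = np.array([[1.0, 2.0], [0.0, 2.0]])
M = np.array([[-1.0, 1.0], [0.0, -3.0]])
C = np.array([[1.0, 0.5], [-2.0, 1.0]])
A = np.block([[L, C], [np.zeros((2, 2)), M]])

R = solve_sylvester(-L, M, C)            # R M - L R = C
nR = np.linalg.norm(R, 2)
kappa = nR + np.sqrt(nR**2 + 1.0)

axis = np.linspace(-6.0, 6.0, 61)
Z = axis[None, :] + 1j * axis[:, None]
sA, sL, sM = (smin_on_grid(X, Z) for X in (A, L, M))

# If sigma_min(zI - A) <= eps, then z is in the kappa*eps set of L or M.
inclusion_holds = bool(np.all(np.minimum(sL, sM) <= kappa * sA + 1e-12))
```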
Corollary to Demmel’s result.
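If $L = \lambda I$ (i.e. $\lambda$ is a semisimple eigenvalue of $A$) then
$$\Lambda_\epsilon(A) \subseteq \Lambda_{\kappa\epsilon}(L) \cup \Lambda_{\kappa\epsilon}(M) = \underbrace{D_{\kappa\epsilon}(\lambda)}_{\text{disk of radius } \kappa\epsilon} \cup\ \Lambda_{\kappa\epsilon}(M),$$
where $\kappa = \|R\| + p = \sqrt{p^2 - 1} + p \approx 2p$ and $p = \sqrt{1 + \|R\|^2}$ is the norm of the spectral projector.
[Figure: illustration in the complex plane; both axes run from −10 to 10.]
Furthermore, if $\epsilon$ is small enough then $D_{\kappa\epsilon}(\lambda)$ contains only one connected component $C_\epsilon(\lambda)$ of $\Lambda_\epsilon(A)$. But we know that for small $\epsilon$
$$C_\epsilon(\lambda) \approx D_{p\epsilon}(\lambda)$$
since $p$ is the condition number of $\lambda$.
Question: Is Demmel’s bound too large (factor $\approx 2$)?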
Inclusion bound for small ǫ: Demmel’s separation
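Let $A \in \mathbb{C}^{n\times n}$ be given in block Schur form:
$$A = U \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} U^* = U \begin{bmatrix} L & RM - LR \\ 0 & M \end{bmatrix} U^*, \qquad U \text{ unitary}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
Let $s_D = \operatorname{sep}^D_\lambda(L, M)$, $\kappa = \|R\| + \sqrt{\|R\|^2 + 1} = \sqrt{p^2 - 1} + p$.
Then for $\epsilon \le s_D/\kappa$,
$$\Lambda_\epsilon(A) \subseteq \Lambda_{g_D(\epsilon)\epsilon}(L) \cup \Lambda_{g_D(\epsilon)\epsilon}(M),$$
where
$$g_D(\epsilon) = p + \frac{\|R\|^2\, \epsilon}{s_D - p\,\epsilon}.$$
[Figure: graph of $g_D(\epsilon)$ for $0 \le \epsilon \le 0.2$, with the levels $p$ and $p + \|R\| = \kappa$ marked on the vertical axis and $s_D/\kappa$ on the horizontal axis.]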
Inclusion bound for small ǫ: Varah’s separation
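Let $A \in \mathbb{C}^{n\times n}$ be given in block Schur form:
$$A = U \begin{bmatrix} L & C \\ 0 & M \end{bmatrix} U^* = U \begin{bmatrix} L & RM - LR \\ 0 & M \end{bmatrix} U^*, \qquad U \text{ unitary}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
Let $s_V = \operatorname{sep}^V_\lambda(L, M)$, $\kappa = \|R\| + \sqrt{\|R\|^2 + 1} = \sqrt{p^2 - 1} + p$.
Then for $\epsilon \le s_V/(2\kappa)$,
$$\Lambda_\epsilon(A) \subseteq \Lambda_{g_V(\epsilon)\epsilon}(L) \cup \Lambda_{g_V(\epsilon)\epsilon}(M),$$
where
$$g_V(\epsilon) = \frac{p - \epsilon/s_V}{\dfrac{1}{2} + \sqrt{\dfrac{1}{4} - \dfrac{\epsilon}{s_V}\left(p - \dfrac{\epsilon}{s_V}\right)}}.$$
[Figure: graph of $g_V(\epsilon)$ for $0 \le \epsilon \le 0.2$, with the levels $p$ and $p + \|R\| = \kappa$ marked on the vertical axis and $s_V/(2\kappa)$ on the horizontal axis.]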
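Both small-$\epsilon$ factors, $g_D$ and $g_V$, start at $p$ for $\epsilon = 0$ and reach $\kappa$ at the end of their range of validity. A minimal sketch using only the Python standard library; the function names and the sample values ($\|R\| = 0.75$, separations set to 1) are assumptions made for illustration.

```python
import math

def g_D(eps, p, s_D):
    """Factor based on Demmel's separation, for 0 <= eps <= s_D / kappa."""
    r2 = p * p - 1.0                       # ||R||^2
    return p + r2 * eps / (s_D - p * eps)

def g_V(eps, p, s_V):
    """Factor based on Varah's separation, for 0 <= eps <= s_V / (2 kappa)."""
    x = eps / s_V
    disc = max(0.25 - x * (p - x), 0.0)    # clip rounding noise at the endpoint
    return (p - x) / (0.5 + math.sqrt(disc))

# Hypothetical values: ||R|| = 0.75, hence p = 1.25 and kappa = p + ||R|| = 2.
p, kappa = 1.25, 2.0
s_D = s_V = 1.0
end_D = g_D(s_D / kappa, p, s_D)           # value at eps = s_D / kappa
end_V = g_V(s_V / (2 * kappa), p, s_V)     # value at eps = s_V / (2 kappa)
```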
The proof, part 1
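$$\begin{aligned}
\left\| \left( zI - \begin{bmatrix} L & RM - LR \\ 0 & M \end{bmatrix} \right)^{-1} \right\|
&= \left\| \begin{bmatrix} (zI-L)^{-1} & R(zI-M)^{-1} - (zI-L)^{-1}R \\ 0 & (zI-M)^{-1} \end{bmatrix} \right\| \\
&\le \left\| \begin{bmatrix} \|(zI-L)^{-1}\| & \|R\|\,\|(zI-M)^{-1}\| + \|(zI-L)^{-1}\|\,\|R\| \\ 0 & \|(zI-M)^{-1}\| \end{bmatrix} \right\| \\
&= \|(zI-L)^{-1}\|\ \underbrace{\left\| \begin{bmatrix} 1 & (t+1)\|R\| \\ 0 & t \end{bmatrix} \right\|}_{=:h(t)},
\end{aligned}$$
where
$$t = \frac{\|(zI-M)^{-1}\|}{\|(zI-L)^{-1}\|}, \qquad \text{w.l.o.g. } t \le 1 \ \Rightarrow\ h(t) \le \kappa.$$
Thus,
$$h(t)\, \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-L).$$
Analogously,
$$h(t^{-1})\, \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-M).$$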
The proof, part 2
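We have seen
$$h(t)\, \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-L)$$
and
$$h(t^{-1})\, \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-M).$$
Add these inequalities:
$$\left(h(t) + h(t^{-1})\right) \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-L) + \sigma_{\min}(zI-M) \ge \operatorname{sep}^V_\lambda(L, M) \ge s.$$
We have $h(t^{-1}) = t^{-1} h(t)$.
Hence, if $\sigma_{\min}(zI-A) \le \epsilon$ then
$$(1 + t^{-1})\, h(t) = h(t) + h(t^{-1}) \ge \frac{s}{\sigma_{\min}(zI-A)} \ge \frac{s}{\epsilon}.$$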
The proof, part 3
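We have shown that
$$(1 + t^{-1})\, h(t) \ge \frac{s}{\epsilon}. \qquad (*)$$
It can be verified by direct computation that
$$h(t) = \frac{1}{2}\left( (1+t)p + \sqrt{(1+t)^2 p^2 - 4t} \right) \quad \text{and} \quad (1 + t^{-1})\, h(t) = \frac{h(t)^2 - 1}{h(t) - p}, \qquad p = \sqrt{1 + \|R\|^2}.$$
Thus, $(*)$ is equivalent to the quadratic inequality
$$0 \le h(t)^2 - \frac{s}{\epsilon}\, h(t) + \frac{sp}{\epsilon} - 1.$$
Thus,
$$h(t) \le \underbrace{\frac{p - \epsilon/s}{\dfrac{1}{2} + \sqrt{\dfrac{1}{4} - \dfrac{\epsilon}{s}\left(p - \dfrac{\epsilon}{s}\right)}}}_{=:g(\epsilon)}
\qquad \text{or} \qquad
\underbrace{\frac{p - \epsilon/s}{\dfrac{1}{2} - \sqrt{\dfrac{1}{4} - \dfrac{\epsilon}{s}\left(p - \dfrac{\epsilon}{s}\right)}}}_{=:\hat g(\epsilon)} \le h(t).$$
The latter is impossible for $t \le 1$ and $\epsilon < s/(2\kappa)$ since then $\kappa < \hat g(\epsilon) \le h(t)$, but direct computation yields $h(t) \le h(1) = \kappa$. Thus,
$$g(\epsilon)\, \epsilon \ge h(t)\, \epsilon \ge h(t)\, \sigma_{\min}(zI-A) \ge \sigma_{\min}(zI-L). \qquad \Box$$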
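The "direct computations" used in the proof can be spot-checked numerically. A minimal sketch, assuming NumPy; `r` stands for $\|R\|$ and the sample values are arbitrary.

```python
import numpy as np

def h_norm(t, r):
    """h(t) as the spectral norm of [[1, (t + 1) r], [0, t]]."""
    return np.linalg.norm(np.array([[1.0, (t + 1.0) * r],
                                    [0.0, t]]), 2)

def h_closed(t, r):
    """Closed form h(t) = ((1+t)p + sqrt((1+t)^2 p^2 - 4t)) / 2."""
    p = np.sqrt(1.0 + r * r)
    return 0.5 * ((1 + t) * p + np.sqrt((1 + t)**2 * p**2 - 4 * t))

r = 0.75                      # hypothetical ||R||, so p = 1.25, kappa = 2
p = np.sqrt(1.0 + r * r)
for t in (0.2, 0.5, 1.0):
    h = h_norm(t, r)
    print(t, h, h_closed(t, r), (1 + 1 / t) * h, (h * h - 1) / (h - p))
```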
The 2 × 2 case
[Figure: four panels in the complex plane, each with both axes running from −1 to 1; the radius $g(\epsilon)\epsilon$ is marked.]
Let
$$A = \begin{bmatrix} \frac{s}{2} & c \\ 0 & -\frac{s}{2} \end{bmatrix} = \begin{bmatrix} \frac{s}{2} & sr \\ 0 & -\frac{s}{2} \end{bmatrix}, \qquad s > 0, \quad c, r \ge 0.$$
Then
$$s = \operatorname{sep}^V_\lambda(-s/2, s/2) = 2 \operatorname{sep}^D_\lambda(-s/2, s/2) = \operatorname{sep}(L, M).$$
The $1 \times 1$ pseudospectra of the eigenvalues $\pm s/2$ are disks:
$$\Lambda_\epsilon(\pm s/2) = D_\epsilon(\pm s/2) = \{\, z \in \mathbb{C} \mid |z \mp s/2| \le \epsilon \,\}.$$
We are looking for
$$g(\epsilon) = \min\{\, g \ge 0 \mid \Lambda_\epsilon(A) \subseteq D_{g\epsilon}(-s/2) \cup D_{g\epsilon}(s/2) \,\}.$$
Bounds are exact in the 2 × 2 case
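[Figure: graph of $g(\epsilon)$ for $0 \le \epsilon \le 1.2$, with curves labelled Demmel (level $\kappa = p + r$), Grammont, Largillier and K.; $p$ is marked on the vertical axis and $s/(2\kappa)$ on the horizontal axis.]
Let
$$A = \begin{bmatrix} \frac{s}{2} & c \\ 0 & -\frac{s}{2} \end{bmatrix} = \begin{bmatrix} \frac{s}{2} & sr \\ 0 & -\frac{s}{2} \end{bmatrix}, \qquad s > 0, \quad c, r \ge 0,$$
and let
$$g(\epsilon) = \min\{\, g \ge 0 \mid \Lambda_\epsilon(A) \subseteq D_{g\epsilon}(-s/2) \cup D_{g\epsilon}(s/2) \,\}.$$
Then we have ($p = \sqrt{1 + r^2}$, $\kappa = p + r$):
$$g(\epsilon) = \begin{cases}
\dfrac{p - \epsilon/s}{\dfrac{1}{2} + \sqrt{\dfrac{1}{4} - \dfrac{\epsilon}{s}\left(p - \dfrac{\epsilon}{s}\right)}} & \text{if } \epsilon \le s/(2\kappa) \quad \text{(K.)} \\[3ex]
\sqrt{1 + c/\epsilon} & \text{if } \epsilon \ge s/(2\kappa) \quad \text{(Grammont, Largillier)}
\end{cases}$$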
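The piecewise formula can be compared with a brute-force estimate of $g(\epsilon)$. A minimal sketch, assuming NumPy; the grid resolution, the helper names and the sample values $s = 1$, $r = 0.75$ are assumptions, and the grid estimate can only approach the exact value from below.

```python
import numpy as np

def g_exact_2x2(eps, s, r):
    """Piecewise formula for the 2x2 case, with c = s * r."""
    p = np.sqrt(1.0 + r * r)
    kappa = p + r
    if eps <= s / (2.0 * kappa):
        x = eps / s
        disc = max(0.25 - x * (p - x), 0.0)   # clip rounding noise
        return (p - x) / (0.5 + np.sqrt(disc))
    return np.sqrt(1.0 + s * r / eps)

def g_grid_2x2(eps, s, r, num=201):
    """Grid estimate: max over Lambda_eps(A) of min|z -/+ s/2|, over eps."""
    A = np.array([[s / 2, s * r], [0.0, -s / 2]])
    axis = np.linspace(-2.0, 2.0, num)
    Z = axis[None, :] + 1j * axis[:, None]
    smin = np.linalg.svd(Z[..., None, None] * np.eye(2) - A,
                         compute_uv=False)[..., -1]
    dist = np.minimum(abs(Z - s / 2), abs(Z + s / 2))
    return dist[smin <= eps].max() / eps

s, r = 1.0, 0.75                               # hypothetical: p = 1.25, kappa = 2
print(g_exact_2x2(0.5, s, r), g_grid_2x2(0.5, s, r))
```

At $\epsilon = s/(2\kappa)$ both branches give $\kappa$, since $1 + 2\kappa r = \kappa^2$.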
Literature:
1. J. M. Varah: On the separation of two matrices, SIAM J. Numer. Anal. 16, No. 2, 1979.
2. Grammont, Largillier: On $\epsilon$-spectra and stability radii, J. Comp. Appl. Math. 147, 2002.
3. J. W. Demmel: Computing Stable Eigendecompositions of Matrices, Lin. Alg. Appl. 79, 1986.
4. J. W. Demmel: The Condition Number of Equivalence Transformations that Block Diagonalize Matrix Pencils, SIAM J. Numer. Anal. 20, No. 3, 1983.
Thanks for listening
Application of Stewart’s separation:
perturbation bounds for invariant subspaces
Joint work with Daniel Kressner
Recall:
$$\operatorname{sep}(L, M) = \min_{|||Z|||=1} |||\underbrace{MZ - ZL}_{T(Z)}|||$$
Invariant subspaces and Riccati equations
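Let
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \in \mathbb{C}^{(\ell+m)\times(\ell+m)}, \qquad Z \in \mathbb{C}^{m\times\ell}.$$
Basic fact: $\operatorname{range} \begin{bmatrix} I \\ Z \end{bmatrix}$ is an $\ell$-dimensional invariant subspace of $A$ iff $Z$ satisfies the (nonsymmetric) Riccati equation
$$\underbrace{A_{21} + A_{22}Z - ZA_{11} - ZA_{12}Z}_{=:R(A,Z)} = 0,$$
since then
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} I \\ Z \end{bmatrix} = \begin{bmatrix} I \\ Z \end{bmatrix} (A_{11} + A_{12}Z).$$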
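The basic fact can be illustrated by building $Z$ from a known invariant subspace and evaluating the Riccati residual. A minimal sketch, assuming NumPy; the symmetric test matrix and the function name are hypothetical, and the construction needs the top $\ell \times \ell$ block of the basis to be invertible.

```python
import numpy as np

def riccati_residual(A, Z):
    """R(A, Z) = A21 + A22 Z - Z A11 - Z A12 Z for Z of size m x l."""
    m, l = Z.shape
    A11, A12 = A[:l, :l], A[:l, l:]
    A21, A22 = A[l:, :l], A[l:, l:]
    return A21 + A22 @ Z - Z @ A11 - Z @ A12 @ Z

# Hypothetical example: eigenvectors of the two smallest eigenvalues
# of a symmetric matrix span a 2-dimensional invariant subspace.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
l = 2
_, X = np.linalg.eigh(A)
X1, X2 = X[:l, :l], X[l:, :l]
Z = X2 @ np.linalg.inv(X1)            # range [I; Z] = range X[:, :l]

res = riccati_residual(A, Z)
V = np.vstack([np.eye(l), Z])
```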
On the following slides:
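• $$A = \underbrace{\begin{bmatrix} L & C \\ 0 & M \end{bmatrix}}_{A_0} + \underbrace{\begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}}_{E}, \qquad \Lambda(L) \cap \Lambda(M) = \emptyset.$$
• $E$ is a perturbation of $A_0$.
• The invariant subspace $\operatorname{range} \begin{bmatrix} I \\ Z \end{bmatrix}$ of $A_0 + E$ is a perturbation of the invariant subspace $\operatorname{range} \begin{bmatrix} I \\ 0 \end{bmatrix}$ of $A_0$, where $R(A_0 + E, Z) = 0$.
Problem: Bound for $\|Z\|$ (with $E$ as large as possible)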
Stewart’s bound for invariant subspace of
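$$A = \underbrace{\begin{bmatrix} L & C \\ 0 & M \end{bmatrix}}_{A_0} + \underbrace{\begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}}_{E} = \begin{bmatrix} L + E_{11} & E_{12} + C \\ E_{21} & M + E_{22} \end{bmatrix}.$$
Let $s_E = \operatorname{sep}(L + E_{11}, M + E_{22})$ w.r.t. $\|\cdot\|$ and suppose
$$\|E_{21}\|\, \|E_{12} + C\| < \frac{s_E^2}{4}.$$
Then $R(A_0 + E, Z) = 0$ has a unique solution $Z$, and
$$\|Z\| \le \frac{2\|E_{21}\|}{s_E + \sqrt{s_E^2 - 4\|E_{21}\|\, \|E_{12} + C\|}} \le \frac{2\|E_{21}\|}{s_E}.$$
Proof: Write the Riccati equation in fixed point form,
$$Z = T_E^{-1}(E_{21} - ZE_{12}Z), \qquad T_E(Z) = (M + E_{22})Z - Z(L + E_{11}),$$
and apply the contraction mapping theorem. We have $s_E = \|T_E^{-1}\|^{-1}$.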
New bound for invariant subspace of
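$$A = \underbrace{\begin{bmatrix} L & C \\ 0 & M \end{bmatrix}}_{A_0} + \underbrace{\begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}}_{E}.$$
Let $s = \operatorname{sep}(L, M)$ w.r.t. $\|\cdot\|$ and suppose
$$\|E\|\, (\|E\| + \|C\|) < \frac{s^2}{4}.$$
Then $R(A_0 + E, Z) = 0$ has a unique solution $Z$, and
$$\|Z\| \le \frac{2\|E\|}{s + \sqrt{s^2 - 4\|E\|\, (\|E\| + \|C\|)}} \le \frac{2\|E\|}{s}.$$
Proof: Write the Riccati equation in fixed point form,
$$Z = T^{-1}\!\left( [-Z \ \ I]\, E\, [I \ \ Z^\top]^\top \right), \qquad T(Z) = MZ - ZL,$$
and apply Brouwer’s fixed point theorem. We have $s = \|T^{-1}\|^{-1}$.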
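A fixed point iteration gives a quick numerical illustration of the bound. This is a sketch under stated assumptions, not the proof's argument: it uses NumPy and SciPy, made-up matrices, and the Frobenius norm for $\operatorname{sep}$, $\|E\|$, $\|C\|$ and $\|Z\|$ (chosen so that $s$ is easy to obtain). With $T(Z) = MZ - ZL$ and $R$ as defined above, $R(A_0 + E, Z) = 0$ rearranges to $T(Z) = ZCZ - [-Z\ I]\,E\,[I;\ Z]$, and the sketch iterates on that form.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Hypothetical data: diagonal L, M, so the Frobenius-norm sep is the
# smallest eigenvalue gap |mu - lambda| = 3.
L = np.diag([1.0, 2.0])
M = np.diag([5.0, 6.0])
C = np.array([[0.5, -0.5], [0.25, 0.5]])
E = 0.05 * np.array([[ 1.0, -1.0,  0.5, 0.0],
                     [ 0.0,  1.0, -0.5, 1.0],
                     [ 1.0,  0.5, -1.0, 0.0],
                     [-0.5,  1.0,  0.0, 1.0]])
l, m, s = 2, 2, 3.0
nE, nC = np.linalg.norm(E, 'fro'), np.linalg.norm(C, 'fro')
assert nE * (nE + nC) < s**2 / 4           # hypothesis of the bound

Z = np.zeros((m, l))
for _ in range(100):
    V = np.vstack([np.eye(l), Z])          # [I; Z]
    W = np.hstack([-Z, np.eye(m)])         # [-Z  I]
    rhs = Z @ C @ Z - W @ E @ V
    Z = solve_sylvester(M, -L, rhs)        # solves M Z - Z L = rhs

A = np.block([[L, C], [np.zeros((m, l)), M]]) + E
residual = np.hstack([-Z, np.eye(m)]) @ A @ np.vstack([np.eye(l), Z])
bound = 2 * nE / (s + np.sqrt(s**2 - 4 * nE * (nE + nC)))
print(np.linalg.norm(Z, 'fro'), bound)
```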
Block diagonal case
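$$A = \underbrace{\begin{bmatrix} L & 0 \\ 0 & M \end{bmatrix}}_{A_0} + \underbrace{\begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}}_{E}.$$
Let $s = \operatorname{sep}(L, M)$ w.r.t. $\|\cdot\|$ and suppose
$$\|E\| < \frac{s}{2}. \qquad (*)$$
Then $R(A_0 + E, Z) = 0$ has a unique solution $Z$, and
$$\|Z\| \le \frac{2\|E\|}{s + \sqrt{s^2 - 4\|E\|^2}} \le \frac{2\|E\|}{s}.$$
Open problem: Can condition $(*)$ be replaced by
$$\|E\| < \operatorname{sep}^D_\lambda(L, M)\ ?$$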
Open question extended
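Let
$$A = \underbrace{\begin{bmatrix} L & 0 \\ 0 & M \end{bmatrix}}_{A_0} + \underbrace{\begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}}_{E}.$$
Then $\Lambda_\epsilon(A_0) = \Lambda_\epsilon(L) \cup \Lambda_\epsilon(M)$.
[Figure: $\Lambda_\epsilon(L)$ (blue region) and $\Lambda_\epsilon(M)$ with the eigenvalues of $A_0 + E$ (white crosses); both axes run from −6 to 6.]
If $\|E\| = \epsilon < \operatorname{sep}^D_\lambda(L, M)$ then precisely $\dim L$ eigenvalues of $A_0 + E$ (white crosses) are contained in $\Lambda_\epsilon(L)$ (blue region).
Is the associated invariant subspace always of the form
$$\operatorname{range} \begin{bmatrix} I \\ Z \end{bmatrix} \quad \text{(graph subspace)}\ ?$$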
Thanks for listening