Two-sided Eigenvalue Algorithms for Modal Approximation

Master's thesis
submitted to the Faculty of Mathematics
at Chemnitz University of Technology

presented by: B.Sc. Patrick Kürschner
Supervisor: Prof. Dr. Peter Benner
Advisor: Dr. Michiel E. Hochstenbach

Chemnitz, June 14, 2010
ACKNOWLEDGEMENTS
My primary thanks go to my teacher and supervisor Prof. Dr. Peter Benner for helping
me write this thesis and guiding me during all these years as a student and scientific
research assistant. Without his supervision and the opportunity to work in his research
group, I probably would not have discovered numerical linear algebra, systems and
control theory and model order reduction as such extremely interesting fields of modern
mathematics.
Secondly, I thank my advisor Dr. Michiel E. Hochstenbach for the initial idea for the topic
of this thesis, all the advice and hints he gave me in the many inspiring discussions,
and of course for his hospitality during my stay in Eindhoven, which was sadly much
too short.
I am also very grateful to Dr. Joost Rommes for answering a lot of my questions in the
countless conversations that helped me get a deeper understanding of the investigated
methods.
Of course, many further thanks go to my friends and colleagues with whom I had the
pleasure to live and work, and who made the last years such an unforgettable
time. Unfortunately, I am not able to mention every single person here, but only a few.
I especially thank, for instance, my dear colleagues Dr. Jens Saak and Matthias Voigt
for the daily coffee breaks in our office involving many encouraging conversations.
I also want to thank Alexander Bernhardt and Gordon Schmidt for reading parts of this
work. Furthermore, I thank all my other friends, who probably only rarely caught
sight of me during the last weeks of work on this thesis. Finally, I am also
deeply grateful for the constant support my family gave me during my study.
Abstract
Large-scale linear time invariant (LTI) systems arise in many physical and technical
fields. An approximation of these large systems, e.g. with model order reduction techniques,
is crucial for a cost-efficient simulation.
In this thesis we focus on a model order reduction method based on modal approximation, where the LTI system is projected onto the left and right eigenspaces corresponding
to the dominant poles of the system. These dominant poles are related to the most dominant parts of the residue expansion of the transfer function and usually form a small
subset of the eigenvalues of the system matrices. The computation of these dominant
poles can be a formidable task, since they can lie anywhere inside the spectrum and the
corresponding left eigenvectors have to be approximated as well.
We investigate the subspace accelerated dominant pole algorithm and the two-sided
and alternating Jacobi-Davidson methods for this modal truncation approach. These
methods can be seen as subspace accelerated versions of certain Rayleigh quotient iterations. Several strategies that admit an efficient computation of several dominant poles
of single-input single-output (SISO) LTI systems are examined.
Since dominant poles can lie in the interior of the spectrum, we also discuss harmonic
subspace extraction approaches which might improve the convergence of the methods.
Extensions of the modal approximation approach and the applied eigenvalue solvers to
multi-input multi-output (MIMO) systems are also examined.
The discussed eigenvalue algorithms and the model order reduction approach will be
tested for several practically relevant LTI systems.
Contents

List of Figures
List of Tables
List of Algorithms

1 Introduction

2 Mathematical basics
  2.1 Eigenvalue problems
    2.1.1 The standard eigenvalue problem
    2.1.2 The generalized eigenvalue problem
    2.1.3 Quadratic and polynomial eigenvalue problems
    2.1.4 The singular value decomposition
  2.2 Methods for eigenvalue problems
  2.3 Systems and control theory
    2.3.1 Linear time invariant state-space systems
    2.3.2 Linear descriptor systems
    2.3.3 Second-order systems
  2.4 Model order reduction
    2.4.1 The common principle of model order reduction
    2.4.2 Modal approximation

3 Rayleigh Quotient Iterations
  3.1 The standard Rayleigh Quotient Iteration
  3.2 The two-sided Rayleigh Quotient Iteration
  3.3 The Dominant Pole Algorithm
  3.4 The Alternating Rayleigh Quotient Iteration
  3.5 The Half-Step Rayleigh Quotient Iteration
  3.6 Numerical example

4 Two-sided subspace accelerated eigenvalue methods
  4.1 The Subspace Accelerated Dominant Pole Algorithm
  4.2 The two-sided Jacobi-Davidson algorithm
    4.2.1 The new correction equations
    4.2.2 Computing more than one eigentriplet
    4.2.3 Inexact solution and preconditioning of the correction equations
  4.3 The Alternating Jacobi-Davidson algorithm
    4.3.1 An alternating subspace accelerated scheme
    4.3.2 Computing dominant poles
    4.3.3 Deflation, restarts and inexact solution of the correction equations

5 Further improvements and generalizations
  5.1 Harmonic subspace extraction
    5.1.1 One-sided harmonic subspace extraction
    5.1.2 Two-sided harmonic subspace extraction
  5.2 MIMO Systems
    5.2.1 Multivariable transfer functions
    5.2.2 The Subspace Accelerated MIMO Dominant Pole Algorithm
    5.2.3 Computation of MIMO dominant poles with 2-JD

6 Numerical examples

7 Summary and Outlook
  7.1 Conclusions
  7.2 Future research perspectives

Bibliography

Theses

Declaration of Authorship/Selbstständigkeitserklärung
List of Figures

1.1 Schematic overview of model order reduction (MOR).
2.1 (a) Bode plot of the transfer function of the CD player [8] SISO system of order n = 120 in a double logarithmic plot. (b) Sigma plot of the full 2 × 2 MIMO system.
2.2 3-D Bode plot of H(s), eigenvalues and dominant poles in the region [−2, 0] × i[0, 20] ⊂ C− of the New England test system [26] of order n = 66.
2.3 (a) Eigenvalues and 6 dominant poles in [−2, 0] × i[0, 10] ⊂ C. (b) Bode magnitude plot of the transfer function of the New England test system and imaginary parts of the dominant poles.
2.4 Bode plot of the original New England test system and reduced order models with p = 3 (k = 5 eigenvalues / states) and p = 6 (k = 11 eigenvalues / states) dominant poles according to (2.15).
3.1 (a) Convergence histories of RQI, 2-RQI, ARQI, DPA and HSRQI for the New England test system. (b) The same as (a), but for the PEEC patch antenna model [8].
4.1 Bode magnitude plot of the original transfer function of the FOM model [8] and modal equivalents Hi where the dominant pole pi for i = 1, 2, 3 is deflated. The dominant poles are p1 = −1 ± 100i, p2 = −1 ± 200i and p3 = −1 ± 400i. H4 shows the result when all three poles are removed. The vertical dashed lines mark Im(pj) for j = 1, 2, 3.
6.1 Convergence histories for 2-JD, SA2RQI and SADPA for the PEEC model [8] (n = 480) with bi-E-orthogonal (a) and orthogonal (b) search spaces.
6.2 (a) Bode plot and (b) relative error of the original PEEC model and the modal equivalents of order k = 80 obtained directly with the QZ algorithm. Figures (c) and (d) show the results obtained with 2-JD.
6.3 Convergence histories for 2-JD, SA2RQI and SADPA for the BIPS model (n = 13,251). All linear systems were solved exactly.
6.4 (a) Bode plot and (b) relative error of the original BIPS model and the reduced order models (r.o.m.) of order k = 100 obtained with 2-JD, SA2RQI and SADPA.
6.5 (a) Convergence histories for 2-JD, SA2RQI and SADPA for the BIPS system (n = 13,251). All linear systems were solved with 10 steps of GMRES and LU = iE − A as fixed preconditioner. (b) The same as (a), but the preconditioner is updated after a triplet has been detected and after a restart.
6.6 Convergence histories for SAARQI and AJD for the clamped beam model (n = 348). All linear systems were solved exactly using LU decompositions.
6.7 (a) Bode plot and (b) relative error of the original beam model and the k = 14 modal equivalents obtained with AJD and SAARQI.
6.8 Convergence histories for 2-JD with standard, generalized two-sided harmonic (gen. 2-harm.) and double one-sided harmonic Petrov-Galerkin (double 1-harm.) extraction for the BIPS model and τ = i, γ = 0.95. All linear systems were solved with 10 steps of GMRES and LU = τE − A.
6.9 Sigma plot of the complete model of the ISS system 3 × 3 transfer function and k = 40 modal equivalents computed with 2-JD.
6.10 Sigma plot of the complete model of the ISS system 3 × 3 transfer function and k = 40 modal equivalents computed with SAMDP.
6.11 Sigma plot of the complete model of the BIPS 8 × 8 transfer function and k = 251 modal equivalents computed with 2-JD (a) and SAMDP (b).
List of Tables

2.1 Dominant poles and corresponding scaled residues of the New England test system.
6.1 Excerpt of the found poles and corresponding residues of the BIPS system for the three methods. Iteration numbers marked with brackets represent poles that were found after the first 50 iterations, while a minus sign indicates that the pole was not found by the particular method.
6.2 Summary of the found poles and corresponding residues of the BIPS system using different subspace extractions. A minus sign indicates that the pole was not found.
List of Algorithms

3.1 Rayleigh quotient iteration (RQI)
3.2 Two-sided Rayleigh quotient iteration (2-RQI)
3.3 Dominant Pole Algorithm (DPA)
3.4 Alternating Rayleigh quotient iteration (ARQI)
3.5 Half-step Rayleigh quotient iteration (HSRQI)
4.1 Subspace Accelerated Dominant Pole Algorithm (SADPA)
4.2 (Λ̃, Q, Z) = Sort(S, T, b, c)
4.3 Basic bi-E-orthogonal two-sided Jacobi-Davidson algorithm
4.4 Efficient exact solution of the correction equations of Algorithm 4.3 (bi-E-orthogonal 2-JD)
4.5 Bi-E-orthogonal two-sided Jacobi-Davidson algorithm for dominant pole computation
4.6 Alternating Jacobi-Davidson method
5.1 (Λ̃, Q, Z) = SortHarm(S1, S2, T1, T2, b, c, V, W, τ, γ)
5.2 (Λ̃, Q, Z) = SortM(S, T, B, C, V, W)
1 Introduction
Dynamical systems governed by differential equations are among the most widely used
tools to describe technical and physical phenomena. The simulation of these phenomena
via the solution of the underlying equations reveals insights into the dynamical behavior
of the system and is a cornerstone of the production cycle of modern technical devices,
since it is common practice to simulate a device on the computer before actually realizing
it. In the last decades, however, the size of these dynamical systems has increased
drastically: on the one hand because one is nowadays interested in a more realistic
description of the phenomena involved, so that more details have to be considered, and
on the other hand simply due to the significantly increased complexity of the systems.
Electrical circuits provide a good example to illustrate this issue. An electrical circuit
consists of circuit elements which are, for instance, resistors, inductors, capacitors and
transistors, and is usually described by Kirchhoff’s laws and the characteristic equations
of the circuit elements [15, 35]. This leads to systems of nonlinear differential-algebraic
equations [24]. However, a modern integrated circuit contains a huge number of circuit
elements packed densely into a small space. To guarantee a physically realistic description
of such a circuit, further effects arising from the electromagnetic field or even from heat conduction may have to be taken into account as well.
Altogether the result is a very large-scale system of circuit equations where the number
of unknowns can easily exceed one million. Other fields of applications where such
large-scale dynamical systems arise are, e.g., vibration analysis of mechanical structures,
chemical and biological engineering, and power supply networks.
The computational effort for the simulation of these large systems is therefore very high
and can even be beyond the capabilities of modern high-end computers. In the last
decades this has led to an increased focus on model order reduction, a research
area that addresses the approximation of dynamical systems. Model order reduction
is schematically illustrated in Figure 1.1 using linear time invariant dynamical systems
as an example.

Figure 1.1: Schematic overview of model order reduction (MOR). The original system E ẋ(t) = A x(t) + B u(t), y(t) = C x(t) is reduced to a system Ẽ x̃˙(t) = Ã x̃(t) + B̃ u(t), ỹ(t) = C̃ x̃(t) with much smaller matrices.

The dark green squares and rectangles represent the system matrices and
their size stands for the dimension of the matrices and likewise for the order of the
system. The large arrow in the middle represents the actual model order reduction,
which reduces the size of the matrices and hence the order of the system. In this sense
the goal of model order reduction is to obtain a so called reduced order model (r.o.m.)
of strongly decreased size which is much easier to simulate than the large-scale original
model. This is usually achieved by determining the dominant parts of the original
system, which have a significant contribution to the system dynamics, and neglecting
the less important segments which normally outnumber the dominant parts. Of course,
this reduced order model should be accurate enough to reflect the dynamic behavior of
the original model adequately. There exist several model order reduction techniques for
this purpose, for instance balanced truncation and Krylov subspace projection methods
[2, 5, 14, 28].
In this thesis we investigate another technique for linear time invariant systems, namely
modal approximation (or modal truncation). Thereby the original system is projected
onto the right and left eigenspaces associated with the dominant poles of the system.
Dominant poles are eigenvalues of the system matrices that have a significant contribution to the dynamical behavior of the system. Thus, modal truncation boils down to the
computation of a number of eigentriplets of the corresponding large-scale eigenvalue
problem. Algorithms for this often formidable task and their application within modal
approximation are the essential topics of this thesis.
The remainder of this thesis is structured as follows. In Chapter 2 we review the
necessary fundamentals of eigenvalue theory and systems and control theory, and
we investigate modal approximation as model order reduction method. Since we
have to compute eigenvalues and eigenvectors for this strategy, we discuss some basic
methods for eigenvalue computations based on the Rayleigh quotient in Chapter 3.
Chapter 4 represents the main part of this thesis and deals with some improved versions
of the basic iterations mentioned before which admit the computation of a number
of eigentriplets of the system matrices and can hence directly be used to generate
reduced order models. Our main interest in this context are Jacobi-Davidson style
eigenvalue solvers. Chapter 5 introduces some generalizations of these methods from
both a numerical and an application-oriented point of view. Numerical experiments
of the eigenvalue computation and the intrinsic modal approximation are presented in
Chapter 6. Finally, Chapter 7 gives conclusions and some future research perspectives.
2 Mathematical basics
2.1 Eigenvalue problems
This section briefly reviews some concepts from eigenvalue theory which are necessary
for the following sections and chapters of this thesis. More details and information can
be found, for instance, in the textbooks [13, 33, 59].
2.1.1 The standard eigenvalue problem
For a matrix A ∈ Cn×n the standard eigenvalue problem is defined as

Ax = λx,   x ≠ 0,
with unknowns λ ∈ C and x ∈ Cn . The scalar λ is a root of the characteristic polynomial
of A, that is pA (λ) := det(A − λI) = 0, and is called eigenvalue. The nonzero vector x is a
(right) eigenvector for λ. We refer to a pair (λ, x) as eigenpair of A. Similarly, a nonzero
vector y ∈ Cn for which y∗ A = λy∗ holds is a left eigenvector for λ. Together with λ and x
this forms an eigentriplet (λ, x, y) of A. The set
Λ(A) = {λ ∈ C : det(A − λI) = 0}
contains all eigenvalues of A and is referred to as spectrum of A. The multiplicity of a root
of det(A − λI) is called algebraic multiplicity and is denoted by α(λ). The corresponding
geometric multiplicity β(λ) of the eigenvalue λ is the dimension of the null space of A − λI.
If A has s ≤ n distinct eigenvalues or equivalently, det(A − λI) has s ≤ n distinct roots,
denoted by λ j , and α(λ j ) = β(λ j ) holds for all j = 1, . . . , s, then the matrix is diagonalizable
(or nondefective). In this case there exist n linearly independent eigenvectors x1 , . . . , xn
such that X−1 AX = diag(λ1 , . . . , λn ) with a nonsingular matrix X ∈ Cn×n which has the
right eigenvectors as columns. The corresponding left eigenvectors y1 , . . . , yn satisfy
y∗i A = λi y∗i
and can (in the nondefective case) be scaled such that
y∗i x j = δi j , i, j = 1, . . . , n.
However, if xi, yi are scaled such that ‖xi‖2 = ‖yi‖2 = 1 and if α(λi) = 1, then

κ(λi) := 1 / |y∗i xi|

defines the condition number of the simple eigenvalue λi. The left eigenvectors of A are
also the rows of X−1 and therefore it is possible to write Y∗ AX = diag(λ1 , . . . , λn ) and
Y∗ X = I. The columns of the nonsingular matrix Y ∈ Cn×n are then the left eigenvectors.
Symmetric matrices A = AT ∈ Rn×n and Hermitian matrices A = A∗ ∈ Cn×n have
only real eigenvalues and the eigenvectors form a complete orthonormal basis. This
property of the eigenvectors is also shared by normal matrices, that is matrices for
which AA∗ = A∗ A holds. Hence, a right eigenvector for an eigenvalue of a symmetric,
Hermitian or normal matrix is also a left eigenvector for the same eigenvalue. A matrix
that is not diagonalizable is called defective. In this case there are defective eigenvalues
with α(λ) > β(λ).
For general square matrices there exists a Schur decomposition such that
Q∗ AQ = T = diag(λ1 , . . . , λn ) + N
with a unitary matrix Q ∈ Cn×n and a strictly upper triangular nilpotent matrix N. A
decomposition that reveals the algebraic and geometric multiplicities of the eigenvalues,
too, is the Jordan decomposition or Jordan canonical form
X−1 AX = diag(J1 , . . . , Js ).
Each Jordan block Ji ∈ Cmi×mi is an upper bidiagonal matrix with a single eigenvalue λi on
its diagonal and ones along the first superdiagonal, and it holds m1 + . . . + ms = n. The
Jordan blocks with mi > 1 correspond to defective eigenvalues.
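To make these quantities concrete, a minimal NumPy/SciPy sketch (using an arbitrary random matrix as a stand-in example, not one of the systems considered later) computes eigenvalues together with right and left eigenvectors and evaluates the condition numbers κ(λi) = 1/|y∗i xi|:

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# lam: eigenvalues; Y, X: left and right eigenvectors as columns, normalized to unit 2-norm,
# so that y_i^* A = lam_i y_i^*  and  A x_i = lam_i x_i
lam, Y, X = linalg.eig(A, left=True, right=True)

# condition numbers kappa(lam_i) = 1 / |y_i^* x_i| of the (generically simple) eigenvalues
kappa = 1.0 / np.abs(np.sum(Y.conj() * X, axis=0))
for li, ki in zip(lam, kappa):
    print(li, ki)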
2.1.2 The generalized eigenvalue problem
The problem to find scalars λ and nonzero vectors x that satisfy
Ax = λEx,
where A, E ∈ Cn×n , is called generalized eigenvalue problem. Analogous to the standard
eigenvalue problem, the scalar λ is a root of det(A − λE) and is referred to as generalized
eigenvalue with corresponding (right) eigenvector x. The pair (λ, x) is now called
generalized eigenpair of the matrix pair (A, E) or of the pencil A − λE. A nonzero vector
y ∈ Cn that fulfills
y∗ A = λy∗ E
is then a left eigenvector of (A, E) or, altogether, (λ, x, y) is a (generalized) eigentriplet of
the pair (A, E). As for the standard eigenvalue problem, the set of all eigenvalues of
the pair (A, E) is called the spectrum of (A, E) and is denoted by Λ(A, E). It is possible
to find for general matrices A, E unitary matrices Q, Z ∈ Cn×n which simultaneously
triangularize A and E to a generalized Schur decomposition:
Q∗ AZ = T, Q∗ EZ = S.
The matrices S, T are upper triangular and their diagonal entries reveal the eigenvalues
of (A, E) by λi = tii/sii if sii ≠ 0. If sii = 0, there is an eigenvalue λi = ∞, and if sii = tii = 0
for some i, then Λ(A, E) = C. In this case the pair (A, E) is called singular and in
all former cases it is called regular. If there are n linearly independent right and left
eigenvectors xi and yi , then the pair (A, E) is called diagonalizable or nondefective.
Since in this case it holds y∗i Exj = 0 for i ≠ j, it is possible to write
Y∗ AX = ΛA , Y∗ EX = ΛE ,
with Y = [y1, . . . , yn], X = [x1, . . . , xn] ∈ Cn×n and diagonal matrices ΛA and ΛE. Additionally, if ΛE is nonsingular, it follows that ΛA ΛE^{−1} =: Λ = diag(λ1, . . . , λn). Right and
left eigenvectors corresponding to finite eigenvalues of a nondefective pair (A, E) can
be scaled so that
y∗i Axi = λi and y∗i Ex j = δi j
holds. An important special case is A = A∗ and E = E∗ > 0. Then Λ(A, E) ⊂ R and there
exists a nonsingular eigenvector matrix X ∈ Cn×n such that X∗ AX = diag(λ1 , . . . , λn ) and
X∗ EX = I. Here the columns xi of X are both right and left eigenvectors corresponding to
the eigenvalue λi and they are orthogonal with respect to the inner product induced by E
or, in other words, bi-E-orthogonal. If A or E is nonsingular, the generalized eigenvalue
problem can be transformed into a standard eigenproblem by multiplying with A−1 or
E−1 , respectively. However, in most cases this is not reasonable from a numerical point
of view. For arbitrary pairs (A, E) there exist, in analogy to the Jordan decomposition for
the standard eigenvalue problem, the Weierstrass and Weierstrass-Schur decompositions.
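As a small numerical illustration (random dense matrices only, purely for demonstration), generalized eigentriplets and the scaling y∗i Exi = 1 can be obtained with SciPy as follows:

import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))       # generically nonsingular, so all eigenvalues are finite

# A x_i = lam_i E x_i  and  y_i^* A = lam_i y_i^* E
lam, Y, X = linalg.eig(A, E, left=True, right=True)

# rescale the right eigenvectors such that y_i^* E x_i = 1
s = np.einsum("ij,jk,ki->i", Y.conj().T, E, X)
X = X / s

print(np.max(np.abs(Y.conj().T @ A @ X - np.diag(lam))))   # ~ 0:  Y* A X = diag(lam_i)
print(np.max(np.abs(Y.conj().T @ E @ X - np.eye(n))))      # ~ 0:  Y* E X = I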
2.1.3 Quadratic and polynomial eigenvalue problems
Another generalization of the standard eigenproblem is the quadratic eigenvalue problem
of the form
(λ2 M + λL + K)x = 0,   x ≠ 0,
where M, L, K ∈ Cn×n . Eigenvalues and right and left eigenvectors are defined in the
same way as for the standard or generalized eigenproblem. The next generalization is
the polynomial eigenvalue problem

\left( \sum_{i=0}^{p} \lambda^i A_i \right) x = 0, \qquad x \neq 0,
with Ai ∈ Cn×n . Polynomial eigenproblems have np eigenvalues and up to np right
and left eigenvectors which implies that, if there are more than n eigenvectors, they
are not linearly independent. A usual approach to deal with quadratic and polynomial
problems is to reformulate them as equivalent generalized or standard eigenproblem.
For instance, the quadratic eigenproblem can be rewritten as L(λ)z = 0, where

L(\lambda) := \lambda \begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix} + \begin{bmatrix} L & K \\ -I & 0 \end{bmatrix} \quad \text{and} \quad z := \begin{bmatrix} \lambda x \\ x \end{bmatrix}.
We refer to [54] for more details and a nice collection of examples of quadratic eigenproblems.
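A short SciPy sketch of this linearization approach (with random coefficient matrices purely for illustration): the quadratic problem is recast as a generalized eigenproblem for the block pencil built from M, L, K, and each computed λ indeed renders λ²M + λL + K singular.

import numpy as np
from scipy import linalg

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
L = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))

I = np.eye(n)
Z = np.zeros((n, n))
# L(lam) z = 0 with z = [lam*x; x] becomes the generalized problem  A_pen z = lam E_pen z
A_pen = -np.block([[L, K], [-I, Z]])
E_pen = np.block([[M, Z], [Z, I]])

lam = linalg.eig(A_pen, E_pen, right=False)
for li in lam[:3]:
    # the smallest singular value of lam^2 M + lam L + K should be (numerically) zero
    print(li, np.linalg.svd(li**2 * M + li * L + K, compute_uv=False)[-1])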
2.1.4 The singular value decomposition
For an arbitrary matrix A ∈ Cm×n there exists a singular value decomposition (SVD) of the
form
A = UΣV ∗ , U ∈ Cm×m , V ∈ Cn×n unitary, and Σ ∈ Rm×n .
We assume without loss of generality m ≥ n. Then

\Sigma = \begin{bmatrix} \Sigma_1 \\ 0 \end{bmatrix}, \qquad \Sigma_1 = \operatorname{diag}(\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^{n \times n}.
For m < n one can easily consider A∗ . The diagonal entries σi of Σ1 are called singular
values of A and are ordered such that
σ1 ≥ σ2 ≥ . . . ≥ σr > σr+1 = . . . = σn = 0,
where r := rank (A). The columns u and v of the unitary matrices U and V are called left
and right singular vectors, respectively, and are scaled so that ‖u‖2 = ‖v‖2 = 1. Together
with a corresponding singular value they form a singular triplet (σ, u, v) that satisfies
Av = σu and A∗ u = σv.
The square roots of the nonzero eigenvalues of A∗ A and AA∗ are the nonzero singular
values of A. The corresponding eigenvectors of A∗ A (AA∗ ) are the right (left) singular
vectors of A. Another connection of singular values and eigenvalues is given by the
augmented matrix [13, Section 8.6]

M := \begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix} \in \mathbb{C}^{(m+n) \times (m+n)}.
The absolute values of the eigenvalues of M are the singular values of A and the
eigenvectors of M can be decomposed into an upper and lower part. The upper part
corresponds to the right and the lower part to the left singular vectors.
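The following NumPy sketch (an arbitrary random rectangular matrix) checks this relation between the singular values of A and the eigenvalues of the augmented matrix numerically:

import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 4
A = rng.standard_normal((m, n))

sigma = np.linalg.svd(A, compute_uv=False)           # singular values of A
M_aug = np.block([[np.zeros((m, m)), A],
                  [A.conj().T, np.zeros((n, n))]])   # augmented (m+n) x (m+n) matrix
evals = np.linalg.eigvalsh(M_aug)                    # M_aug is Hermitian

# the nonzero eigenvalues of the augmented matrix come in pairs +/- sigma_i
print(np.sort(np.abs(evals))[::-1][:n])
print(np.sort(sigma)[::-1])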
2.2 Methods for eigenvalue problems
In this section we give a brief overview of some of the methods for eigenvalue problems. For more detailed descriptions we refer to [3, 13, 53, 56]. Methods for eigenvalue
problems are usually distinguished between full space methods for dense matrices of
moderate size and iterative subspace methods for very large and sparse matrices. Full
space methods compute the complete set of eigenvalues and, if necessary, the eigenvectors or invariant subspaces, too. Although these methods are sometimes referred
to as direct methods, they are also of iterative nature. For the standard eigenproblem
with A ∈ Cn×n the QR method can be used to compute a Schur decomposition. If the
matrix is symmetric, there exist, among others, the symmetric QR method and Jacobi
methods. Symmetric tridiagonal eigenproblems can be solved efficiently with divide
and conquer methods. The QZ method computes a generalized Schur decomposition
of a pair (A, E). Since full space methods usually transform the original matrices to diagonal or triangular form by applying transformation matrices, they have a complexity
of O(n3 ) and therefore have a limited range of applications.
For large scale sparse matrices, which play an essential role in scientific computations,
iterative subspace methods come into the picture which normally focus on the computation of only a fraction of the spectrum and, if necessary, the corresponding eigenvectors.
A matrix A ∈ Cn×n is called sparse if the number of nonzero elements is only of order
O(n). Iterative subspace methods work only with matrix vector products on the original
matrix A, which is inexpensive if A is sparse and hence they can theoretically be applied
to large sparse matrices of unlimited size.
In such methods, the eigenproblem is usually projected onto a lower dimensional subspace. The projected eigenproblem is then of small or moderate size and can be solved
with the full space methods mentioned above. The projection is usually carried out by
imposing a certain Galerkin condition on the approximate eigenpair (θ, v). Its most basic
form is given by the Ritz-Galerkin projection. There the eigenvector approximations are
represented by v := Vk x̃ ∈ Cn with a nonzero coefficient vector x̃ ∈ Ck and a matrix
Vk ∈ Cn×k whose columns are orthonormal and span the k-dimensional search space
Vk ⊂ Cn, in practice with k ≪ n. The residual of an approximate eigenpair
(θ, v) is then required to be orthogonal to the whole subspace Vk:

r := Av − θv ⊥ Vk,

or equivalently,

Vk∗ AVk x̃ = θx̃.

Clearly, (θ, x̃) is an eigenpair of the reduced matrix Mk := Vk∗ AVk ∈ Ck×k and can be
computed efficiently by standard methods for small eigenproblems. The eigenvector x̃
is lifted up to the n−dimensional space by v := Vk x̃ and the resulting pair (θ, v) is called
Ritz pair of A with respect to the subspace Vk . Afterwards, the subspace Vk is expanded
orthogonally by a new basis vector which is derived from v, and the whole process is
repeated with this (k + 1)−dimensional subspace. In this thesis, we will work mainly
with Petrov-Galerkin projection where a second subspace Xk is used such that
r := AVk x̃ − θVk x̃ ⊥ Xk
holds. Note that Xk is often called test subspace in this context. The eigenpairs (θ, x̃) of
the reduced eigenproblem
Xk∗ AVk x̃ = θXk∗ Vk x̃
lead to Petrov pairs (θ, v := Vk x̃) with respect to the search subspace Vk and the test
subspace Xk . Note that, depending on the choice of Xk , the reduced eigenproblem
can in general be a generalized one, even if the original problem is not. One choice
is to derive the basis vectors of Xk from the approximate left eigenvectors of A. This
Petrov-Galerkin projection can be easily extended to generalized eigenvalue problems
and to a two-sided Petrov-Galerkin projection if the left eigenvectors are sought as well.
This two-sided approach will be the common feature of the eigenvalue methods to be
investigated in Chapter 4.
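To illustrate one extraction step, the following sketch (a random matrix and random orthonormal bases as placeholders for an actual search and test space) solves the small projected problem and verifies that the lifted residual is orthogonal to the test space:

import numpy as np
from scipy import linalg

rng = np.random.default_rng(4)
n, k = 100, 5
A = rng.standard_normal((n, n))
Vk = np.linalg.qr(rng.standard_normal((n, k)))[0]    # search space basis (placeholder)
Xk = np.linalg.qr(rng.standard_normal((n, k)))[0]    # test space basis (placeholder)

# Petrov-Galerkin condition:  A V_k xt - theta V_k xt  is orthogonal to X_k, i.e.
# (X_k^* A V_k) xt = theta (X_k^* V_k) xt
theta, Xi = linalg.eig(Xk.conj().T @ A @ Vk, Xk.conj().T @ Vk)

v = Vk @ Xi[:, 0]                                    # lifted Petrov vector
r = A @ v - theta[0] * v
print(np.linalg.norm(Xk.conj().T @ r))               # ~ 0: residual is orthogonal to X_k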
Important and well known iterative methods working with the Ritz-Galerkin projection
are the Krylov subspace methods where Vk is constructed to be a Krylov subspace

Vk = K(A, q1, k) := span{q1, Aq1, . . . , A^{k−1} q1}
for some initial vector q1 ∈ Cn . Two prominent methods using this framework are
the Lanczos method for Hermitian and the Arnoldi method [13, Ch. 9] for general
square matrices. The Lanczos method produces a unitary matrix Vk ∈ Cn×k that
spans a k−dimensional A−invariant Krylov subspace such that the transformed matrix
Tk := Vk∗ AVk ∈ Ck×k is tridiagonal and its eigenvalues are good approximations for
the eigenvalues of A. Similarly, the Arnoldi method produces a matrix Vk such that
Hk := Vk∗ AVk is upper Hessenberg. For both methods there exist generalizations for the
generalized eigenvalue problem and for the computation of the left eigenvectors, which
include a one- or two-sided Petrov-Galerkin type projection [56].
In this thesis we focus on another important class of subspace methods which are the
Jacobi-Davidson methods. Initially proposed by G. L. G. Sleijpen and H. A. van der
Vorst [48] for the standard linear eigenvalue problem, the Jacobi-Davidson method
has been improved and generalized in several ways to handle various problems, for
instance generalized eigenvalue problems [7, 12, 37], quadratic and polynomial eigenvalue problems [7, 21, 49], singular value problems [18], and even nonlinear eigenvalue
problems [6, 45, 46].
Although we investigate these methods in greater detail in the remaining chapters, we
give here a rough description of their functioning. The original basic Jacobi-Davidson
method [48] uses two principles. At first, the so called Davidson principle [10] where
a Ritz-Galerkin type projection is applied as described previously. But now the constructed subspace Vk is in general not a Krylov subspace. Let Vk be the orthogonal
matrix with the basis vectors of Vk as columns. The reduced eigenvalue problem is then
given by
Vk∗ AVk x̃ = θx̃
with the reduced matrix Mk := Vk∗ AVk ∈ Ck×k which has, unlike in the Lanczos or
Arnoldi method, no special tridiagonal or triangular form. For a Ritz pair (θ, v := Vk x̃)
of A with respect to Vk obtained from this small eigenvalue problem the second principle
is used to find a correction t for the Ritz vector v. In the Jacobi-Davidson method this
is done by the Jacobi style correction [22], which finds a correction t ∈ Cn orthogonal
to the Ritz vector v. This can be done by applying a Newton scheme to the function
F : C^{n+1} → C^{n+1}, see [34, 47]:

F_w(\lambda, x) = \begin{bmatrix} Ax - \lambda x \\ w^* x - 1 \end{bmatrix}.
The vector w ∈ Cn is used to induce a suitable scaling of the vector x. Clearly, an exact
eigenpair (λ, x) with w∗ x = 1 is a root of F and satisfies Fw (λ, x) = 0. We now look for
a better approximation (θ+ , v+ ) = (θ + µ, v + t) of the previously generated Ritz pair,
where the improvement t is sought in the orthogonal complement
v⊥ := {z ∈ Cn : v∗z = 0}
of v. After some manipulations this yields the linear system of equations [48]
(I − vv∗ )(A − θI)(I − vv∗ )t = −r
which is called Jacobi-Davidson correction equation. The subspace Vk is then expanded
orthogonally, by e.g. using Gram-Schmidt orthogonalization, with t as new basis vector
vk+1. The whole process is then repeated with the enlarged search space until the norm
of the residual ‖r‖2 is smaller than a given tolerance. If the correction equation is solved
exactly, the Jacobi-Davidson method is an exact Newton method [11, 47]. However, in
practice it is often sufficient to solve the correction equation only up to a moderate accuracy by applying a small number of steps of an iterative method for linear systems, such
as CG, BiCG or GMRES [43]. In [58] it is shown that the error induced by this inexact
solution of the correction equation is in some sense minimized due to the projection
onto the orthogonal complement of the previous eigenvector approximation. In Section
4.2 and 4.3 we investigate the two-sided and the alternating Jacobi-Davidson method
[20, 50] which compute a number of approximate eigentriplets (λ, x, y) of a matrix
pair (A, E). The two-sided Jacobi-Davidson method uses a two-sided Petrov-Galerkin
projection to compute approximations for right and left eigenvectors simultaneously,
while the alternating Jacobi-Davidson applies a Ritz-Galerkin projection alternately to
(A, E) and (A∗ , E∗ ) to produce approximations of right and left eigenvectors in each
even and odd iteration, respectively.
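A minimal sketch of one Jacobi-Davidson expansion step for the standard eigenvalue problem (a random matrix, a rough GMRES tolerance, no deflation or restarts) may look as follows; it is only meant to illustrate the projected correction equation and the orthogonal expansion of the search space, not the methods of Chapter 4:

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(5)
n = 200
A = rng.standard_normal((n, n))

V = np.linalg.qr(rng.standard_normal((n, 3)))[0]        # current search space basis
theta_all, S = np.linalg.eig(V.conj().T @ A @ V)        # reduced eigenproblem
j = np.argmax(theta_all.real)                           # select some Ritz value
theta = theta_all[j]
v = (V @ S[:, j]).astype(complex)
v = v / np.linalg.norm(v)
r = A @ v - theta * v                                   # residual of the Ritz pair

def proj(z):                                            # z -> (I - v v^*) z
    return z - v * (v.conj() @ z)

op = LinearOperator((n, n), matvec=lambda z: proj((A @ proj(z)) - theta * proj(z)),
                    dtype=complex)
t, _ = gmres(op, -r, maxiter=20)                        # inexact solve of the correction equation
t = proj(t)                                             # keep t orthogonal to v

t = t - V @ (V.conj().T @ t)                            # Gram-Schmidt against the current basis
V = np.hstack([V, (t / np.linalg.norm(t))[:, None]])    # expanded search space
print(V.shape)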
2.3 Systems and control theory
This section covers some fundamentals of systems and control theory that are necessary
for the remainder of this thesis. Very good and more detailed introductions can be
found, for instance, in [2, 4, 9, 23].
2.3.1 Linear time invariant state-space systems
A linear time invariant (LTI) state-space system is of the form

ẋ(t) = Ax(t) + Bu(t),   x(t0) = x0,
y(t) = Cx(t) + Du(t),     (2.1)
with a state-space matrix A ∈ Rn×n , input and output maps B ∈ Rn×m and C ∈ Rp×n , and a
direct transmission map D ∈ Rp×m . Furthermore, x(t) ∈ Rn is called state vector, u(t) ∈ Rm
is called input or control vector, and y(t) ∈ Rp is referred to as output vector. Instead of the
form (2.1), we will often denote such systems by tuples (A, B, C, D). If D = 0 holds for the
direct transmission map, we remove D from the tuple and denote the system only
by (A, B, C). The ordinary differential equation in (2.1) is called state equation and the
algebraic equation below is called output equation. The order of such a system is defined
by the dimension n of the state-space matrix A. If m, p > 1, then (2.1) is called a multi-input
multi-output (MIMO) system; if m = p = 1, it is called a single-input single-output (SISO) system.
In the sequel we assume without loss of generality that t0 = 0 and x0 = 0.
SISO systems are often written in the form

ẋ(t) = Ax(t) + bu(t),   x0 = 0,
y(t) = cx(t) + du(t)     (2.2)
with A, x(t) as in (2.1), input and output vectors b, c∗ ∈ Rn , d ∈ R and u(t), y(t) ∈ R.
Dynamical systems of the form (2.1) and (2.2) are used to characterize the dynamics
of certain physical and technical models. However, in realistic models the relations
between x(t), u(t) and y(t) can be nonlinear. These nonlinear models can under certain
conditions be linearized to get LTI systems. Other physical phenomena, for instance the
heat transfer within a material, are modeled by time-dependent partial differential equations. A discretization of the space variables using finite elements or finite differences
leads then to ordinary differential equations which can, if necessary, again be linearized
to obtain LTI systems. See [9, Section 5.2] for a nice collection of examples of these
concepts.
Of great importance in systems and control theory is the transfer function of the system
(2.1). The transfer function can be obtained by applying the Laplace transform

(\mathcal{L}f)(s) := \int_0^{\infty} e^{-st} f(t)\, dt
to the state and output equations. This yields

sX(s) = AX(s) + BU(s),
Y(s) = CX(s) + DU(s),
where X(s), U(s) and Y(s) are the Laplace transforms of x(t), u(t) and y(t), respectively.
The first equation can be rearranged to X(s) = (sI − A)−1 BU(s) and inserted into the
output equation such that

Y(s) = (C(sI − A)−1 B + D) U(s).

The term in brackets is called the transfer function H : C → Cp×m of (2.1) and is defined
by
H(s) = C(sI − A)−1 B + D.
(2.3)
It relates inputs to outputs in the frequency domain by the relation Y(s) = H(s)U(s). The
poles of (2.3) are a subset of Λ(A) and play an essential role for model order reduction
based on modal approximation. The transfer function H(s) of a SISO (2.2) is often
illustrated via the Bode magnitude plot of H(s), which is a logarithmic plot of |H(iω)| versus
frequencies ω ∈ R+ . Usually, the magnitude |H(s)| for s ∈ C is expressed as decibels of
the gain
G(s) := 20 log10 (|H(s)|) .
(2.4)
With this logarithmic scaling the Bode magnitude plot shows the graph of (ω, G(iω)).
Another useful illustration are 3-D Bode plots [55].
In the MIMO case H(s) is a matrix-valued function and an appropriate illustration is given by so-called
sigma plots, which depict the largest and smallest singular values σmax and σmin
of H(iω) versus frequencies ω ∈ R+ where a logarithmic scaling similar to (2.4) can be
used as well.
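As an illustration of how such plots are generated, the sketch below (with a small random, artificially shifted state-space system, not one of the benchmark models) evaluates the gain G(iω) = 20 log10 |H(iω)| over a logarithmically spaced frequency grid; for a MIMO system one would instead record the largest and smallest singular values of H(iω).

import numpy as np

rng = np.random.default_rng(6)
n = 8
A = rng.standard_normal((n, n)) - 5.0 * np.eye(n)     # shifted so that A is (very likely) stable
b = rng.standard_normal((n, 1))
c = rng.standard_normal((1, n))
d = 0.0

omega = np.logspace(-1, 3, 400)                        # frequency grid (rad/sec)
H = np.array([(c @ np.linalg.solve(1j * w * np.eye(n) - A, b) + d).item() for w in omega])
gain_db = 20.0 * np.log10(np.abs(H))                   # Bode magnitude in dB, cf. (2.4)
print(gain_db[:5])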
In Figure 2.1 we illustrated the Bode magnitude and sigma plot of the CD player1
example [8] of order n = 120. The SISO system (A, b2 , c1 ) in the Bode magnitude plot
2.1a is extracted from the full MIMO system by taking only the second column b2 of B
and the first row c1 of C. Figure 2.1b shows the sigma plot of the full MIMO system
(A, B, C, D = 0) with p = m = 2. Observe the peaks in both plots, which are caused by
the dominant poles of H(s) which usually form a small subset of Λ(A). These dominant
poles are defined and described in more detail in subsection 2.4.2 and can be considered
as the backbone of our modal truncation approach.
1 Available at http://www.slicot.org/index.php?site=examples

Figure 2.1: (a) Bode plot of the transfer function of the CD player [8] SISO system of
order n = 120 in a double logarithmic plot. (b) Sigma plot of the full 2 × 2
MIMO system.

An LTI system is called asymptotically stable if lim_{t→∞} x(t) = 0. The solution of the system
(2.1) is given by

x(t) = e^{At} x_0 + \int_0^{t} e^{A(t-\tau)} B u(\tau)\, d\tau, \qquad (2.5)
y(t) = Cx(t) + Du(t).
Hence, an LTI system is asymptotically stable if Λ(A) ⊂ C− , that is, all eigenvalues of A
have negative real parts. Matrices A with this property are also called Hurwitz.
Another important system theoretic property is passivity which means that the system
generates no energy and absorbs energy only from sources that are used to excite it.
Passivity can be investigated with the transfer function H(s). A system is passive if and
only if H(s) is positive real. Positive realness is given when H(s) is analytic for all s ∈ C+,
H(\overline{s}) = \overline{H(s)} for all s ∈ C, and H(s) + H(s)^* ≥ 0 for all s ∈ C+.
A system is called controllable if every state x(t) can be reached via appropriate control u(t)
from the initial state x0 = 0 for any t0 . From the solution (2.5) of the differential equation
it follows that controllability is equivalent to im(eAt ) = Rn and the Cayley-Hamilton
theorem implies that this is the case when the controllability matrix
C(A, B) = [B, AB, . . . , An−1 B]
has rank n. A dual property of controllability is observability. A system is observable if
the initial state x0 is uniquely determined from the input and the output. Equivalently,
with u(t) = 0, the output y(0) = 0 implies that x0 = 0. Another equivalent condition of
observability is that the observability matrix
O(A, C) = [C∗, A∗C∗, . . . , (A∗)^{n−1} C∗]∗
has full rank n. If a system is both controllable and observable it is called minimal.
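A direct (dense, small-scale) way to test these rank conditions numerically is to assemble the controllability matrix and check its rank, as in the following sketch with random data:

import numpy as np

rng = np.random.default_rng(7)
n, m = 5, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

# controllability matrix C(A, B) = [B, AB, ..., A^(n-1) B]
blocks, Akb = [], B.copy()
for _ in range(n):
    blocks.append(Akb)
    Akb = A @ Akb
ctrb = np.hstack(blocks)

print(ctrb.shape)                          # (n, n*m)
print(np.linalg.matrix_rank(ctrb) == n)    # full rank <=> (A, B) controllable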
2.3.2 Linear descriptor systems
A modification of (2.1) is the class of linear time invariant descriptor systems

Eẋ(t) = Ax(t) + Bu(t),   x(0) = x0,
y(t) = Cx(t) + Du(t)      (2.6)
with an additional, possibly singular descriptor matrix E ∈ Rn×n . The vector x(t) ∈ Rn is
in this case called descriptor or generalized state vector. We will denote descriptor systems
also by tuples (E, A, B, C, D). If E is indeed singular then the first equation in (2.6)
is a differential algebraic equation (DAE). DAEs have their own special properties which
distinguish them from the ordinary differential equations of the standard state-space
systems. See [24] for detailed information. The transfer function of (2.6) is, if the matrix
pair (A, E) is regular, given by
H(s) = C(sE − A)−1 B + D
and its poles form a subset of Λ(A, E). If E is singular (DAE case), Λ(A, E) contains
eigenvalues at infinity.
A descriptor system is called asymptotically stable if the finite eigenvalues of the pair
(A, E) have negative real parts. Controllability and observability can be generalized
for descriptor systems, too. Most of the systems we consider in this thesis will be in
descriptor form.
2.3.3 Second-order systems
Another generalization of the standard LTI systems is given by second-order linear time invariant
dynamical systems of the form

Mẍ(t) + Lẋ(t) + Kx(t) = Bu(t),
y(t) = Cx(t) + Du(t)      (2.7)
with three system matrices M, L, K ∈ Rn×n . All other matrices and vectors have the same
size and meaning as in the standard and descriptor case. Since second-order systems
arise often in structural system analysis, M is called mass matrix, L is the damping matrix,
and K is referred to as stiffness matrix. The transfer function of (2.7) is defined as
H(s) = C(s2 M + sL + K)−1 B + D.
(2.8)
Obviously, the poles of the transfer function (2.8) are a subset of the eigenvalues λi ∈ C
of the quadratic eigenproblem
(λ2 M + λL + K)x = 0,   x ≠ 0.
Second-order systems can be transformed to first order descriptor systems by applying
the same linearization technique as for the quadratic eigenvalue problem. Assuming a
nonsingular K, we can define matrices

A := \begin{bmatrix} 0 & -K \\ -K & -L \end{bmatrix}, \qquad E := \begin{bmatrix} -K & 0 \\ 0 & M \end{bmatrix} \in \mathbb{R}^{2n \times 2n},
Bl = [0, BT ]T ∈ R2n×m and Cl = [C, 0] ∈ Rp×2n , and a vector z = [xT , ẋT ]T ∈ R2n such that
the corresponding linear system is given by

Eż(t) = Az(t) + Bl u(t),   z(t0) = z0,
y(t) = Cl z(t) + Du(t).
Note that, similar to quadratic eigenvalue problems, there are other linearizations possible [54].
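The block structure of this linearization is easy to set up explicitly; the following sketch (random second-order data of small size, purely illustrative) assembles A, E, Bl and Cl as defined above:

import numpy as np

rng = np.random.default_rng(8)
n, m, p = 4, 2, 1
M = rng.standard_normal((n, n))
L = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

Z = np.zeros((n, n))
A_lin = np.block([[Z, -K], [-K, -L]])              # first-order pencil matrices
E_lin = np.block([[-K, Z], [Z, M]])
B_lin = np.vstack([np.zeros((n, m)), B])           # B_l = [0; B]
C_lin = np.hstack([C, np.zeros((p, n))])           # C_l = [C, 0]

# with z = [x; x_dot], the descriptor system E_lin z' = A_lin z + B_lin u, y = C_lin z
# reproduces M x'' + L x' + K x = B u, y = C x
print(A_lin.shape, E_lin.shape, B_lin.shape, C_lin.shape)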
2.4 Model order reduction
2.4.1 The common principle of model order reduction
The goal of model order reduction is to reduce the order of a given dynamical system,
for instance in order to allow a simulation with a reduced computational effort. For a
descriptor system (2.6) this can be expressed as finding a reduced order model of order
k ≪ n,

Ẽ x̃˙ = Ã x̃ + B̃ u,
ỹ = C̃ x̃ + D u,

with Ã, Ẽ ∈ Rk×k, B̃ ∈ Rk×m, C̃ ∈ Rp×k, D ∈ Rp×m and x̃ = x̃(t) ∈ Rk, u = u(t) ∈
Rm , ỹ = ỹ(t) ∈ Rp . There are mainly three prominent methods of model order reduction
of linear time invariant systems: Krylov subspace methods, balanced truncation, and
modal approximation. See e.g. [2, 5, 14, 28] for more details of these approaches. The
common principle of all those methods can again be considered as a Petrov-Galerkin
type projection. For the state equation of the original system (2.6) one could write
Eẋ − Ax − Bu ⊥ Cn (i.e. = 0).
Let Xk , Yk be two k−dimensional subspaces of Cn spanned by the basis vectors x1 , . . . , xk
and y1 , . . . , yk , respectively, and let Xk , Yk ∈ Cn×k be the corresponding basis matrices
with the basis vectors as columns. The space Xk is called search space and Yk is the test
space. In a Petrov-Galerkin projection we use an oblique projection of the system onto
Xk along Yk . A representation of the state vector x in Xk is then given by x = Xk x̃ with
x̃ ∈ Ck . The associated Petrov-Galerkin condition is
EXk x̃˙ − AXk x̃ − Bu ⊥ Yk
(2.9)
which is equivalent to
Yk∗ EXk x̃˙ − Yk∗ AXk x̃ − Yk∗ Bu = 0.
Together with the corresponding projected output equation this leads to


∗
∗
∗

Yk EXk x̃˙ = Yk AXk x̃ + Yk B̃u,


 ỹ
= CXk x̃ + Du.
Obviously, this is a reduced order model
Ẽ, Ã, B̃, C̃, D = Yk∗ EXk , Yk∗ AXk , Yk∗ B̃, CXk , D
of order k. The question arises how to choose the subspace Xk , Yk adequately, such that
the reduced order model yields a good approximation for the dynamics of the original
model. The availability of an a-priori error bound is also an often requested goal. We are
now going to review how the spaces Xk and Yk can be chosen for a modal approximation
since this is the model order reduction technique of interest in this thesis.
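In matrix terms, once basis matrices Xk and Yk are available, the projected model is obtained by a handful of products; the sketch below uses random placeholder bases merely to show the mechanics:

import numpy as np

rng = np.random.default_rng(9)
n, k, m, p = 50, 6, 2, 3
E = np.eye(n) + 0.1 * rng.standard_normal((n, n))
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

Xk = np.linalg.qr(rng.standard_normal((n, k)))[0]   # search space basis (placeholder)
Yk = np.linalg.qr(rng.standard_normal((n, k)))[0]   # test space basis (placeholder)

# projected (reduced order) model of order k
E_r = Yk.conj().T @ E @ Xk
A_r = Yk.conj().T @ A @ Xk
B_r = Yk.conj().T @ B
C_r = C @ Xk
print(E_r.shape, A_r.shape, B_r.shape, C_r.shape)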
2.4.2 Modal approximation
For modal approximation one chooses the subspaces Xk , Yk of the Petrov-Galerkin
projection (2.9) as the right and left eigenspaces corresponding to a set of k eigenvalues
of the pair (A, E). For this purpose, we consider the eigenvalue decomposition of a
nondefective pair (A, E):
Y∗ AX = Λ = diag(λ1 , . . . , λn ), Y∗ EX = I.
(2.10)
The matrices X, Y ∈ Cn×n contain the right and left eigenvectors xi and yi corresponding
to the eigenvalues λi (i = 1, . . . , n), that is
Axi = λi Exi and y∗i A = λi y∗i E.
Now suppose the eigenvalue decomposition can be partitioned as

[Y_1, Y_2]^* A [X_1, X_2] = \begin{bmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{bmatrix}, \qquad [Y_1, Y_2]^* E [X_1, X_2] = I,
where Λ1 ∈ Ck×k contains k eigenvalues of interest and the other block matrices are of
appropriate size. The reduced order model is then obtained via
(Ẽ, Ã, B̃, C̃, D) := (Y1∗ EX1 = Ik, Y1∗ AX1 = Λ1, Y1∗ B, CX1, D),

and the subspaces Xk, Yk are trivially given by span(X1), span(Y1). The natural question
that arises is which subset of Λ(A, E) should be selected in order to obtain a good
approximation of the system's behavior in the reduced order model.
To answer this question, we start, for better illustration, with the SISO descriptor case and
investigate the transfer function (2.3),

H : C → C,   H(s) = c(sE − A)−1 b + d,   (2.11)
which is a scalar rational function and its poles form a subset of Λ(A, E).
Remark 2.1:
Since the matrix pairs (A, E) of descriptor systems are in most cases real, we consider
this case only in this thesis and refer to a pole of (2.11) as either an eigentriplet
(λ, x, y) if λ ∈ R or to a pair of complex conjugate eigentriplets if λ ∈ C.
♦
Since the absolute value |H(s)| of the transfer function H(s) maps complex numbers
s = Re (s) + i Im (s) to real numbers z := |H(s)| ∈ R, we can write it as bivariate function
z = |H(x, y)| for (x := Re (s), y := Im (s)) ∈ R × R. A very interesting and revealing
illustration for our purpose is a 3-D Bode plot [55] which is a surface plot of the gain (2.4)
against s = x + iy ∈ C. Figure 2.2 shows the 3-D Bode plot of H(s) of the New England
test system2 [26] in a region in the left half plane. The poles of H(s) (eigenvalues of
A) are marked as black dots in the Re(s)-Im(s)-plane. Note that the cross-section of the
surface with the Im (s)-z-plane corresponds to the Bode plot of H(s) and is therefore
emphasized as thick black curve. Observe that the function values grow in the limit
towards infinity as s reaches an eigenvalue λ of (A, E). However, the poles marked as
green dots elevate the function values in a stronger way and cause peaks in the Bode
plot. In the following, these poles will be referred to as dominant poles. To investigate
which specific poles cause these peaks we rewrite H(s) as sum of residues R j ∈ C of the
r ≤ n finite poles [23]:
H(s) = \sum_{j=1}^{r} \frac{R_j}{s - \lambda_j} + R_\infty + d, \qquad R_j := (c x_j)(y_j^* b), \qquad (2.12)
where R∞ is the constant contribution of the infinite eigenvalues and often zero. This
expression can be obtained by inserting the eigenvalue decomposition (2.10) into (2.11)
[23] or by rewriting H(s) as a partial fraction expansion [1]. Since |H(s)| is raised towards
infinity as s approaches λ ∈ Λ(A, E), the peaks in the Bode plot occur close to frequencies
ω ∈ R+ which are close to the imaginary parts of certain eigenvalues λj.

2 Available at http://sites.google.com/site/rommes/software

Figure 2.2: 3-D Bode plot of H(s), eigenvalues and dominant poles in the region
[−2, 0] × i[0, 20] ⊂ C− of the New England test system [26] of order n = 66.

To clarify which specific eigenvalues cause this behavior, let λ = λn = α + iβ ∈ Λ(A, E), R = Rn
the corresponding residue, assume R∞ = 0, d = 0, and consider the limit of H(iω) for ω
towards Im (λ) = β:
\lim_{\omega\to\beta} H(i\omega) = \lim_{\omega\to\beta} \sum_{j=1}^{n} \frac{R_j}{i\omega - \lambda_j} = \lim_{\omega\to\beta}\left[\frac{R}{i\omega - (\alpha + i\beta)} + \sum_{j=1}^{n-1} \frac{R_j}{i\omega - \lambda_j}\right] = \frac{R}{-\alpha} + H_{n-1}(i\beta).
We see that if ω is close to Im (λ) and |R|/| Re (λ)| is large, then |H(iω)| in the Bode
magnitude plot is large as well which establishes a first criterion of modal dominance.
Note that by the scaling by | Re (λ)|, dominant poles are usually positioned close to the
imaginary axis as it can be observed in Figure 2.2. If this scaling is omitted, the residue
magnitude |R| alone can also be used as an indicator for modal dominance. In some
applications one is interested in the poles closest to zero which can be emphasized by a
modal dominance indicator of the form |R|/|λ|. This quantity can be derived in a similar
way as above by considering the limit of H(s) for s towards zero. Altogether we get the
following three definitions of modal dominance.
Definition 2.2:
Let λi ∈ Λ(A, E) be a pole of the transfer function H(s) of a SISO system (E, A, b, c, d)
with corresponding left and right eigenvectors yi and xi which are scaled so that
y∗i Exi = 1. Then λi is called a dominant pole if

|Ri| > |Rj|,                                          (2.13)
|Ri| / |Re(λi)| > |Rj| / |Re(λj)|,                    (2.14)
or   |Ri| / |λi| > |Rj| / |λj|                        (2.15)

holds for all j ≠ i. ♦
For the more general MIMO systems, the residue representation of H(s) is

H(s) = \sum_{i=1}^{r} \frac{R_i}{s - \lambda_i} + R_\infty + D, \qquad R_i := (C x_i)(y_i^* B) \in \mathbb{C}^{p \times m}. \qquad (2.16)

In this case peaks occur in the sigma plot of H(s). By using the spectral norm ‖·‖2 this
leads to similar definitions of modal dominance for MIMO systems.

Definition 2.3:
Let (λi, xi, yi) be an eigentriplet of (A, E) with y∗i Exi = 1. Then the pole λi of the
transfer function H(s) of a MIMO system (E, A, B, C, D) is called a dominant pole if

‖Ri‖2 > ‖Rj‖2,                                        (2.17)
‖Ri‖2 / |Re(λi)| > ‖Rj‖2 / |Re(λj)|,                  (2.18)
or   ‖Ri‖2 / |λi| > ‖Rj‖2 / |λj|                      (2.19)

holds for all j ≠ i. Note that ‖R‖2 = σmax(R). ♦
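For a small dense example, the residues and one of the dominance measures, here |Ri|/|Re(λi)| as in (2.14), can be evaluated directly once all eigentriplets are available; this brute-force approach (random data, SISO case) is of course only feasible for moderate n:

import numpy as np
from scipy import linalg

rng = np.random.default_rng(10)
n = 12
A = rng.standard_normal((n, n)) - 4.0 * np.eye(n)      # shift to push the spectrum to the left
E = np.eye(n)
b = rng.standard_normal(n)
c = rng.standard_normal(n)

lam, Y, X = linalg.eig(A, E, left=True, right=True)
s = np.einsum("ij,jk,ki->i", Y.conj().T, E, X)         # y_i^* E x_i
X = X / s                                               # enforce y_i^* E x_i = 1

R = (c @ X) * (Y.conj().T @ b)                          # residues R_i = (c x_i)(y_i^* b)
dominance = np.abs(R) / np.abs(lam.real)                # criterion (2.14)

for i in np.argsort(-dominance)[:4]:                    # four most dominant poles
    print(lam[i], dominance[i])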
Figure 2.3: (a) Eigenvalues and 6 dominant poles in [−2, 0] × i[0, 10] ⊂ C. (b) Bode
magnitude plot of the transfer function of the New England test system and
imaginary parts of the dominant poles.

Note that none of the introduced dominance criteria depends on the direct transmission map d or D. For both SISO and MIMO systems several other dominance
measures are possible [1, 57]. In Figure 2.3a, 6 dominant poles of the New England test system with respect to the
dominance definition (2.15) are plotted together with
Λ(A) in a region of the Re(λ)-Im(λ)-plane. Note that this plot is essentially the same
as the Re(s)-Im(s)-plane in Figure 2.2 and that the 6 dominant poles correspond originally to 11 eigenvalues (one real and 5 pairs of complex conjugate eigenvalues). The
purpose of this somewhat redundant example is to show that the dominant poles can lie
anywhere in the spectrum of (A, E). The Bode magnitude plot of the transfer function
is illustrated in Figure 2.3b. Again, this is essentially the Im (s)-H(s)-plane of Figure 2.2.
The vertical dashed lines indicate the imaginary parts of the computed dominant poles
and, as discussed previously, they are indeed positioned close to frequencies ω where
H(iω) has peaks.
For modal approximation purposes, the use of the p most dominant poles seems to be a
reasonable choice. Recall that by Remark 2.1, p is the number of selected real eigenvalues
and pairs of complex conjugate eigenvalues. We therefore use an approximation of the
transfer function that consists only of the k summands which belong to the p ≤ k < n most
dominant poles in the residue representation (2.12). Hence, the order of the reduced
order model is given by k = 2kc + kr , where kc and kr denote the number of selected
dominant complex conjugate eigenvalue pairs and real eigenvalues, respectively.
Definition 2.4:
A transfer function modal equivalent Hk (s) is an approximation of a transfer function
H(s) consisting of k < n terms of (2.16):
H̃(s) = H_k(s) = \sum_{i=1}^{k} \frac{R_i}{s - \lambda_i} + D. \qquad (2.20)

♦
Figure 2.4: Bode plot of original New England test system and reduced order models
with p = 3 (k = 5 eigenvalues / states) and p = 6 (k = 11 eigenvalues / states)
dominant poles according to (2.15).
The Bode plots of the original model and of two modal equivalents of the New England
test system of order k = 5 and k = 11 (p = 3 and p = 6 dominant poles from Figure
2.3a) are illustrated in Figure 2.4, where the dominance definition (2.15) was used. It can
be observed that the transfer functions of the modal equivalents approximate the
original transfer function better if more summands in the residue expression of H(s)
are kept. The corresponding dominant poles and scaled residues of this example are
listed in Table 2.1. The rightmost column denotes which poles are used in which modal
equivalent of Figure 2.4. Of course, the k = 11 reduced order model contains all
eigentriplets. Observe that the k = 5 modal equivalent does not include, for instance,
the pole λ4 ≈ −0.2491 ± 3.686i and thus the peak close to Im (λ4 ) is not reproduced.
Furthermore, both peaks around the frequencies ω ≈ 7 and ω ≈ 9 are caused by two
dominant poles each (see also Figure 2.3b) which are only incorporated in the k = 11
modal equivalent. Hence, the k = 11 reduced order model approximates the exact
model more accurately. Note that using another dominance definition than (2.15) will
most likely lead to different results (see the numerical example 2 in Chapter 6).
Dominant pole                 Scaled residue |R|/|λ|   Modal equivalent in Figure 2.4
λ1 = −0.0649                  9.869 · 10−3             k = 5, k = 11
λ2 = −0.4672 ± 8.964i         6.006 · 10−4             k = 5, k = 11
λ3 = −0.2968 ± 6.956i         3.589 · 10−4             k = 5, k = 11
λ4 = −0.2491 ± 3.686i         1.579 · 10−4             k = 11
λ5 = −0.1118 ± 7.095i         6.914 · 10−5             k = 11
λ6 = −0.3704 ± 8.611i         5.372 · 10−5             k = 11

Table 2.1: Dominant poles and corresponding scaled residues of the New England test
system.

The subspaces Xk and Yk of the Petrov-Galerkin projection are in this context spanned
by the right and left eigenvectors for the k selected dominant eigenvalues. Note that
the number k also corresponds to the number of states in the reduced order model.
However, since the original system matrices A, E are usually real but the basis matrices
Xk , Yk can in general be complex, it is practicable to construct real bases for the right and
left eigenspaces by using for every complex triplet (λ, x, y) the vectors [Re (x), Im (x)]
and [Re (y), Im (y)], respectively. The bases spanned in this way are obviously still (at
most) k dimensional. Now let the columns of Xr , Yr be such real bases. The reduced
order model is then given by
(Ẽ, Ã, B̃, C̃, D) = (Yr∗ EXr , Yr∗ AXr , Yr∗ B, CXr , D).
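For illustration, the following minimal Python/NumPy sketch (the function names and the list-of-triplets input format are assumptions made here, not part of the thesis) builds such real bases Xr, Yr from computed eigentriplets, where each complex conjugate pair is represented by only one of its members, and then forms the projected reduced order model.

    import numpy as np

    def real_projection_bases(eigentriplets):
        # eigentriplets: list of (lambda, x, y); for a complex conjugate pair only
        # one member is expected, a real eigenvalue contributes one column per side
        Xr_cols, Yr_cols = [], []
        for lam, x, y in eigentriplets:
            if abs(lam.imag) <= 1e-12 * max(abs(lam), 1.0):   # (numerically) real pole
                Xr_cols.append(x.real)
                Yr_cols.append(y.real)
            else:                                              # complex conjugate pair
                Xr_cols.extend([x.real, x.imag])
                Yr_cols.extend([y.real, y.imag])
        return np.column_stack(Xr_cols), np.column_stack(Yr_cols)

    def modal_reduced_model(E, A, b, c, D, eigentriplets):
        # two-sided projection (Yr^* E Xr, Yr^* A Xr, Yr^* b, c Xr, D)
        Xr, Yr = real_projection_bases(eigentriplets)
        return Yr.T @ E @ Xr, Yr.T @ A @ Xr, Yr.T @ b, c @ Xr, D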
For the computation of dominant eigentriplets (λ, x, y) we therefore need algorithms
that are able to compute eigenvalues and the corresponding right and left eigenvectors
of large and sparse matrix pairs (A, E). Eigenvalue algorithms that compute both
right and left eigenvectors are referred to as two-sided eigenvalue methods. There exist a
number of popular algorithms for this task, for instance, two-sided Lanczos and Arnoldi
methods [42]. A specialized algorithm for the computation of dominant eigentriplets
of a SISO system (E, A, b, c, d) is the Subspace Accelerated Dominant Pole Algorithm
(SADPA) [39]. Another class of eigenvalue methods are the Jacobi-Davidson (JD) style
algorithms, which can be used for the computation of dominant poles by involving an
appropriate eigenvalue selection strategy.
In this thesis we will focus on SADPA and two variants of Jacobi-Davidson, the two-sided and alternating JD [20], which compute the left eigenvectors as well. The JD methods
and SADPA can be interpreted as subspace accelerated Rayleigh quotient iterations and
since the convergence behavior of these basic iterations carries over to their subspace
accelerated successors, we will investigate them in the next chapter. It is even possible
to show that under certain assumptions some Rayleigh quotient iterations are in some
sense equivalent to their related JD methods. In Chapter 4 we will discuss SADPA
and JD style subspace accelerated eigensolvers and their application for dominant pole
computation of SISO systems. Generalizations to multivariable transfer functions are
described in Section 5.2.
3 Rayleigh Quotient Iterations
In the following sections we review some of the basic iterations for eigenvalue and dominant pole computations based on the Rayleigh quotient. The understanding of their
properties and convergence behavior will later be useful for the subspace accelerated
methods in Chapter 4.
3.1 The standard Rayleigh Quotient Iteration
Definition 3.1:
Let A ∈ Cn×n and 0 ≠ x ∈ Cn. The Rayleigh quotient of A and x is the scalar quantity defined by

ρ(x) := ρ(x, A) := (x∗Ax)/(x∗x).        (3.1)
♦
Theorem 3.2:
The Rayleigh quotient (3.1) has the following basic and well known properties, see
[31]:
• Homogeneity: ρ(αx, βA) = βρ(x, A) for all complex scalars α, β ≠ 0.
• Translation invariance: ρ(x, A − αI) = ρ(x, A) − α for all α ∈ C.
• Boundedness: The image of ρ over all nonzero vectors x is a bounded subset of the complex plane which is called the field of values or numerical range. If A = A∗, this field of values is the real interval [λmin, λmax] bounded by the smallest and largest eigenvalues of A.
• Stationarity: For normal matrices A, the function ρ(x) is stationary in the eigenvectors of A, or equivalently, all directional derivatives vanish at the eigenvectors of A.
• Minimal residual: For all scalars µ ∈ C and vectors x ≠ 0 it holds that

‖(A − µI)x‖₂² ≥ ‖Ax‖₂² − ‖ρx‖₂².        (3.2)
♦
Proof. We restrict ourselves to the proofs of the stationarity and minimal residual property. For t ∈ R, the directional derivative of ρ(x) in the direction of the unit vector v is given by

ρ′v(x) = lim_{t→0} (1/t) [ρ(x + tv) − ρ(x)]
       = lim_{t→0} (1/t) [ ((x + tv)∗A(x + tv)) / ((x + tv)∗(x + tv)) − ρ(x) ]
       = lim_{t→0} (1/t) [ (x∗Ax + tv∗Ax + tx∗Av + t²v∗Av − (x + tv)∗ρ(x)(x + tv)) / ((x + tv)∗(x + tv)) ]
       = lim_{t→0} (1/t) [ (x∗Ax + tv∗Ax + tx∗Av + t²v∗Av − x∗Ax − tv∗ρ(x)x − tx∗ρ(x)v − t²v∗vρ(x)) / ((x + tv)∗(x + tv)) ]
       = lim_{t→0} (v∗(A − ρ(x)I)x + x∗(A − ρ(x)I)v + tv∗(A − ρ(x)I)v) / ((x∗ + tv∗)(x + tv))
       = (v∗(A − ρ(x)I)x + x∗(A − ρ(x)I)v) / (x∗x).
Obviously, ρ′v(x) vanishes for all directions v if and only if (A − ρ(x)I)x = 0 and x∗(A − ρ(x)I) = 0. Thus ρ(x) is stationary in x if x is an eigenvector of both A and A∗ with the corresponding eigenvalue ρ(x), which can only be true for normal matrices
A. Furthermore, for any complex scalar µ it holds that
‖(A − µI)x‖₂² = x∗x|µ|² − µ̄x∗Ax − µx∗A∗x + x∗A∗Ax
              = x∗x [ (µ − ρ)(µ̄ − ρ̄) − |ρ|² + (x∗A∗Ax)/(x∗x) ]
              ≥ ‖Ax‖₂² − ‖ρx‖₂²,

from which the minimal residual property (3.2) follows.
The (standard) Rayleigh quotient iteration (RQI) is shown in Algorithm 3.1 and can, for
instance, be derived by replacing the constant shift of the inverse iteration [13, Ch.
7.6.1.] by the Rayleigh quotient in every iteration. The sequence (ρk , vk ) produced by
Algorithm 3.1 is called Rayleigh sequence. Let (λ, x) be an eigenpair of A. Since the
function ρ(x) is continuous, ρk → λ if vk → x for k → ∞. It can be shown that the local
asymptotic convergence rate of the vectors vk is cubic.
Algorithm 3.1 Rayleigh quotient iteration (RQI)
Input: Matrix A, initial vector v0, tolerance ε ≪ 1.
Output: Approximate eigenpair (λ, x).
1: v0 = v0/‖v0‖2, ρ0 = ρ(v0).
2: Set k = 0.
3: while not converged do
4:    Solve (A − ρkI)vk+1 = vk for vk+1.
5:    Set vk+1 = vk+1/‖vk+1‖2.
6:    Compute next eigenvalue estimate ρk+1 = v∗k+1Avk+1 = ρ(vk+1, A).
7:    if ‖Avk+1 − ρk+1vk+1‖2 < ε then
8:       Set λ = ρk+1, x = vk+1. break
9:    end if
10:   Set k = k + 1.
11: end while
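As an illustration, a dense-matrix transcription of Algorithm 3.1 in Python/NumPy could look as follows; the function name, the iteration cap and the convergence handling are assumptions of this sketch.

    import numpy as np

    def rqi(A, v0, tol=1e-10, maxit=50):
        n = A.shape[0]
        v = v0 / np.linalg.norm(v0)
        rho = v.conj() @ A @ v                          # rho(v0)
        for _ in range(maxit):
            # step 4: (A - rho_k I) v_{k+1} = v_k; the shifted matrix becomes
            # nearly singular close to convergence, which is harmless in practice
            v = np.linalg.solve(A - rho * np.eye(n), v)
            v /= np.linalg.norm(v)                      # step 5
            rho = v.conj() @ A @ v                      # step 6
            if np.linalg.norm(A @ v - rho * v) < tol:   # step 7
                break
        return rho, v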
Theorem 3.3:
Let A ∈ Cn×n be a normal matrix with an eigenpair (λ, x). If vk → x for k → ∞, then

lim_{k→∞} ‖vk+1 − x‖ / ‖vk − x‖³ ≤ 1.
♦
For the proof we refer to [31, p. 681]. Moreover, the RQI converges for almost all starting
vectors v0 because the residuals rk are monotonically decreasing in norm.
Theorem 3.4:
Let A ∈ Cn×n be a normal matrix and rk = (A − ρ(vk )I)vk be the residual of the kth
iteration of Algorithm 3.1. Then the sequence {‖rk‖2, k = 0, 1, . . .} is monotonically decreasing for all initial vectors v0.
♦
Proof. Using the minimal residual property, step 4 of Algorithm 3.1, the Cauchy-Schwarz inequality and the normality of A yields the following sequence (ρk = ρ(vk)) of (in)equalities:

‖rk+1‖2 = ‖(A − ρk+1I)vk+1‖2 ≤ ‖(A − ρkI)vk+1‖2 = |v∗k+1(A − ρkI)∗(A − ρkI)vk+1|
        = |v∗k(A − ρkI)vk+1| ≤ ‖v∗k(A − ρkI)‖2 ‖vk+1‖2
        = ‖(A − ρkI)vk‖2 = ‖rk‖2.
The main ingredients for this nice local and global convergence are the stationarity
and minimal residual property of the Rayleigh quotient (3.1). However, for nonnormal
matrices the stationarity of the Rayleigh quotient does not hold and the asymptotic
convergence rate is in this case only quadratic at best. The minimal residual property
holds also for nonnormal matrices, but the norms of the residuals of the Rayleigh
sequence do not have to be monotonically decreasing anymore.
Therefore we examine in the subsequent sections if there are generalizations of the
standard RQI that have the stationarity property and produce sequences of monotonically decreasing residuals. It will turn out that only one of these two desired properties is fulfilled in each case.
Another drawback of the standard RQI is that if we are interested in the left eigenvectors
too, we have to apply the RQI to A∗ as well. Furthermore, we would like to use these
iterations for generalized eigenvalue problems and for the computation of dominant
triplets of SISO systems (E, A, b, c, d).
Remark 3.5:
The standard RQI can intuitively be generalized to compute eigenpairs of matrix pairs (A, E) by computing ρk+1 = (v∗k+1Avk+1)/(v∗k+1Evk+1) and solving the system (A − ρkE)vk+1 = Evk in the steps 6 and 4 of Algorithm 3.1 (cf. Algorithm 3.2 or [34]). If left eigenvectors are sought, one can apply this modified RQI to the transposed pair (A∗, E∗). If eigenvectors associated to dominant poles of a SISO system (E, A, b, c, d) are sought, it is advised to take v0 = (A − s0E)−1b and v0 = (A − s0E)−∗c∗ for some s0 ∈ C as initial vectors.
♦
A generalization of the Rayleigh quotient that computes right and left eigenvectors
simultaneously is the two-sided generalized Rayleigh quotient as described in the next
section.
3.2 The two-sided Rayleigh Quotient Iteration
According to [30, 31], the two-sided generalized Rayleigh quotient can be defined in the
following way:
Definition 3.6:
For a matrix pair (A, E) and vectors x, y with y∗Ex ≠ 0, the generalized two-sided Rayleigh quotient is defined as

ρ(x, y) := ρ(x, y, A, E) := (y∗Ax)/(y∗Ex).        (3.3)
♦
If E = I we may skip the prefix generalized and call (3.3) simply the two-sided Rayleigh quotient. Note that the expression 'generalized Rayleigh quotient' sometimes has a different meaning than the one we will use in this thesis in the context of matrix pairs
(A, E). The following theorem states that the generalized two-sided Rayleigh quotient
has, among other properties, the useful stationarity in the eigenvectors.
Theorem 3.7:
The generalized two-sided Rayleigh quotient (3.3) has the following basic properties.
• Homogeneity: ρ(αx, βy, γA, δE) = (γ/δ)ρ(x, y, A, E) for all complex scalars α, β, γ, δ ≠ 0.
• Translation invariance: ρ(x, y, A − αE, E) = ρ(x, y, A, E) − α for all α ∈ C.
• Stationarity: The directional derivatives of ρ = ρ(x, y, A, E) are zero if x and y are right and left eigenvectors corresponding to the eigenvalue ρ, provided y∗Ex ≠ 0.
♦
Proof. The properties homogeneity and translation invariance can be derived by simple manipulations, so we only prove the stationarity property here. Let for this purpose w, z be unit vectors serving as directions and ε, η ∈ R such that (y∗ + ηz∗)E(x + εw) ≠ 0. Using similar basic manipulations as in the proof of Theorem 3.2, we find for ρ = ρ(x, y)
that

ρ(x + εw, y + ηz) − ρ = ((y + ηz)∗A(x + εw))/((y + ηz)∗E(x + εw)) − ρ
  = (1/((y + ηz)∗E(x + εw))) [ (y∗Ax + ηz∗Ax + εy∗Aw + εηz∗Aw) − (y∗Ex + ηz∗Ex + εy∗Ew + εηz∗Ew)ρ ]
  = (ηz∗(A − ρE)x + εy∗(A − ρE)w + εηz∗(A − ρE)w) / ((y∗ + ηz∗)E(x + εw))

(see also [31, p. 688]). This is O(εη) for all w, z if and only if (A − ρE)x = 0 and y∗(A − ρE) = 0, from which the claim follows.
The two-sided Rayleigh quotient leads to the two-sided Rayleigh quotient iteration (2-RQI) [30, 31] illustrated in Algorithm 3.2. In steps 4 and 5 one can solve the two linear systems efficiently with only one LU-factorization LU = (A − ρkE), since U∗L∗ = (A − ρkE)∗.
The next theorem states that due to the stationarity of ρ(x, y), this iteration converges
asymptotically with a cubic rate.
Theorem 3.8:
Let (A, E) be a nondefective matrix pair with right and left eigenvectors x, y corresponding to an eigenvalue λ, scaled such that y∗Ex ≠ 0, ‖x‖ = ‖y‖ = 1. If vk → x and wk → y for k → ∞, then ρk → λ with a cubic rate of convergence.
♦
For the proof we refer to [31, p. 689] and an extension of Theorem 3.8 can be found in
[20, p. 150]. The disadvantage of this generalization of the standard RQI is that it does
not necessarily converge for all starting vectors. For instance, Algorithm 3.2 will show
a poor convergence behavior if v0 and w0 are almost right and left eigenvectors for two
different eigenvalues.
Algorithm 3.2 Two-sided Rayleigh quotient iteration (2-RQI)
Input: Matrices A, E, initial vectors v0, w0 (w∗0Ev0 ≠ 0), tolerance ε ≪ 1.
Output: Approximate eigentriplet (λ, x, y).
1: v0 = v0/‖v0‖2, w0 = w0/‖w0‖2, ρ0 = ρ(v0, w0).
2: Set k = 0.
3: while not converged do
4:    Solve (A − ρkE)vk+1 = Evk for vk+1.
5:    Solve (A − ρkE)∗wk+1 = E∗wk for wk+1.
6:    Set vk+1 = vk+1/‖vk+1‖2 and wk+1 = wk+1/‖wk+1‖2.
7:    Compute next eigenvalue estimate ρk+1 = (w∗k+1Avk+1)/(w∗k+1Evk+1).
8:    if ‖Avk+1 − ρk+1Evk+1‖2 < ε then
9:       Set λ = ρk+1, x = vk+1, y = wk+1. break
10:   end if
11:   Set k = k + 1.
12: end while
For the computation of dominant poles of a system (E, A, b, c, d), one could again use
v0 = (A − s0 E)−1 b and w0 = (A − s0 E)−∗ c∗ as initial vectors for an initial pole s0 ∈ C. In
the next section we investigate the dominant pole algorithm (DPA), which is a slight
modification of 2-RQI and focuses on the computation of dominant poles.
3.3 The Dominant Pole Algorithm
According to (2.3), the transfer function H(s) of a SISO system (E, A, b, c, d = 0) is
defined by
H(s) = c(sE − A)−1 b
and for its poles λ ∈ C it obviously holds that lim_{s→λ} |H(s)| = ∞. These poles are also the roots of the function G : C → C,

G(s) = 1/H(s),

because lim_{s→λ} G(s) = 0. The main idea behind the Dominant Pole Algorithm (DPA) [26, 41] is to use Newton's method to find these roots. Let sk be an approximation of a dominant pole, then the next approximation obtained by Newton's method is

sk+1 = sk − G(sk)/G′(sk).        (3.4)
The derivative of G(s) with respect to s is

G′(s) = −H′(s)/H²(s)

and the derivative of H(s) with respect to s is given by

H′(s) = −c(sE − A)−1E(sE − A)−1b.

Hence the Newton scheme (3.4) can be rewritten to

sk+1 = sk − G(sk)/G′(sk) = sk + (1/H(sk)) · (H²(sk)/H′(sk))
     = sk − (c(skE − A)−1b) / (c(skE − A)−1E(skE − A)−1b)
     = (sk c(skE − A)−1E(skE − A)−1b − c(skE − A)−1b) / (c(skE − A)−1E(skE − A)−1b)
     = (c(skE − A)−1[skE − (skE − A)](skE − A)−1b) / (c(skE − A)−1E(skE − A)−1b)
     = (c(skE − A)−1A(skE − A)−1b) / (c(skE − A)−1E(skE − A)−1b).

With the vectors vk := (skE − A)−1b and wk := (skE − A)−∗c∗, this can be expressed as

sk+1 = sk − (cvk)/(w∗kEvk) = (w∗kAvk)/(w∗kEvk),
where we recognize the last term as the generalized two-sided Rayleigh quotient (3.3).
The vectors wk and vk are approximations to the left and right eigenvectors corresponding to the approximate pole sk. The Dominant Pole Algorithm is illustrated in Algorithm 3.3. Observe that the main difference to Algorithm 3.2 is the fixed right hand sides in steps 3 and 4 of Algorithm 3.3, which reduce the asymptotic convergence rate to a quadratic one. This quadratic convergence rate can also be explained by the fact that the method is derived as a Newton scheme.
Theorem 3.9:
Let (A, E) be nondefective with right and left eigenvectors x, y corresponding to an eigenvalue λ and y∗Ex = 1. Then vk → x, wk → y as k → ∞ if and only if sk+1 = ρ(vk, wk) → λ with an asymptotically quadratic rate of convergence.
♦
See [36, 41] for the proof. However, as investigated in [41, Section 4-5], these fixed right
hand sides cause a better global convergence of the iterates sk towards dominant poles.
When both methods converge, DPA converges in most cases to the dominant pole close
to the shift s0 while 2-RQI approaches the pole closest to s0 which may not be dominant at
Algorithm 3.3 Dominant Pole Algorithm (DPA)
Input: System (E, A, b, c), initial pole s0, tolerance ε ≪ 1.
Output: Approximate dominant eigentriplet (λ, x, y).
1: Set k = 0.
2: while not converged do
3:    Solve (skE − A)vk = b for vk.
4:    Solve (skE − A)∗wk = c∗ for wk.
5:    Compute next pole estimate sk+1 = sk − (cvk)/(w∗kEvk).
6:    Set vk = vk/‖vk‖2 and wk = wk/‖wk‖2.
7:    if ‖Avk − sk+1Evk‖2 < ε then
8:       Set λ = sk+1, x = vk, y = wk. break
9:    end if
10:   Set k = k + 1.
11: end while
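The following Python/SciPy sketch mirrors Algorithm 3.3 for a dense SISO system; b and c are treated as one-dimensional arrays (c being the output row vector), and the two shifted solves again share one LU factorization per iteration. Names and the iteration cap are illustrative assumptions.

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def dpa(E, A, b, c, s0, tol=1e-10, maxit=100):
        s = s0
        for _ in range(maxit):
            lu, piv = lu_factor(s * E - A)
            v = lu_solve((lu, piv), b)                        # step 3
            w = lu_solve((lu, piv), c.conj(), trans=2)        # step 4: (s_k E - A)^* w = c^*
            s_next = s - (c @ v) / (w.conj() @ E @ v)         # step 5: Newton update
            v /= np.linalg.norm(v)
            w /= np.linalg.norm(w)                            # step 6
            if np.linalg.norm(A @ v - s_next * E @ v) < tol:  # step 7
                return s_next, v, w
            s = s_next
        return s, v, w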
all. DPA forms the main feature of the Subspace Accelerated Dominant Pole Algorithm
(SADPA) [39], which can be considered as subspace accelerated version of Algorithm
3.3, incorporating also restart and deflation techniques. We will investigate SADPA in
more detail in Section 4.1.
3.4 The Alternating Rayleigh Quotient Iteration
Theoretically, 2-RQI and DPA do not converge globally. Another generalization of the
RQI to ensure global convergence is the alternating application of one iteration of the
standard RQI (Algorithm 3.1) to A and one iteration to A∗ . The result is the alternating
Rayleigh quotient iteration (ARQI) by Parlett [31] which is described in Algorithm 3.4. In
every odd iteration a right eigenvector and in every even iteration a left eigenvector is
approximated. For normal matrices this method reduces to the standard Rayleigh quotient iteration. Although there is no stationarity property in this case, ARQI produces
sequences of monotonically decreasing residual norm, see [31, p. 690]:
Theorem 3.10:
For the sequences {(ρk, vk), k = 0, 1, . . .} generated by Algorithm 3.4 it holds that

‖(A − ρk+1I)vk+1‖2 ≤ ‖(A − ρk−1I)vk−1‖2,
‖v∗k+2(A − ρk+2I)‖2 ≤ ‖v∗k(A − ρkI)‖2.
Equalities hold only if ρk−1 = ρk = ρk+1 and, additionally, v∗k+1 and vk are proportional
to v∗k (A − ρk I) and (A − ρk−1 I)vk−1 , respectively.
♦
Proof. Similar to the proof of Theorem 3.4, basic algebraic manipulations and the minimal residual property (3.2) of the standard Rayleigh quotient (3.1) yield the following (in)equalities (for the odd iterations):

‖(A − ρk+1I)vk+1‖2 ≤ ‖(A − ρkI)vk+1‖2
  = |v∗k+1(A − ρkI)∗(A − ρkI)vk+1|
  = |v∗k(A − ρkI)−∗(A − ρkI)∗(A − ρkI)vk+1|
  = |v∗k(A − ρkI)vk+1|
  ≤ ‖v∗k(A − ρkI)‖2 ‖vk+1‖2
  ≤ ‖v∗k(A − ρk−1I)‖2
  = |v∗k(A − ρk−1I)vk−1|
  ≤ ‖(A − ρk−1I)vk−1‖2.

The other inequality can be shown in an analogous way (see also [31, p. 690]).
The next theorem can also be found in [31, p. 690] and states that, for nonnormal
matrices, the advantage of this alternating scheme is that it converges for all starting
vectors.
Algorithm 3.4 Alternating Rayleigh quotient iteration (ARQI)
Input: Matrix A, initial vector v0, tolerance ε ≪ 1.
Output: Approximate eigentriplet (λ, x, y).
1: v0 = v0/‖v0‖2.
2: Set k = 0.
3: while not converged do
4:    Update eigenvalue estimate ρk = v∗kAvk.
5:    Solve (A − ρkI)vk+1 = vk for vk+1.
6:    Set vk+1 = vk+1/‖vk+1‖2.
7:    Update eigenvalue estimate ρk+1 = v∗k+1Avk+1.
8:    Solve (A − ρk+1I)∗vk+2 = vk+1 for vk+2.
9:    Set vk+2 = vk+2/‖vk+2‖2.
10:   if ‖Avk+2 − ρk+1vk+2‖2 < ε then
11:      Set λ = ρk+1, x = vk+1, y = vk+2. break
12:   end if
13:   Set k = k + 2.
14: end while
Theorem 3.11:
As k → ∞ it holds for all initial vectors v0:

ρk → λ, v2k → y, v2k+1 → x,

where either (λ, x, y) is an eigentriplet of A or (τ := ‖(A − λI)x‖2, x, y) is a singular triplet of A − λI for a multiple singular value τ.
♦
For the proof see [31, p. 691], where it is also revealed that the global convergence has been bought at the price of an only linear asymptotic convergence rate with a factor close to one (1 − κ(λ)−2) for nonnormal matrices.
However, this rather slow convergence can be improved by using subspace acceleration, which leads to the alternating subspace accelerated methods in Section 4.3. Our hope is that these alternating schemes will provide us with methods that produce approximations of eigentriplets for slightly nonnormal matrices with a moderate amount of extra work compared to the one-sided iteration.
Similarly to Remark 3.5, ARQI can be modified to compute dominant poles of a SISO system (E, A, b, c) by solving the corresponding linear systems of 2-RQI in the odd and even iterations, computing ρk = (v∗kAvk)/(v∗kEvk), and by using v0 = (A − s0E)−1b (or v0 = (A − s0E)−∗c∗) as initial vector for some s0 ∈ C.
3.5 The Half-Step Rayleigh Quotient Iteration
From the previous section originates the idea of an alternating scheme which uses the two-sided Rayleigh quotient (see [45]). That is, we could always use the newest iterates in 2-RQI and update the two-sided Rayleigh quotient (3.3) after the solution of the first linear system in step 4 of Algorithm 3.2 by computing

ρk+1/2 = ρ(vk+1, wk).

This extra Rayleigh quotient is then used as shift in the second linear system, and the resulting iteration can be interpreted as a half-step scheme. An algorithmic representation of this procedure is given in Algorithm 3.5. Since the operators of the two linear systems in every iteration are now not conjugate transposes of each other, we cannot solve them with a single LU factorization. Note that this half-step scheme was originally established for methods for nonlinear eigenvalue problems based on the Rayleigh functional iteration in [45, Chapter 5.1], where also more details and generalizations can be found. There it is shown that the additional update between each half-step might improve the convergence speed, but the asymptotic convergence rate of the half-step iteration is still cubic as in the original two-sided iteration.
Algorithm 3.5 Half-step Rayleigh quotient iteration (HSRQI)
Input: Matrices E, A, initial vectors v0, w0, tolerance ε ≪ 1.
Output: Approximate eigentriplet (λ, x, y).
1: v0 = v0/‖v0‖2, w0 = w0/‖w0‖2, ρ0 = ρ(v0, w0).
2: Set k = 0.
3: while not converged do
4:    Solve (A − ρkE)vk+1 = Evk for vk+1.
5:    Update eigenvalue estimate ρk+1/2 = ρ(vk+1, wk).
6:    Solve (A − ρk+1/2E)∗wk+1 = E∗wk for wk+1.
7:    Update eigenvalue estimate ρk+1 = ρ(vk+1, wk+1).
8:    Set vk+1 = vk+1/‖vk+1‖2 and wk+1 = wk+1/‖wk+1‖2.
9:    if ‖Avk+1 − ρk+1Evk+1‖2 < ε then
10:      Set λ = ρk+1, x = vk+1, y = wk+1. break
11:   end if
12:   Set k = k + 1.
13: end while
3.6 Numerical example
To illustrate the convergence behavior of the previously discussed iterations, we apply them to two systems of small and moderate order. We use v0 := (A − s0E)−1b and w0 := (A − s0E)−∗c∗ for some s0 ∈ C as initial vectors for all two-sided iterations except RQI and ARQI, where only a single initial vector v0 or w0 is needed. Note that DPA only requires the initial shift s0. In all methods the error tolerance is set to ε = 10−8 and the linear systems are solved using the LU factorization of the shifted matrices. Our first example is the New England test system which is already known from Section 2.4.2. It is a state-space system (E = I) with a nonnormal system matrix A of order n = 66. As initial shift we take s0 = 9i. In Figure 3.1a we compare the history of the right residual norms ‖rv‖2 := ‖Avk − ρkEvk‖2 of all five methods. All iterations converge
to the eigenvalue λ ≈ −0.467 + 8.965i which is indeed the most dominant pole closest to
s0 = 9i (magnitude of residue |R| ≈ 5.3915 · 10−3 ). As expected, the convergence speed
is different in each case and with 4 iterations, HSRQI and 2-RQI are the fastest methods
for this example. The cubic convergence of 2-RQI (Theorem 3.8) is almost achieved in
iteration 3 to 4 where krv k2 ≈ 4.7 · 10−5 falls down to krv k2 ≈ 3.4 · 10−13 . The additional
Rayleigh quotient update of HSRQI leads only to a slight further acceleration. Since
this insignificant improvement comes at the price of two LU factorizations, which have
to be computed in each iteration, the additional effort does not seem to be justified at
all. DPA is the second fastest method and the (nearly) quadratic convergence (Theorem
3.9) can also be observed in the last two iterations in the drop from ‖rv‖2 ≈ 5.9 · 10−6 to ‖rv‖2 ≈ 2.2 · 10−11. Due to the nonnormality of A, the standard one-sided RQI
cannot achieve its cubic convergence (Theorem 3.3) and is even slower than DPA. In
the first two iterations we see an increase of the residual norm, illustrating nicely that
Figure 3.1: Convergence histories (residual norm versus iteration number) of RQI, 2-RQI, ARQI, DPA and HSRQI for (a) the New England test system with s0 = 9i and (b) the PEEC patch antenna model [8] with s0 = 20i.
Theorem 3.4 does not hold in the nonnormal case and hence, RQI does not always
produce monotonically decreasing residual norms. Furthermore, RQI delivers only an
approximation for the right eigenvector. For the left eigenvectors we have to apply the
method a second time to A∗ and w0 . The slowest method is by far ARQI which needs
11 iterations to converge but, as stated by Theorem 3.10, it produces a monotonically
decreasing sequence of residual norms. It also delivers an approximate left eigenvector
with ‖rw‖2 := ‖A∗w − ρkE∗w‖2 ≈ 6.8 · 10−7.
Next, we experiment with the PEEC model of a patch antenna structure¹ [8] which is a descriptor system of order n = 480 with a normal pair (A, E), but E is singular. The initial shift is s0 = 20i and the residual norms are plotted in Figure 3.1b. We see
that HSRQI and 2-RQI are again the fastest methods but now followed by RQI and
ARQI. The slowest method in this example is DPA converging in 21 iterations after a
long period of stagnation. However, what is not revealed in the residual plot is that,
although we used the same initial shift for all five methods, HSRQI and 2-RQI converged
to λ1 ≈ −0.0396 + 10.625i which is indeed the eigenvalue closest to s0 = 20i but with
|R| ≈ 2.523 · 10−6 . RQI and ARQI converged to λ2 ≈ −0.1156 with |R| ≈ 3.581 · 10−4 and
only DPA detected the most dominant pole λ3 ≈ −0.1104 + 6.632i with |R| ≈ 1.695 · 10−3 .
This example shows that it is difficult to steer the basic iterations towards a specific
target. Furthermore, since s0 = 20i is located far away from the intrinsic dominant poles,
it can be considered as ’bad’ shift and the generated initial vectors v0 and w0 might be
inappropriate. Another huge drawback of these methods is that they approximate only
one eigentriplet and that the different criteria for modal dominance (Definition 2.2) are
not incorporated. If more than one eigentriplet is sought, we could repeat them with another initial shift, but there is no guarantee that a convergence to the previously found eigentriplet does not happen again.

¹ available at http://www.icm.tu-bs.de/NICONET/benchmodred.html
To conclude, none of the five presented basic iterations is capable of computing a number of dominant poles and right and left eigenvectors of a given SISO system (E, A, b, c, d)
in a satisfactory way. In the next chapter we expand DPA, 2-RQI and ARQI by subspace
acceleration which will significantly improve the convergence behavior and will enable
us to steer the iterations towards specific targets in the spectrum Λ(A, E), for instance,
towards the dominant poles with respect to a certain dominance definition. These
subspace accelerated methods can be easily expanded by deflation techniques to obtain
several dominant eigentriplets which can then directly be used to construct reduced
order models.
4 Two-sided subspace accelerated eigenvalue methods
The basic iterations of the previous chapter compute only one eigentriplet in each
run and, as seen in the numerical example, in some cases it seems difficult to steer
the iterates towards a specific target. In this chapter we investigate three subspace
accelerated algorithms which can be used to compute a number of eigentriplets or,
more specifically, dominant poles and will have better convergence properties towards
specific targets than the basic Rayleigh quotient iterations. In all these methods, subspace acceleration stands for storing all vector iterates vk and wk produced by the Rayleigh quotient iterations and using them as basis vectors of two subspaces Vk and Wk. The original eigenproblem is then projected onto these subspaces, which leads to a usually small reduced eigenproblem. The eigenvalues of this projected eigenproblem are used
as approximations of the eigenvalues of the original eigenproblem. This procedure
will accelerate the convergence and, furthermore, it will be possible to compute more
than one approximate eigentriplet or dominant pole. If some eigentriplets have been
computed, deflation techniques can be employed to prevent a repeated computation of these already found eigentriplets. If the dimension of the subspaces, and therefore the
dimension of the projected eigenproblem, becomes too large, restart strategies can be
applied.
We begin in Section 4.1 with the subspace accelerated version of the dominant pole
algorithm (cf. Section 3.3). In Sections 4.2 and 4.3 we discuss the two-sided and alternating Jacobi-Davidson methods, which can be seen as subspace accelerated variants of the
two-sided and alternating RQI, respectively (cf. Sections 3.2 and 3.4).
For the remainder of this thesis we assume that the pair (A, E) of the system matrices
is nondefective and that the direct transmission term d is zero.
4.1 The Subspace Accelerated Dominant Pole Algorithm
The Subspace Accelerated Dominant Pole Algorithm (SADPA) was originally proposed by
J. Rommes and N. Martins in [39] and was improved further in [36, Chapter 3]. It is a
generalization of DPA, where subspace acceleration, deflation and restarts are used to
compute the most dominant poles of a scalar transfer function one by one. A schematic
overview of SADPA is illustrated in Algorithm 4.1 which is a simplified version of [39,
Algorithm 3] and [36, Algorithm 3.2].
For clarification and because some of the applied techniques easily carry over to the
Jacobi-Davidson methods, we describe in the following the major steps of Algorithm
4.1. If it does not lead to confusion, the subscript k denoting the current iteration number
will be omitted.
Algorithm 4.1 Subspace Accelerated Dominant Pole Algorithm (SADPA)
Input: System (E, A, b, c), initial pole s1, tolerance ε ≪ 1, number of wanted poles pwanted, minimum and maximum search space dimensions kmin < kmax ≪ n, dominance criterion ((2.13)–(2.15)).
Output: Approximate dominant eigentriplets (λi, xi, yi), i = 1, . . . , p.
1: Set k = 0, p = 0, Λ = X = Y = V = W = [].
2: while p < pwanted do
3:    Solve (skE − A)v = b for v.
4:    Solve (skE − A)∗w = c∗ for w.
5:    Expand V and W (bi-E-)orthogonally by v and w, respectively:
6:    V = [V, v], W = [W, w], k = k + 1.
7:    Construct interaction matrices S := W∗AV, T := W∗EV.
8:    Compute and sort eigentriplets of (S, T) with respect to the dominance criterion: (Λ̃, Q, Z) = Sort(S, T, W∗b, cV). {Algorithm 4.2}
9:    Approximate dominant eigentriplet of (A, E) is (λ̂1 := λ̃1, x̂1 := Vq1/‖Vq1‖2, ŷ1 := Wz1/‖Wz1‖2).
10:   while ‖Ax̂1 − λ̂1Ex̂1‖2 < ε do
11:      Set X = [X, x̂1], Y = [Y, ŷ1], Λ = [Λ, λ̂1], p = p + 1,
12:      bd = b − Ex̂1(ŷ∗1b), cd = c − (cx̂1)ŷ∗1E,
13:      V = VQ2:k, W = WZ2:k, S = W∗AV, T = W∗EV, k = k − 1.
14:      Set λ̂1 = λ̃2, x̂1 = v1, ŷ1 = w1.
15:   end while
16:   if k ≥ kmax then
17:      V = VQ1:kmin, W = WZ1:kmin, k = kmin.
18:      Reconstruct reduced matrices S = W∗AV, T = W∗EV.
19:   end if
20:   Set sk = λ̂1.
21: end while
Subspace expansion Each iteration of SADPA begins with the solution of the two linear systems in steps 3 and 4, which are exactly the same linear systems as in steps 3 and 4, respectively, of DPA (Algorithm 3.3). Again, an efficient solution of both linear systems is possible by computing the LU-factorization LU = skE − A and using U∗L∗ = (skE − A)∗. Note that we assume for now that we can compute the LU-factors in an efficient and inexpensive way. This might not be the case for general matrices and
hence one has to apply other techniques, such as iterative solvers for linear systems
[43]. For instance, BiCG could be applied to solve both systems simultaneously since
the involved operators are conjugate transposed to each other.
Remark 4.1:
A subspace accelerated version of 2-RQI (Algorithm 3.2) can be obtained by simply
replacing the fixed right hand sides of the linear systems in steps 3 and 4 of
Algorithm 4.1 by Ev and E∗ w, respectively. Furthermore, two initial vectors v0 and
w0 have to be inserted as in 2-RQI. This method will in the following be referred to
as Subspace Accelerated Two-sided Rayleigh Quotient Iteration (SA2RQI).
♦
After the orthogonalization via modified Gram-Schmidt (or any other stable variant
of Gram-Schmidt), the iterates v and w are added in step 5 as new basis vectors to the
search spaces V and W, respectively. All basis vectors for V are stored in the basis matrix
V, which is therefore expanded by v as new column in each iteration. Likewise, the basis
matrix W for W receives w as new column. Here it is possible to scale and normalize
both search spaces orthogonally so that V ∗ V = W ∗ W = I holds, or bi-E-orthogonally
which yields W ∗ EV = I. The latter requires a bi-E-orthogonal Gram-Schmidt process
[32].
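As a sketch of the expansion step with orthogonal scaling (V∗V = W∗W = I), the new iterate can be orthogonalized against the current basis by modified Gram-Schmidt with one reorthogonalization pass, as below (Python/NumPy; the helper name is an assumption of this sketch). The bi-E-orthogonal variant proceeds analogously but with E-inner products against the respective other basis.

    import numpy as np

    def expand_orthogonally(V, v_new):
        # modified Gram-Schmidt with one reorthogonalization pass
        v = v_new.astype(complex)
        for _ in range(2):
            for j in range(V.shape[1]):
                v = v - (V[:, j].conj() @ v) * V[:, j]
        nrm = np.linalg.norm(v)
        if nrm <= 1e-14 * np.linalg.norm(v_new):
            raise RuntimeError("new direction already (numerically) contained in span(V)")
        return np.column_stack([V, v / nrm])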
Eigenvalue selection In step 7 the two interaction matrices S ≡ Sk and T ≡ Tk of the reduced eigenproblem are constructed according to an imposed two-sided Petrov-Galerkin approach.
Let (λ̃, x̂ := Vq, ŷ := Wz) be an approximate eigentriplet with coefficient vectors q, z ∈ Ck. The two-sided Petrov-Galerkin condition is given by

rv := AVq − λ̃EVq ⊥ W,
rw := A∗Wz − λ̃E∗Wz ⊥ V.

From this it follows that (λ̃, q, z) is an eigentriplet of the equivalent reduced eigenproblems handled in step 8,

W∗AVq = λ̃W∗EVq,    z∗W∗AV = λ̃z∗W∗EV,        (4.1)

with the reduced matrices S ≡ Sk = W∗AV and T ≡ Tk = W∗EV. The approximate eigentriplets (λ̃, x̂, ŷ) satisfying (4.1) are called (two-sided) Petrov triplets of (S, T) with respect to V and W.
Algorithm 4.2 (Λ̃, Q, Z) = Sort(S, T, b, c)
Input: Interaction matrices S, T ∈ Ck×k, input and output vectors b, c∗ ∈ Ck.
Output: Λ̃ ∈ Ck×k diagonal with poles λ̃i in residue order, Q, Z ∈ Ck×k corresponding eigenvector matrices.
1: Compute eigendecomposition of the pair (S, T):

   SQ = TQΛ̃,   Z∗S = Λ̃Z∗T,   Z∗TQ = I,        (4.2)

   with Λ̃ = diag(λ̃1, . . . , λ̃k), Q = [q1, . . . , qk], Z = [z1, . . . , zk].
2: Construct approximate residues Ri = (cqi)(z∗ib).
3: Sort Λ̃, Q, Z decreasingly with respect to a dominance measurement ((2.13)–(2.15)).
Note that the matrices Sk, Tk can be efficiently constructed in each iteration by using the matrices Sk−1, Tk−1, AVk−1, EVk−1, W∗k−1A and W∗k−1E of the previous iteration and adding only one new column and row:

Sk = W∗kAVk = [Wk−1, wk]∗A[Vk−1, vk] = [Sk−1, W∗k−1Avk; w∗kAVk−1, w∗kAvk],
Tk = W∗kEVk = [Wk−1, wk]∗E[Vk−1, vk] = [Tk−1, W∗k−1Evk; w∗kEVk−1, w∗kEvk],

where vk ≡ v, wk ≡ w are the new basis vectors for V, W (steps 3–4). Since the pair (S, T)
is of small dimension k ≪ n, (4.1) can be solved by using full space methods, for example
the QR- or the QZ-method. Note that, if V, W are bi-E-orthogonal, (4.1) transforms to
standard eigenvalue problems. The computation and sorting of the eigentriplets and
corresponding residues is illustrated in Algorithm 4.2. If the eigentriplets (λ̃i, qi, zi) (i = 1, . . . , k) of the pair (S, T) are scaled such that z∗iTqi = ŷ∗iEx̂i = 1, it follows that the residues R̂i can be computed without the explicit computation of all approximate eigenvectors x̂i = Vqi, ŷi = Wzi, since

R̂i = (cx̂i)(ŷ∗ib) = ((cV)qi)(z∗i(W∗b)).

Compare step 2 of Algorithm 4.2 on this matter. However, this might be numerically unstable (see for instance the numerical experiments in [36, Section 3.7]) and one might use the scaling ‖x̂i‖2 = ‖ŷi‖2 = 1 instead and compute the residues as a product involving the angles ∠(x̂i, c∗) and ∠(ŷi, b). If (λ̃1, q1, z1) is the most dominant eigentriplet of (4.1), then the most dominant approximate eigentriplet (Petrov triplet) of (A, E) with respect to V and W is

λ̂1 = λ̃1,   x̂1 := Vq1/‖Vq1‖2,   ŷ1 := Wz1/‖Wz1‖2,
which is the eigentriplet SADPA selects in step 9. That is, the pole with the largest
(scaled) residue magnitude is extracted. Since the computed triplets (λ̃i , qi , zi ) have
been sorted decreasingly in Algorithm 4.2 in the chosen residue order, it is sufficient to
take q1 and z1 as the first Schur vectors of a generalized Schur form of (S, T), instead of
taking the corresponding vectors of the generalized eigendecomposition (4.2).
The new shift for the next iteration of SADPA is sk+1 = λ̂1 in step 20.
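A possible realization of the selection step is sketched below in Python/SciPy (the helper name is illustrative; the dominance measure shown is the scaled residue magnitude |R̂i|/|λ̃i| used in Table 2.1, and any of the criteria (2.13)–(2.15) could be substituted). The reduced vectors W∗b and cV are passed in directly.

    import numpy as np
    from scipy.linalg import eig

    def sort_dominant(S, T, b_red, c_red):
        # generalized eigendecomposition of the reduced pair (S, T), cf. (4.2)
        lam, Z, Q = eig(S, T, left=True, right=True)
        for i in range(len(lam)):
            Q[:, i] /= Z[:, i].conj() @ T @ Q[:, i]        # scale so that z_i^* T q_i = 1
        residues = (c_red @ Q) * (Z.conj().T @ b_red)      # R_i = (c V q_i)(z_i^* W^* b)
        dominance = np.abs(residues) / np.abs(lam)         # scaled residues |R|/|lambda|
        order = np.argsort(-dominance)
        return lam[order], Q[:, order], Z[:, order]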
Deflation However, if the convergence test in step 10 succeeds, then the found eigentriplet (λ̂1, x̂1, ŷ1) is deflated from the data and, similar to the strategy in JDQR and JDQZ [12, Algorithm 2], SADPA continues with the remaining k − 1 right and left Petrov vectors VQ2:k, WZ2:k as basis vectors in step 13. This can also be considered as an implicit restart with k − 1 vectors. Because the current search spaces might contain more than one accurate eigentriplet, the convergence test is repeated with the second best eigentriplet
approximation in step 14.
A common way to prevent a repeated computation of the already converged triplets is
to apply spectral operators to the matrices. Assume in the following that the columns
of the n × p matrices X and Y contain the already found right and left eigenvectors of
the pair (A, E) corresponding to p found dominant eigenvalues λi , (i = 1, . . . , p) of the
system (E, A, b, c). We may furthermore assume that X and Y are practically scaled
such that Y∗ AX = Λ = diag(λ1 , . . . , λp ) and Y∗ EX = Ip . The goal is to expand this partial
eigendecomposition by a new triplet (λ = λp+1 , x, y) such that
[Y, y]∗ A[X, x] = diag(Λ, λ), [Y, y]∗ E[X, x] = Ip+1 .
With some basic manipulations we find that (λ, x, y) has to be an eigentriplet of the
transformed pencil [12]
(I − EXY∗ )(A − λE)(I − XY∗ E).
(4.3)
Moreover, the system (Ed , Ad , bd , cd ) with
Ed := (I − EXY∗ )E(I − XY∗ E),
Ad := (I − EXY∗ )A(I − XY∗ E),
bd := (I − EXY∗ )b,
cd := c(I − XY∗ E),
has then the same eigenvectors as the original system. But now the found eigenvalues
λi (i = 1, . . . , p) and the corresponding residues Ri (i = 1, . . . , p) are transformed to zero.
The remaining eigenvalues λp+1 , . . . , λn and the corresponding residues do not change
with this transformation. Hence, deflated poles do not appear as dominant poles any
longer. This can be used to avoid that SADPA computes the λi and the corresponding
eigenvectors again by applying the algorithm to the deflated system (Ed , Ad , bd , cd ).
However, the following theorem adapted from [36, Section 3.3.1] states that it is sufficient
to work with the system (E, A, bd , cd ), where only the input and output vectors b and c
are transformed.
Theorem 4.2:
Let Hd(s) = cd(sE − A)−1bd be the transformed transfer function with
bd := (I − EXY∗ )b and cd := c(I − XY∗ E),
(4.4)
X := [x1 , . . . , xp ], Y := [y1 , . . . , yp ] ∈ Cn×p , Y∗ EX = Ip . Then Hd (s) has the same poles λi
and corresponding residues Ri as the original transfer function H(s) = c(sE − A)−1 b,
but the residues Ri which correspond to the found poles λi are transformed to Ri = 0
(i = 1, . . . , p).
♦
Proof. Let (λ, x, y) be an eigentriplet of (A, E) such that x ∈ {x1, . . . , xp} and y ∈ {y1, . . . , yp}. The corresponding residue is R = (cx)(y∗b). Obviously, it holds that X(Y∗Ex) = x, (y∗EX)Y∗ = y∗, and thus

Y∗bd = Y∗(I − EXY∗)b = 0   and   cdX = c(I − XY∗E)X = 0.

Hence, the corresponding residue of Hd(s) is

Rd = (cdx)(y∗bd) = (cdX(Y∗Ex))(((y∗EX)Y∗)bd) = 0.

However, if (λ, x, y) is not deflated from H(s), that is x ∉ {x1, . . . , xp} and y ∉ {y1, . . . , yp}, it holds that Y∗Ex = (y∗EX)∗ = 0 and thus

Rd = (cdx)(y∗bd) = (c(I − XY∗E)x)(y∗(I − EXY∗)b) = (cx)(y∗b) = R.

Because the pair (A, E) is left unchanged, so are its eigentriplets.
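In code, the transformation (4.4) of the input and output vectors amounts to two projections. A minimal NumPy sketch (assuming the found right and left eigenvectors are stored columnwise in X, Y with Y∗EX = Ip; the helper name is illustrative) is:

    import numpy as np

    def deflate_io_vectors(b, c, X, Y, E):
        # b_d = (I - E X Y^*) b,  c_d = c (I - X Y^* E), cf. (4.4)
        bd = b - E @ (X @ (Y.conj().T @ b))
        cd = c - ((c @ X) @ Y.conj().T) @ E
        return bd, cd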
Since the residues of the deflated transfer function Hd(s) corresponding to the found poles are transformed to zero, these poles are no longer dominant and will not be computed again by SADPA [36, Corollary 3.3.2]. Moreover, one can show that (subspace accelerated) DPA applied to the systems (Ed, Ad, bd, cd) and (E, A, bd, cd) produces (in exact arithmetic) the same results [36, Theorem 3.3.3].
Usually, the matrices and vectors of a SISO system (E, A, b, c) are real and it is therefore practical to deflate the complex conjugate triplet (λ̄, x̄, ȳ) automatically after the triplet (λ, x, y) has been computed. We illustrate the effects of this deflation strategy
with the FOM model¹ [8] of order n = 1006. Figure 4.1 shows the Bode magnitude plot of its original transfer function with the three distinct peaks and of three transfer function modal equivalents Hi(s) where the dominant poles pi, (i = 1, 2, 3) are deflated via (4.4). It can be observed that the peak close to the imaginary part (indicated by the vertical dashed lines) of the deflated dominant pole is flattened. In the fourth modal equivalent H4(s), all of the three previous poles are removed and hence H4(s) shows no more peaks. This cheap deflation is used in step 12 of Algorithm 4.1. However, e.g. due to rounding errors, it is possible that the subspaces V, W may still contain directions towards the already found eigenvectors, which can induce numerical instabilities.
¹ available at http://www.icm.tu-bs.de/NICONET/benchmodred.html
Figure 4.1: Bode magnitude plot (gain in dB versus frequency in rad/sec) of the original transfer function of the FOM model [8], and modal equivalents Hi where the dominant pole pi for i = 1, 2, 3 is deflated. The dominant poles are p1 = −1 ± 100i, p2 = −1 ± 200i and p3 = −1 ± 400i. H4 shows the result when all three poles are removed. The vertical dashed lines mark Im(pj) for j = 1, 2, 3.
To get rid of these contributions, an additional reorthogonalization against all previously found eigenvectors can be applied. That is, the search spaces are (bi-E-)orthogonally expanded with
vj = ∏_{l=1}^{p} ( I − (xly∗lE)/(y∗lExl) ) vj   and   wj = ∏_{l=1}^{p} ( I − (ylx∗lE∗)/(x∗lE∗yl) ) wj,   j = 1, . . . , k,        (4.5)
where k denotes the current subspace dimension. It is also possible to invoke this
reorthogonalization only against the most recently found eigentriplet (λp , xp , yp ). See
[36, Algorithms 3.4 and 3.5] for more details of the deflation strategy in SADPA.
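A direct transcription of (4.5) in Python/NumPy might look as follows (illustrative helper; the resulting columns still have to be re-(bi-E-)orthogonalized afterwards, as noted above):

    import numpy as np

    def purge_found_directions(V, W, X, Y, E):
        # apply the oblique projections of (4.5) column-wise to V and W
        for l in range(X.shape[1]):
            x, y = X[:, l], Y[:, l]
            Ex, Ey = E @ x, E.conj().T @ y               # E x_l and E^* y_l
            V = V - np.outer(x, Ey.conj() @ V) / (Ey.conj() @ x)   # (I - x_l y_l^* E / (y_l^* E x_l)) V
            W = W - np.outer(y, Ex.conj() @ W) / (Ex.conj() @ y)   # (I - y_l x_l^* E^* / (x_l^* E^* y_l)) W
        return V, W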
Restarts If it happens that the dimension k of the search spaces exceeds a given number kmax, a restart is invoked in steps 16–19 of Algorithm 4.1, where from both search spaces the kmin < kmax Petrov vectors corresponding to the kmin most dominant eigentriplets of the pair (S, T) are kept. The other remaining vectors are simply neglected
and SADPA continues with search spaces of dimension kmin. For numerical stability it appears reasonable to deflate these reduced search spaces against the found eigenvectors via (4.5) and to reorthogonalize them afterwards.
Further enhancements As the estimate sk of SADPA converges towards a dominant
pole, it is likely that the solutions v and w of the linear systems in step 3 and 4 of
Algorithm 4.1 are more accurate approximations of eigenvectors than the ones obtained
in the selection procedure in step 9. Hence, the most accurate of both should be taken
in the deflation phase, for instance the eigentriplet with the smallest residual norm.
A further improvement of the accuracy of the computed eigenvector approximations
can be achieved by applying a few steps of DPA (Algorithm 3.3) or two-sided RQI
(Algorithm 3.2), if the current approximate eigentriplet (λ̂, x̂, ŷ) in step 9 satisfies ‖r‖2 = ‖Ax̂ − λ̂Ex̂‖2 ≤ εRQI, where εRQI is an additional error tolerance practically greater than ε. Since the vectors x̂, ŷ are already very close to the eigenvectors and λ̂ ≈ λ, a very small number of iterations of DPA or 2-RQI will normally be sufficient, and deflation of the found eigentriplets is not necessary because of the low risk of converging to some pole µ ≠ λ. Switching to 2-RQI can also help to avoid stagnation in the final phase of
convergence [36, Chapter 3.3.2]. This stagnation occurs because, as the approximate
pole sk converges towards an eigenvalue λ, the solutions v and w will possibly make a
very small angle with the search spaces. But since v and w are designated to expand the
search spaces, this can induce numerical instabilities in the orthogonalization phase.
In the next section we investigate the two-sided Jacobi-Davidson algorithm that can
also be used for the computation of dominant eigentriplets.
4.2 The two-sided Jacobi-Davidson algorithm
According to Remark 4.1, a subspace accelerated two-sided RQI (SA2RQI) can be easily obtained by replacing the fixed right hand sides of SADPA (Alg. 4.1). Another more sophisticated subspace accelerated algorithm is the two-sided Jacobi-Davidson algorithm (2-JD), where also new linear systems are introduced. The 2-JD was originally
proposed by A. Stathopoulos in [50] and M. E. Hochstenbach in [20] and it can be seen
as a generalization of the one-sided Jacobi-Davidson methods [12, 48].
4.2.1 The new correction equations
We begin with the derivation of the correction equations which replace the linear systems
of SADPA and SA2RQI, and which form the main difference to these methods. Let V
and W be two k-dimensional subspaces with basis vectors v1 , . . . , vk and w1 , . . . , wk ,
respectively, which are also the columns of the matrices V, W ∈ Cn×k . As in SADPA, V
and W will serve as right and left search spaces. In the sequel we may assume that V and
W are orthogonal or bi-E-orthogonal, that is, either V ∗ V = W ∗ W = I or W ∗ EV = I holds.
Suppose we have approximations v ∈ V and w ∈ W for a right and left eigenvector and want to use the two-sided generalized Rayleigh quotient (3.3),

ρ(v, w, A, E) = (w∗Av)/(w∗Ev) =: θ,

as approximation for the corresponding eigenvalue. For this reason we impose, similarly to SADPA, a two-sided Petrov-Galerkin condition on the right and left residuals rv and rw, respectively:

rv := Av − θEv ⊥ W,
rw := A∗w − θE∗w ⊥ V.        (4.6)
With two coefficient vectors q, z ∈ Ck we write v = Vq and w = Wz whereby (4.6)
transforms to
AVq − θEVq ⊥ W,
A∗ Wz − θE∗ Wz ⊥ V,
or equivalently, (θ, q, z) satisfies the reduced eigenvalue problems

W∗AVq = θW∗EVq,
V∗A∗Wz = θV∗E∗Wz.        (4.7)
Since S, T ∈ Ck×k with k ≪ n, standard full space methods can be used for the computation of the eigentriplets. If we are interested in dominant poles of a SISO system, we have to provide additionally the input and output vectors b and c and use Algorithm 4.2 for this purpose as in SADPA. Furthermore, v = Vq and w = Wz are again right and left Petrov vectors with respect to the search spaces V and W. We derive the correction equations in a way slightly different from the ones in [20, 50], by using a Newton style formulation similar to the approach for the one-sided JD methods [11, 47]. We note that the two-sided Jacobi-Davidson method is, to the author's knowledge, not derived in this way as an accelerated Newton method in the literature. There, the term accelerated refers to the incorporation of subspace acceleration.
Suppose we have computed such right and left eigenvector approximations (v, w) and
want to find a better approximation
(v+ , w+ ) := (v + s, w + t)
(4.8)
with corrections s, t ∈ Cn . In the two-sided Jacobi-Davidson method, the vectors s and
t are the solutions of two correction equations. One way to obtain those equations is to
apply a Newton scheme to the function F : C2n+2 7→ C2n+2 :


 Av − θEv 
 ∗

A w − ηE∗ w

F(θ, η, v, w) := 
 f ∗ v − 1 
 ∗

g w−1
where we also used the scalars θ, η ∈ C which refer to approximations for λ and λ̄, respectively. The last two rows represent suitable scaling conditions for the normalization of the approximate eigenvectors using two scaling vectors f, g ∈ Cn whose choices will be discussed later. Obviously, if (λ, x, y) is an exact eigentriplet with f∗x = g∗y = 1, then the quadruplet (λ, λ̄, x, y) is a root of F. Applying a Newton scheme to the function above will also yield corrections µr and µl for θ and η. That is, we compute a new approximation
(θ, η, v+ , w+ ) := (θ + µr , η + µl , v + s, w + t).
The Newton step for the new approximations (4.8) is then given by
∂F(θ, η, v, w)(µr, µl, sT, tT)T + F(θ, η, v, w) = 0,        (4.9)
where ∂F denotes the Jacobian of F. Inserting the Jacobian

∂F(θ, η, v, w) = [ −Ev    0       A − θE    0        ]
                 [  0     −E∗w    0         A∗ − ηE∗ ]
                 [  0     0       f∗        0        ]
                 [  0     0       0         g∗       ]

and F at the point (θ, η, v, w) into (4.9) leads to

[ −Ev    0       A − θE    0        ] [ µr ]   [ −rv     ]
[  0     −E∗w    0         A∗ − ηE∗ ] [ µl ] = [ −rw     ]
[  0     0       f∗        0        ] [ s  ]   [ 1 − f∗v ]
[  0     0       0         g∗       ] [ t  ]   [ 1 − g∗w ].        (4.10)
The first block row of (4.10) is
−µrEv + (A − θE)s = −rv        (4.11)
and by the Petrov-Galerkin condition (4.6), a multiplication with w∗ from the left yields
−µr w∗ Ev + w∗ (A − θE)s = −w∗ rv = 0.
Clearly it holds that
µr = (w∗ (A − θE)s)/(w∗ Ev)
which can be inserted back into (4.11) to obtain
−(Evw∗)/(w∗Ev) (A − θE)s + (A − θE)s = −rv.

A similar expression can be found for the second block row of (4.10) and after some basic manipulations we get the two correction equations

( I − (Evw∗)/(w∗Ev) ) (A − θE) s = −rv,
( I − (E∗wv∗)/(v∗E∗w) ) (A − θE)∗ t = −rw,        (4.12)
where we inserted η = ρ(A∗ , E∗ , w, v) = ρ(A, E, v, w) = θ into the second equation
because of the underlying two-sided Petrov-Galerkin projection. Note that this relation
between θ and η might not hold for general two-sided subspace extraction approaches
([16], see also Section 5.1).
In the sequel we discuss two choices for the scaling vectors f and g. If we assume that
the Petrov vectors v, w are scaled such that v∗ f = w∗ g = 1 holds during the iteration,
it follows from the third and fourth row of (4.10) that f ∗ s = g∗ t = 0 or, equivalently,
s ⊥ f, t ⊥ g. In other words, s and t are taken from the orthogonal complements of f
and g, respectively. Hence, (I − v f ∗ /( f ∗ v))s = s and (I − wg∗ /(g∗ w))t = t holds and the
correction equations in (4.12) become
( I − (Evw∗)/(w∗Ev) ) (A − θE) ( I − (vf∗)/(f∗v) ) s = −rv,   s ∈ f⊥,        (4.13)
( I − (E∗wv∗)/(v∗E∗w) ) (A − θE)∗ ( I − (wg∗)/(g∗w) ) t = −rw,   t ∈ g⊥.        (4.14)
The choice of f and g is associated with the scaling of the search spaces V and W. As
in SADPA, the corresponding basis matrices V and W can for instance be scaled bi-E-orthogonally, which seems to be the natural choice since right and left eigenvectors are also bi-E-orthogonal. Furthermore, W∗EV = I transforms the reduced eigenproblems (4.7) to standard ones. In this case obvious choices for the scaling vectors are f = E∗w
and g = Ev such that we get the orthogonality conditions s ⊥ E∗ w and t ⊥ Ev for the
corrections. This yields the bi-E-orthogonal correction equations
( I − (Evw∗)/(w∗Ev) ) (A − θE) ( I − (vw∗E)/(w∗Ev) ) s = −rv,   s ∈ (E∗w)⊥,        (4.15)
( I − (E∗wv∗)/(v∗E∗w) ) (A − θE)∗ ( I − (wv∗E∗)/(v∗E∗w) ) t = −rw,   t ∈ (Ev)⊥.        (4.16)
Note that using bi-E-orthogonal search spaces has the disadvantage that the E-inner products are more expensive to evaluate and they might even not be well defined, so that this scaling might introduce some numerical instabilities. However, the operator in
(4.15) is the conjugate transpose of the operator in (4.16), which might be useful when
applying the bi-conjugate gradient method (BiCG) [43].
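For approximate solves, the projected operator of, e.g., the right equation (4.15) need not be formed explicitly; a sketch using SciPy's LinearOperator, to which a few GMRES or BiCG steps can then be applied, is given below (helper name and iteration counts are illustrative assumptions of this sketch).

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def right_correction_operator(A, E, theta, v, w):
        # (I - E v w^*/(w^* E v)) (A - theta E) (I - v w^* E/(w^* E v)), cf. (4.15)
        n = A.shape[0]
        Ev, Ew = E @ v, E.conj().T @ w
        denom = w.conj() @ Ev

        def matvec(s):
            s = s - v * (Ew.conj() @ s) / denom        # right projection
            s = A @ s - theta * (E @ s)                # shifted operator
            return s - Ev * (w.conj() @ s) / denom     # left projection
        return LinearOperator((n, n), matvec=matvec, dtype=complex)

    # op = right_correction_operator(A, E, theta, v, w)
    # s, info = gmres(op, -(A @ v - theta * (E @ v)), maxiter=10)   # a few inexact steps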
If the columns of both V and W are kept orthogonal such that V ∗ V = W ∗ W = I, the
scaling vectors are f = v and g = w. We look in this case for corrections s ⊥ v, t ⊥ w and
the corresponding orthogonal correction equations become
( I − (Evw∗)/(w∗Ev) ) (A − θE) ( I − (vv∗)/(v∗v) ) s = −rv,   s ∈ v⊥,        (4.17)
( I − (E∗wv∗)/(v∗E∗w) ) (A − θE)∗ ( I − (ww∗)/(w∗w) ) t = −rw,   t ∈ w⊥.        (4.18)
In both variants, the solutions s and t of the correction equations are (bi-E-)orthogonalized with respect to the search spaces V and W, respectively, and added as the (k + 1)th basis vectors in the next iteration. This orthogonalization should be carried out with a stable Gram-Schmidt variant, for instance modified Gram-Schmidt with iterative refinement [3, Algorithm 4.14] for orthogonal search spaces; for bi-E-orthogonal search spaces an appropriate variant has to be included [32].
Algorithm 4.3 Basic bi-E-orthogonal two-sided Jacobi-Davidson algorithm
Input: Matrix pair (A, E), initial vectors v0, w0 (w∗0Ev0 ≠ 0), tolerance ε ≪ 1.
Output: Approximate eigentriplet (λ, x, y) with min(‖Ax − λEx‖2, ‖A∗y − λE∗y‖2) ≤ ε.
1: Set s = v0, t = w0, V = W = [].
2: for i = 1, 2, . . . do
3:    Expand V and W bi-E-orthogonally by v := s and w := t, respectively:
4:    V = [V, v], W = [W, w].
5:    Construct reduced matrices S = W∗AV, T = W∗EV.
6:    Compute a 'suitable' eigentriplet (θ, q, z) of (S, T).
7:    Compute Petrov vectors v = Vq/‖Vq‖2, w = Wz/‖Wz‖2, (θ = ρ(v, w, A, E)).
8:    Set rv = Av − θEv, rw = A∗w − θE∗w.
9:    if min(‖rv‖2, ‖rw‖2) ≤ ε then
10:      Improve second vector at will.
11:      Set x = v, y = w, λ = θ and stop iteration.
12:   end if
13:   Find (approximate) solutions s ⊥ E∗w, t ⊥ Ev of
         ( I − (Evw∗)/(w∗Ev) ) (A − θE) ( I − (vw∗E)/(w∗Ev) ) s = −rv,
         ( I − (E∗wv∗)/(v∗E∗w) ) (A − θE)∗ ( I − (wv∗E∗)/(v∗E∗w) ) t = −rw.
14: end for
A basic two-sided Jacobi-Davidson algorithm for the computation of one eigentriplet
using bi-E-orthogonal search spaces is illustrated in Algorithm 4.3. The eigentriplet
(θ, q, z) of the reduced eigenproblem in step 6 should be chosen appropriately, for
instance the one closest to a specified target τ ∈ C. Otherwise, there is the unpleasant
possibility that the algorithm tries to converge towards an infinite eigenvalue if E is singular. We are in this thesis mainly interested in the computation of dominant poles of a SISO system (E, A, b, c), which in general do not belong to the infinite eigenvalues. By
towards an infinite eigenvalue is very improbable.
Note that the convergence test in step 9 is successful if only one of the two residual norms drops below the tolerance ε. That is, only one of the two approximate eigenvectors has
the desired accuracy. However, the accuracy of the other eigenvector can be improved
easily. For instance, assume that ‖rv‖2 ≤ ε and ‖rw‖2 > ε. To get a better left eigenvector approximation w, one could solve the linear system

(A − θE)∗w̃ = rw

using, e.g., a few steps of an iterative solver like GMRES. Afterwards, the improved left eigenvector approximation can be obtained by computing w := (w̃ − w)/‖w̃ − w‖2.
A similar procedure can be applied for improving the right eigenvector approximation
when the left one has converged. If exact solutions of the linear systems are available,
inverse iteration or 2-RQI (Algorithm 3.2) can also be applied here.
The next theorem sheds some light onto the convergence behavior of 2-JD and is a result
from [20, Theorems 4.1 and 7.1].
Theorem 4.3:
If the correction equations in (4.12) are solved exactly, then 2-JD with bi-E-orthogonal
and orthogonal search spaces converges asymptotically cubically to an eigenvalue
λ, if its algebraic multiplicity (as defined in Section 2.1.1) is α(λ) = 1.
♦
Proof. Consider for this purpose the right correction equation (4.15) with bi-E-orthogonal search spaces (step 13 in Algorithm 4.3). By using (I − vw∗E/(w∗Ev))s = s it can be rearranged to

(A − θE)s = −rv + (Evw∗)/(w∗Ev) (A − θE)s = −rv + Ev (w∗As)/(w∗Ev),

where we also have used w∗Es = 0. Defining µr := (w∗As)/(w∗Ev), a multiplication with the inverse of A − θE yields

s = −v + µr(A − θE)−1Ev.        (4.19)

For the left correction equation (4.16) we obtain in an analogous way

t = −w + µl(A − θE)−∗E∗w        (4.20)

with µl := (v∗At)/(v∗E∗w), and in both expressions we recognize the updates v + s and w + t as multiples of the ones obtained by one iteration of 2-RQI (cf. Section 3.2). Similar expressions can be derived in the case of orthogonal search spaces. The cubic convergence follows in all variants from Theorem 3.8.
For the sake of completeness, we analyze equations (4.19) and (4.20) in more detail.
Unfortunately, since we do not know s and t, we do not know the scalar quantities µr and
µl , but this can be circumvented by using the orthogonality relations w∗ Es = v∗ E∗ t = 0
again:
0 = w∗ Es = −w∗ Ev + µr w∗ E(A − θE)−1 Ev,
0 = v∗ E∗ t = −v∗ E∗ w + µl v∗ E∗ (A − θE)−∗ E∗ w.
Algorithm 4.4 Efficient exact solution of the correction equations of Algorithm 4.3 (bi-E-orthogonal 2-JD)
Input: Matrix pair (A, E), approximate eigentriplet (θ, v, w).
Output: Correction vectors s ∈ (E∗w)⊥, t ∈ (Ev)⊥ as solutions of (4.15), (4.16).
1: Set v̂ = Ev, ŵ = E∗w.
2: Compute ŝ = (A − θE)−1v̂, t̂ = (A − θE)−∗ŵ.
3: Set µr = (ŵ∗v)/(ŵ∗ŝ), µl = (v̂∗w)/(v̂∗t̂).
4: Set s = −v + µrŝ, t = −w + µlt̂.
This holds if

µr = (w∗Ev)/(w∗E(A − θE)−1Ev)   and   µl = (v∗E∗w)/(v∗E∗(A − θE)−∗E∗w).
Both expressions direct us to an efficient way to solve the correction equations exactly
without constructing the projected operators explicitly which is shown in Algorithm 4.4.
The procedure for orthogonal search spaces is very much alike. Note that the situation
will change drastically if the correction equations are solved only approximately, e.g., by
a few steps of an iterative method for linear systems. These issues will be investigated
in Section 4.2.3.
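For the exact case, Algorithm 4.4 translates almost literally into code; a dense Python/SciPy sketch (function name illustrative, one LU factorization shared by both solves) reads:

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def solve_corrections_exactly(A, E, theta, v, w):
        v_hat = E @ v                                    # step 1
        w_hat = E.conj().T @ w
        lu, piv = lu_factor(A - theta * E)
        s_hat = lu_solve((lu, piv), v_hat)               # step 2: (A - theta E)^{-1} E v
        t_hat = lu_solve((lu, piv), w_hat, trans=2)      #         (A - theta E)^{-*} E^* w
        mu_r = (w_hat.conj() @ v) / (w_hat.conj() @ s_hat)   # step 3
        mu_l = (v_hat.conj() @ w) / (v_hat.conj() @ t_hat)
        return -v + mu_r * s_hat, -w + mu_l * t_hat      # step 4: s, t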
Under some assumptions, SADPA, SA2RQI and 2-JD are equivalent in the sense that
the produced search spaces are the same.
Theorem 4.4:
For a given s0 ∈ C, let v0 := (A − s0E)−1b and w0 := (A − s0E)−∗c∗. Then SADPA(A, E, b, c, s0),
SA2RQI(A, E, v0 , w0 ) and 2−JD(A, E, v0 , w0 ) are equivalent provided that the linear
systems in all three methods are solved exactly.
♦
Proof. See [36, Section 3.4.2] for the equivalence of SADPA with SA2RQI and 2-JD. The
equivalence of SA2RQI and 2-JD can be concluded from Theorem 4.3 and can also be
found in [20].
Note that SADPA starts with empty search spaces while 2-JD and SA2RQI begin with
one-dimensional search spaces because of the inserted initial vectors v0 and w0. This
equivalence can be observed in numerical experiments, but due to rounding errors, it
often vanishes after some poles have been found. Note that if the linear systems are
not solved exactly, there is no equivalence of SA2RQI and 2-JD with SADPA. However,
in [20, Proposition 5.5] it is shown that inexact SA2RQI and 2-JD are equivalent if the
linear systems occurring in both methods are solved with m + 1 and m steps of BiCG,
respectively.
4.2.2 Computing more than one eigentriplet
Since our main goal is to compute more than one dominant eigentriplet of a given system
(E, A, b, c), eigenvalue selection, deflation and restarting strategies can be included
into the two-sided Jacobi-Davidson method similarly to SADPA. A complete two-sided
Jacobi-Davidson algorithm for the computation of pwanted dominant poles with all of
these extensions using bi-E-orthogonal search spaces is illustrated in Algorithm 4.5. In
the sequel, we describe briefly the main parts of this algorithm.
Eigenvalue selection The eigenvalue selection of 2-JD begins in step 6 and follows
exactly the same scheme as SADPA, that is, Algorithm 4.2 can be used to compute a
generalized eigenvalue decomposition of the reduced pair (S, T) and order it decreasingly with respect to one of the dominance measurements defined by (2.13)–(2.15). The
most dominant Petrov triplet of (A, E) (with respect to V, W) is again obtained by
(θ = λ̃1 , v := Vq1 /kVq1 k2 , w := Wz1 /kWz1 k2 ) in step 7. If the convergence test in step 9
succeeds for (θ, v, w), the triplet is deflated from the system and the algorithm continues in step 14 with the second most dominant triplet and with the k − 1 Petrov vectors
as bases for the search spaces. This process is repeated until the associated residuals
are greater than the error tolerance ε. For the dominant pole computation of a real
system (E, A, b, c) we could also automatically deflate the conjugate triplet (θ̄, v̄, w̄) if
Im(θ) ≠ 0.
Deflation Suppose that already p dominant eigentriplets have been computed and
the right and left approximate eigenvectors are stored as columns in the matrices X ≡
Xp , Y ≡ Yp ∈ Cn×p with Y∗ AX = Λ and Y∗ EX = Ip . As already discussed in Section 4.1,
in order to avoid a repeated computation of these eigentriplets, the transformed pair
(I − EXY∗ ) A (I − XY∗ E) , (I − EXY∗ ) E (I − XY∗ E)
(4.22)
has to be used. In step 12 the input and output vectors b, c are also transformed as in
SADPA. Numerical experiments show that it is wise to additionally deflate the found
eigenvectors from the search spaces. As described in the corresponding paragraph in
Section 4.1, this can be achieved by reorthogonalizing the search spaces via (4.5) against
the found eigenvectors. If this reorthogonalization is invoked against all previously
found eigenvectors, the deflation of the input and output vectors can be neglected.
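For illustration, the deflated pair (4.22) can be applied without ever forming the projected matrices; the following SciPy sketch wraps the projections as matrix-vector products (the helper names are assumptions of this example).

import numpy as np
from scipy.sparse.linalg import LinearOperator

def deflated_pair(A, E, X, Y):
    # Matrix-free application of (I - E X Y^*) A (I - X Y^* E) and
    # (I - E X Y^*) E (I - X Y^* E) from (4.22).
    n = A.shape[0]
    EX, EtY = E @ X, E.conj().T @ Y
    right = lambda u: u - X @ (EtY.conj().T @ u)     # (I - X Y^* E) u
    left = lambda u: u - EX @ (Y.conj().T @ u)       # (I - E X Y^*) u
    A_defl = LinearOperator((n, n), matvec=lambda u: left(A @ right(u)), dtype=complex)
    E_defl = LinearOperator((n, n), matvec=lambda u: left(E @ right(u)), dtype=complex)
    return A_defl, E_defl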
Subspace Expansion In the basic 2-JD algorithm expansions for the subspaces V, W
are obtained by solving the right and left correction equations (4.15), (4.16) in the
bi-E-orthogonal and (4.17), (4.18) in the orthogonal case. However, if we have already
computed some eigentriplets, the projectors in the correction equations have to be
Algorithm 4.5 Bi-E-orthogonal two-sided Jacobi-Davidson algorithm for dominant pole computation
Input: System (E, A, b, c), initial vectors v0, w0 (w0∗Ev0 ≠ 0), tolerance ε ≪ 1, number of wanted eigentriplets pwanted, minimum and maximum search space dimensions kmin < kmax ≪ n, dominance criterion ((2.13)–(2.15)).
Output: Approximate eigentriplets (λi, xi, yi), i = 1, . . . , pwanted.
1: Set k = 0, p = 0, Λ = X = Y = [], V = [], W = [], s = v0, t = w0.
2: while p < pwanted do
3:   Expand V and W bi-E-orthogonally by v := s and w := t, respectively:
4:   V = [V, v], W = [W, w], k = k + 1.
5:   Construct reduced matrices S := W∗AV, T := W∗EV.
6:   Compute and sort eigentriplets of (S, T) with respect to the dominance criterion: (Λ̃, Q, Z) = Sort(S, T, W∗b, cV). {Algorithm 4.2}
7:   Most dominant Petrov triplet of (A, E) with respect to V, W is (θ = λ̃1, v := Vq1/‖Vq1‖2, w := Wz1/‖Wz1‖2).
8:   Set rv = Av − θEv, rw = A∗w − θE∗w.
9:   while min(‖rv‖2, ‖rw‖2) ≤ ε do
10:    Improve second vector at will.
11:    Set X = [X, v], Y = [Y, w], Λ = [Λ, θ], p = p + 1,
12:    b = b − Ev(w∗b), c = c − (cv)(w∗E),
13:    V = VQ2:k, W = WZ2:k, S = W∗AV, T = W∗EV, k = k − 1.
14:    Continue with second best triplet v = v1, w = w1, θ = λ̃2.
15:    Set rv = Av − θEv, rw = A∗w − θE∗w.
16:  end while
17:  if k ≥ kmax then
18:    Take kmin best Petrov vectors V = VQ1:kmin, W = WZ1:kmin, k = kmin.
19:    Reconstruct reduced matrices S = W∗AV, T = W∗EV.
20:  end if
21:  Set X̃ := [X, v], Ỹ := [Y, w].
22:  Find (approximate) solutions s ⊥ E∗Ỹ, t ⊥ EX̃ of
       (I − EX̃Ỹ∗)(A − θE)(I − X̃Ỹ∗E) s = −rv,
       (I − E∗ỸX̃∗)(A − θE)∗(I − ỸX̃∗E∗) t = −rw.     (4.21)
23: end while
applied to (4.22). With bi-E-orthogonal search spaces and w∗ Ev = 1 this leads to
(I − Evw∗ )(I − EXY∗ )(A − θE)(I − XY∗ E)(I − vw∗ E)s = −rv ,
(I − E∗ wv∗ )(I − E∗ YX∗ )(A − θE)∗ (I − YX∗ E∗ )(I − wv∗ E∗ )t = −rw ,
which have to be solved for s ⊥ [E∗ Y, E∗ w] and t ⊥ [EX, Ev]. With X̃ := [X, v], Ỹ := [Y, w]
and since w∗ EX = v∗ E∗ Y = 0 we obtain the deflated correction equations in (4.21).
4.2. The two-sided Jacobi-Davidson algorithm
53
As before in SADPA and the basic 2-JD, the solutions s, t are (bi-E-)orthogonalized with
respect to V and W and the overall process is repeated with increasing search space
dimension until pwanted triplets have converged.
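The bi-E-orthogonal expansion in steps 3–4 of Algorithm 4.5 can be sketched as follows; this is a minimal sketch with one modified Gram-Schmidt sweep, and the repeated re-orthogonalization that is often advisable in practice is omitted.

import numpy as np

def expand_biE(V, W, E, s, t):
    # Append s, t to the bases so that W^* E V = I remains valid.
    v, w = s.astype(complex), t.astype(complex)
    if V.shape[1] > 0:
        v -= V @ (W.conj().T @ (E @ v))             # now W^* E v = 0
        w -= W @ (V.conj().T @ (E.conj().T @ w))    # now w^* E V = 0
    v /= np.linalg.norm(v)
    c = w.conj() @ (E @ v)                          # assumed nonzero
    w /= np.conj(c)                                 # now w^* E v = 1
    return np.hstack([V, v[:, None]]), np.hstack([W, w[:, None]])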
Restarts Similar to SADPA, if the subspace dimension k exceeds kmax , a restart with
the kmin most promising Petrov vectors is initiated in step 18. Again, it might be
practicable for reasons of numerical stability to deflate these kmin dimensional search
spaces against the found eigenvectors and reorthogonalize them as well. It was observed
in numerical experiments in [12], that due to the use of the deflated operators in the
correction equation, the deflation and reorthogonalization of the search spaces and
therefore of the reduced matrices is not necessary for JDQR and JDQZ (see also [3]).
However, numerical experiments confirm that the bi-E-orthogonal 2-JD appears to be
more sensitive to this loss of orthogonality after a restart and after the deflation of a
found eigentriplet.
Further numerical enhancements The spectrum of a matrix pair (A, E) with singular
E contains infinite eigenvalues. By selecting approximations to eigenvalues with respect
to their approximate residues using Algorithm 4.2, convergence towards the infinite
eigenvalues is usually not observed in practice. However, it can still happen that directions associated
to the eigenvalues at infinity enter the search spaces and hamper the convergence. In
[37] a purification strategy is described which keeps the algorithm free of these directions
right from the start.
We have already mentioned that SADPA can end up in stagnation because it can happen
that the expansion vectors have small angles with the current search spaces. Since 2-JD
expands the search spaces with corrections from the orthogonal complement of the current eigenvector approximations, it suffers less from this problem. However, including
a few steps of 2-RQI if ‖r‖2 < εRQI with εRQI > ε can still enhance the performance of
(exact) 2-JD and may be cheaper than solving the correction equations. The value of
εRQI should be chosen carefully, e.g. 10−7 ≤ εRQI ≤ 10−2 if ε = 10−8.
Due to rounding errors, it can also happen that in the eigenvalue selection step an
eigentriplet is selected which is of worse accuracy than the previous one. This flaw can
be circumvented by eigenvalue tracking [12, Section 4.4], where the current eigenvalue
approximation θ is used as target τ (shift) in both the correction equations and in the
eigenvalue selection if ‖r‖2 < εtr. Here the value of εtr also has to be set with caution,
for instance εtr = 10−3 if ε = 10−8. If εRQI and εtr are too large, the method may become
greedy and select eigenvalues more or less randomly. Too small values, on the other
hand, may prevent a possible improvement of the process. Note that tracking can also
be applied directly if the correction equations are solved inexactly.
4.2.3 Inexact solution and preconditioning of the correction equations
In this section we discuss the inexact solution of the correction equations of 2-JD using
Krylov solvers and preconditioning. In this context, inexact solution normally stands
for a solution of the correction equations obtained after running only a few steps of the
iterative solver. However, the direct application of such an iterative Krylov method
for linear systems is obstructed because the domain and image spaces of the operators
in (4.15)-(4.18) are different. In particular, the operators in the right (4.15) and left
(4.16) correction equation in the bi-E-orthogonal case map from (E∗ w)⊥ to w⊥ and from
(Ev)⊥ to v⊥ , respectively. Furthermore, in the orthogonal case, the operator in the right
correction equation (4.17) maps v⊥ to w⊥ while the operator in (4.18) maps w⊥ to v⊥ .
Recall that a Krylov method for the solution of a linear system Mx = y constructs a
Krylov subspace of dimension m,
K(M, z1, m) := span{z1, Mz1, . . . , Mm−1 z1},
for a given initial vector z1 ∈ Cn . However, if the image and domain space of the operator
M do not coincide, powers of M can not be formed [7, Section 3.3]. A usual way to
circumvent this problem is to use a suitable preconditioner. For instance, a practicable
(left) preconditioner for the right correction equation (4.15) in the bi-E-orthogonal case
is
Kr := (I − Evw∗ )K(I − vw∗ E)
with a given nonsingular preconditioner K ≈ A − θE and w∗ Ev = 1. It clearly maps
(E∗ w)⊥ to w⊥ , similar to the associated operator in the correction equation (4.15). In the
current context, Kr will be used to solve f ⊥ (E∗ w) from
(I − Evw∗ )K(I − vw∗ E) f = z, z ⊥ w
(4.23)
within a Krylov solver. Since w∗ E f = 0, it holds
(I − Evw∗ )K f = z
and with the assumed nonsingularity of K we find that f can be obtained by
f = K−1 z + K−1 Ev(w∗ K f ).
(4.24)
Using the orthogonality relation w∗Ef = 0 again by a multiplication with w∗E leads,
after some simple manipulations, to
(w∗Kf) = −(w∗EK−1Ev)−1 w∗EK−1z.
Inserting this relation back into (4.24), we get for the solution f of (4.23) the relation
f = (I − K−1Ev (w∗EK−1Ev)−1 w∗E) K−1z = (I − yr hr−1 qr∗) K−1z
with qr := E∗w, yr := K−1Ev and hr := qr∗yr. It follows that, if hr ≠ 0, the inverse of the
preconditioner for (4.15) is given by
Kr−1 := (I − yr hr−1 qr∗) K−1.
It can be considered as the correct application of the projected preconditioner Kr to
(4.15) since it maps w⊥ onto (E∗ w)⊥ . The derived preconditioner is a special case of the
result of the following theorem which is adapted from [36, Section 3.6.1] and yields the
(left) preconditioner for the deflated correction equations (4.21). It can be also seen as
a two-sided variant of the result in [12, Section 2.6]. There approximate Schur vectors
are used instead of approximate left and right eigenvectors. Further manipulations
reveal that the (left) preconditioned right correction equation for s ⊥ E∗ w = qr in the
bi-E-orthogonal case can be written as
(I − yr hr−1 qr∗) K−1 (A − θE)(I − yr hr−1 qr∗) s = −(I − yr hr−1 qr∗) K−1 rv.
This new operator maps now (E∗ w)⊥ onto (E∗ w)⊥ which admits the use of Krylov
methods for linear systems. A related, but more general preconditioned correction
equation, which also incorporates deflation, will be derived in the subsequent Corollary
4.6.
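A sketch of how this projected preconditioner can be used inside a Krylov solver for the right correction equation (4.15) is given below; the fixed preconditioner K is assumed to be available as a (sparse) LU factorization, bi-E normalization w∗Ev = 1 is assumed, and the names are chosen only for the example.

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def solve_right_correction(A, E, theta, v, w, r_v, K_lu, maxiter=20):
    # K_lu.solve applies K^{-1}, e.g. K_lu = splu(csc_matrix(A - tau*E)).
    n = A.shape[0]
    Ev, q_r = E @ v, E.conj().T @ w
    y_r = K_lu.solve(Ev)                           # K^{-1} E v
    h_r = q_r.conj() @ y_r                         # assumed nonzero

    def op(x):                                     # projected operator of (4.15)
        x = x - v * (q_r.conj() @ x)               # (I - v w^* E) x
        y = A @ x - theta * (E @ x)
        return y - Ev * (w.conj() @ y)             # (I - E v w^*) y

    def prec(z):                                   # K_r^{-1} z = (I - y_r h_r^{-1} q_r^*) K^{-1} z
        u = K_lu.solve(z)
        return u - y_r * ((q_r.conj() @ u) / h_r)

    Aop = LinearOperator((n, n), matvec=op, dtype=complex)
    M = LinearOperator((n, n), matvec=prec, dtype=complex)
    s, _ = gmres(Aop, -r_v, M=M, maxiter=maxiter)
    return s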
Theorem 4.5:
Let X, Y ∈ Cn×p be the right and left eigenvector matrices containing p already found
eigenvectors such that Y∗ AX = Λ, Y∗ EX = Ip . Furthermore, let (θ, v, w) be a Petrov
triplet with w∗ EX = v∗ E∗ Y = 0 and K an invertible preconditioner for A − θE. If the
projected (left) preconditioner for the deflated right correction equation (4.21) is of
the form
Kr := (I − EX̃Ỹ∗) K (I − X̃Ỹ∗E)
with X̃ := [X, v], Ỹ := [Y, w], then the solution f ⊥ E∗Ỹ of
(I − EX̃Ỹ∗) K (I − X̃Ỹ∗E) f = zr,   zr ⊥ w,     (4.25)
is given by f = Kr−1 zr where
Kr−1 := (I − Yr Hr−1 Qr∗) K−1,     (4.26)
Yr := K−1EX̃, Qr := E∗Ỹ ∈ Cn×(p+1) and Hr := Qr∗Yr ∈ C(p+1)×(p+1), provided that Hr is
nonsingular. If K∗ is used as preconditioner for (A − θE)∗ and
Kl := (I − E∗ỸX̃∗) K∗ (I − ỸX̃∗E∗)
is the projected preconditioner for the deflated left correction equation in (4.21), then
the solution g ⊥ EX̃ of
(I − E∗ỸX̃∗) K∗ (I − ỸX̃∗E∗) g = zl,   zl ⊥ v,
is (if Hl is nonsingular) g = Kl−1 zl with
Kl−1 = (I − Yl Hl−1 Ql∗) K−∗,
Yl := K−∗E∗Ỹ, Ql := EX̃ ∈ Cn×(p+1) and Hl := Ql∗Yl ∈ C(p+1)×(p+1).
♦
Proof. Using Ỹ∗ E f = 0, (4.25) is equivalent to
K f − EX̃Ỹ∗ K f = zr
and since K is assumed to be nonsingular we obtain
f = K−1 zr + K−1 EX̃m
with m := Ỹ∗ K f ∈ Cp+1 . Exploiting Ỹ∗ E f = 0 again we find by multiplication with Ỹ∗ E
that
m = −(Ỹ∗EK−1EX̃)−1 Ỹ∗EK−1zr
if Ỹ∗EK−1EX̃ is invertible. Hence, the solution f becomes
f = (I − K−1EX̃ (Ỹ∗EK−1EX̃)−1 Ỹ∗E) K−1zr,
from which the result follows by inserting the definitions of matrices Yr , Qr , Hr and
Kr−1 . The expression for the left correction g can be obtained in a similar way.
Corollary 4.6:
With the assumptions and notations of Theorem 4.5, the deflated correction equations
for s ⊥ Qr and t ⊥ Ql in the bi-E-orthogonal case (4.21), preconditioned by Kr and Kl ,
are equivalent to
(I − Yr Hr−1 Qr∗) K−1 (A − θE)(I − Yr Hr−1 Qr∗) s = −(I − Yr Hr−1 Qr∗) K−1 rv,     (4.27)
(I − Yl Hl−1 Ql∗) K−∗ (A − θE)∗ (I − Yl Hl−1 Ql∗) t = −(I − Yl Hl−1 Ql∗) K−∗ rw.
♦
Proof. Without deflation, the proof is a slight generalization of the proof of [7, Theorem
7.3] which states a similar result within the (one-sided) JDQR / JDQZ framework. When
deflation is included one simply has to work with the matrices Y, Q and H instead of
the vectors y, q and scalars h (cf. [12, Section 2.6 and Remark 8]). Consider the right
deflated correction in (4.21):
(I − EX̃Ỹ∗)(A − θE)(I − X̃Ỹ∗E) s = −rv.     (4.28)
Applying a preconditioner K to a linear system Mx = y leads to the preconditioned
system K−1Mx = K−1y. With Kr−1 from (4.26) applied to (4.28) we get
(I − Yr Hr−1 Qr∗) K−1 (I − EX̃Ỹ∗)(A − θE)(I − X̃Ỹ∗E) s = −(I − Yr Hr−1 Qr∗) K−1 rv.     (4.29)
The product of the projected inverse preconditioner and the deflation operator can be
expanded:
(I − Yr Hr−1 Qr∗) K−1 (I − EX̃Ỹ∗) = (K−1 − Yr Hr−1 Qr∗ K−1)(I − EX̃Ỹ∗)
  = K−1 − Yr Hr−1 Qr∗ K−1 − K−1EX̃Ỹ∗ + Yr Hr−1 Qr∗ K−1EX̃Ỹ∗
  = (I − Yr Hr−1 Qr∗) K−1 − Yr Ỹ∗ + Yr Hr−1 Qr∗ Yr Ỹ∗
  = (I − Yr Hr−1 Qr∗) K−1 − Yr Ỹ∗ + Yr Hr−1 Hr Ỹ∗
  = (I − Yr Hr−1 Qr∗) K−1,     (4.30)
where we used Yr = K−1EX̃ and Hr = Qr∗Yr. For the sought correction s the orthogonality
condition s ⊥ Qr = E∗Ỹ holds and thus
s = (I − X̃Ỹ∗E) s = (I − Yr Hr−1 Qr∗) s.     (4.31)
Substituting the corresponding parts in (4.29) by (4.30) and (4.31) finally yields the
desired result
(I − Yr Hr−1 Qr∗) K−1 (A − θE)(I − Yr Hr−1 Qr∗) s = −(I − Yr Hr−1 Qr∗) K−1 rv.
The preconditioned left deflated correction equation in (4.27) can be derived in a similar
way.
Remark 4.7:
If the search spaces are scaled orthogonally, so that V∗V = W∗W = I, an appropriate
projected and deflated preconditioner for the right correction equation is
Kr := (I − EX̃Ỹ∗) K (I − X̃[E∗Y, v]∗)
with X̃ := [X, v], Ỹ := [Y, w/(v∗E∗w)]. Using similar steps as in the proof of Theorem
4.5 we find that the inverse operator is (if Hr is invertible) again of the form
Kr−1 = (I − Yr Hr−1 Qr∗) K−1,
but with Yr := K−1EX̃, Qr := [E∗Y, v] and Hr = Qr∗Yr. For the left preconditioner
we have then Yl := K−∗E∗Ỹ, Ql := [EX, w] and Hl := Ql∗Yl.
♦
As the number of found eigenvectors increases, the size of the matrices Yr, Yl, Qr, Ql
and Hr, Hl in the preconditioned correction equations grows as well. But similar to the results
in [12, Remark 7], [36, Section 3.6.1], these matrices Y, Q and H can be updated efficiently
in each step. Consider for example the right correction equation in (4.27) for the bi-E-orthogonal case. By storing Yrk := K−1EX, Qkr := E∗Y, the matrices Yr and Qr can be
updated easily by adding only the new columns yr = K−1 Ev and qr = E∗ w, respectively,
in every iteration. In the same way as the interaction matrices S, T for the eigenvalue
selection are updated, Hr can be computed in each iteration as

       [ Hrk        Qk∗r yr ]
Hr =   [ q∗r Yrk    q∗r yr ],

where Hrk := Qk∗r Yrk. Furthermore, if a Krylov solver is applied to obtain approximate
solutions s̃, t̃ of (4.27) and the initial vectors satisfy Q∗r s0 = Q∗l t0 = 0, then it holds
Q∗r s̃ = Q∗l t̃ = 0, too. Consequently, applications of the preconditioned operators reduce
to
(I − Yr Hr−1 Qr∗) K−1 (A − θE)s,     (I − Yl Hl−1 Ql∗) K−∗ (A − θE)∗ t
(see, e.g., [12, Remark 9]).
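The bordered update of Hr described above might be implemented as follows; this is a small NumPy sketch, where Hk, Yk and Qk denote the stored blocks Hrk, Yrk = K−1EX and Qkr = E∗Y.

import numpy as np

def update_Hr(Hk, Yk, Qk, y_r, q_r):
    # Append the new column pair (y_r = K^{-1} E v, q_r = E^* w) to H_r.
    top = np.hstack([Hk, (Qk.conj().T @ y_r)[:, None]])
    bottom = np.hstack([(q_r.conj() @ Yk)[None, :], [[q_r.conj() @ y_r]]])
    return np.vstack([top, bottom])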
Although it is possible to update the preconditioner K ≈ A − θE in each iteration, the
usual practice is to keep it fixed, e.g., by K ≈ A − τE for τ ∈ C\Λ(A, E). This might
indeed work well if one is interested in the eigenvalues close to τ, see for instance the
numerical experiments in [7, 12]. To motivate fixed preconditioners, we follow a similar
argumentation as in [12, Section 3.3]. Let K = A − τE + R be such a preconditioner
for a fixed τ where R denotes the approximation error. If θ is the current eigenvalue
approximation, it follows that
(I − EX̃Ỹ∗)[A − θE](I − X̃Ỹ∗E) = (I − EX̃Ỹ∗)[A − τE + (τ − θ)E](I − X̃Ỹ∗E)
  = (I − EX̃Ỹ∗) K (I − X̃Ỹ∗E)
    − (I − EX̃Ỹ∗) R (I − X̃Ỹ∗E) + (τ − θ)(I − EX̃Ỹ∗) E (I − X̃Ỹ∗E).
We see that the preconditioning error is enlarged mainly by the projected error
(I − EX̃Ỹ∗)R(I − X̃Ỹ∗E) and by the shift (τ − θ)E. Since the projected error will be
significantly smaller than R itself, the main contribution comes from the shift, which
is small if the current eigenvalue approximation θ is close to the target τ. However,
since dominant poles often show a widespread distribution in the complex plane, the
next computed value of θ can in this case very likely be located far away from the
initial target τ and the previous argument might not hold anymore. Hence, updating
the preconditioner appears to be a reasonable strategy. Our numerical examples in the
corresponding paragraph in Chapter 6 confirm this proposition. A possible time for
this update is after a pole has converged or after a restart. Of course, this generates
additional costs. On the one hand the update of the preconditioner itself, and on the
other hand the matrices Yr , Yl (and thus Hr , Hl ) might have to be reconstructed.
The iterations of the employed Krylov method are referred to as inner iterations and a
usual practice is to restrict their number. It is also possible to adjust the number of steps
of the iterative solver with the accuracy of the current eigenvector approximation. For
the one-sided Jacobi-Davidson methods this is investigated in [29] for symmetric and
Hermitian eigenproblems using the CG method, and in [51, 52] with the QMR method.
An approach to control the inner iterations for general linear eigenproblems is discussed
in [19]. In this thesis we restrict ourselves to the cases when the correction equations
are solved exactly or when the number of inner iterations is limited by a fixed number.
4.3 The Alternating Jacobi-Davidson algorithm
The two-sided methods of the previous two chapters have the disadvantage that they
require at least twice the computational effort compared to the one-sided methods. The
additional simultaneous computation of approximations for left eigenvectors leads to
the solution of two reduced eigenproblems, two linear systems and an orthogonalization against two subspaces per iteration. In this section we discuss subspace accelerated variants of the alternating Rayleigh Quotient Iteration (ARQI) that can compute
approximate eigentriplets but require the solution of only one of these subproblems
per iteration. In the next section ARQI is improved by subspace acceleration to overcome the slow convergence of the basic Algorithm 3.4. If the subspace expansion is
carried out in a Jacobi-Davidson style, that is, by solving projected linear systems, we
get an alternating Jacobi-Davidson scheme [20, Section 6]. As before, the computation
of dominant poles, deflation, restarts and inexact solves with preconditioning can be
introduced as well and are described subsequently in Subsections 4.3.2–4.3.3.
4.3.1 An alternating subspace accelerated scheme
In this section we consider the ARQI in Algorithm 3.4 and try to improve it by subspace
acceleration and by a Jacobi-Davidson like subspace expansion. Throughout the rest of
this section we assume that right and left eigenvectors are approximated in the odd and
even iterations of the alternating scheme, respectively. To include subspace iteration we
build up one orthogonal search space V whose basis vectors are the columns of the matrix
V := [v1, . . . , vk] ∈ Cn×k. Now let v := Vq with q ∈ Ck be a representation of an eigenvector
approximation v in the subspace V associated to the eigenvalue approximation θ. For
odd k we take v as approximation of a right eigenvector and impose a Ritz-Galerkin
condition on the right residual:
r := AVq − θEVq ⊥ V.
Equivalently, (θ, q) is an eigenpair of the reduced eigenproblem Sq = θTq with S :=
V ∗ AV, T := V ∗ EV. Otherwise, if k is even, v is considered as an approximation of a left
Algorithm 4.6 Alternating Jacobi-Davidson method
Input: Matrices A, E, initial vector v0, tolerance ε ≪ 1.
Output: Approximate eigentriplet (λ, x, y).
1: Set s = v0, V = [].
2: for k = 1, 2, . . . do
3:   Expand V orthogonally by v := s:
4:   V = [V, v].
5:   Compute interaction matrices S = V∗AV, T = V∗EV.
6:   Compute and sort eigendecompositions
       SQ = TQΛ̃ of (S, T) (k odd),
       S∗Q = T∗QΛ̃ of (S∗, T∗) (k even).
7:   Select suitable approximate eigenpair (θ := λ̃1, v := Vq1/‖Vq1‖2).
8:   Compute residuals
       r := Av − θEv (k odd),
       r := A∗v − θE∗v (k even).
9:   if ‖r‖2 < ε then
10:    Improve second vector at will and stop iteration.
11:  end if
12:  Solve s ⊥ v (approximately) from
       (I − Evv∗/(v∗Ev)) (A − θE) (I − vv∗/(v∗v)) s = −r (k odd),
       (I − E∗vv∗/(v∗E∗v)) (A − θE)∗ (I − vv∗/(v∗v)) s = −r (k even).     (4.32)
13: end for
eigenvector and the Ritz-Galerkin condition is imposed on the left residual:
r := A∗ Vq − θE∗ Vq ⊥ V.
In this case (θ, q) is an eigenpair of the complex conjugated reduced eigenproblem
S∗ q = θT∗ q. In both cases (θ, v := Vq) and (θ, v := Vq) are Ritz pairs of (A, E) and
(A∗ , E∗ ) with respect to V. New basis vectors can be obtained from a solution s of a
linear system that incorporates the current eigenpair approximation (θ, v). The solution
vector s is orthonormalized against V and the result is added as new basis vector. This
alternating scheme is repeated until krk2 drops below a given tolerance .
If the subspace expansion is done in a Jacobi-Davidson style, that is by solving a
suitable correction equation, we obtain the alternating Jacobi-Davidson method (AJD)
[20, Section 6] as it is illustrated in Algorithm 4.6. As in the basic 2-JD (Algorithm 4.3),
the term suitable in step 7 means that, e.g., the approximate eigenvalue closest to a target
τ is selected, together with its corresponding (right) eigenvector. This is crucial in order
to prevent a convergence towards infinite eigenvalues, when E is singular. If E ≡ I, the
4.3. The Alternating Jacobi-Davidson algorithm
61
equations in (4.32) are exactly the original Jacobi-Davidson correction equations [48] for
the standard eigenvalue problems Ax = λx in odd and A∗ y = λy in even iterations. Note
that there are other correction equations possible for the generalized eigenproblem [7].
This alternating scheme has already been generalized for nonlinear eigenvalue problems
in [45, Algorithm 18].
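For illustration, the projected operators of the odd and even correction equations (4.32) can be applied matrix-free and solved approximately with a few Krylov steps; the sketch below uses plain GMRES without a preconditioner, and all names are chosen only for this example.

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def ajd_correction(A, E, theta, v, r, odd, maxiter=20):
    # Expansion vector s from the odd (k odd) or even (k even) equation in (4.32).
    n = A.shape[0]
    Ev = (E @ v) if odd else (E.conj().T @ v)
    denom = v.conj() @ Ev                          # v^* E v resp. v^* E^* v, assumed nonzero

    def matvec(x):
        x = x - v * ((v.conj() @ x) / (v.conj() @ v))        # (I - vv^*/(v^*v)) x
        y = (A @ x - theta * (E @ x)) if odd else \
            (A.conj().T @ x - np.conj(theta) * (E.conj().T @ x))
        return y - Ev * ((v.conj() @ y) / denom)             # left projector

    op = LinearOperator((n, n), matvec=matvec, dtype=complex)
    s, _ = gmres(op, -r, maxiter=maxiter)
    return s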
Remark 4.8:
If we substitute (4.32) in step 12 of Algorithm 4.6 by the corresponding linear systems
of ARQI (Algorithm 3.4), that is, (A − θE)s = Ev in odd and (A − θE)∗ s = E∗ v
in even iterations, we get a subspace accelerated version of ARQI. We refer to this
modification in the following as Subspace Accelerated Alternating Rayleigh Quotient
Iteration (SAARQI).
♦
The next theorem states the equivalence between AJD and SAARQI. Although it is
derived very similarly to Theorem 4.3, there is no comparable result in the existing
literature known to the author.
Theorem 4.9:
If AJD and SAARQI are started with the same initial vector v0 and the occurring
linear systems are solved exactly, and the same eigenpair selection is used, then both
methods produce, up to a multiplication by a constant, the same vector iterates in
the odd and even iterations.
♦
Proof. Similar to the proof of Theorem 4.3, we rewrite the correction equation (4.32) for
s ⊥ v in the odd steps of AJD as
(A − θE)s − (Evv∗/(v∗Ev))(A − θE)s = −r.
Assuming that A − θE is nonsingular, we get for s the relation
s = −v + (A − θE)−1Ev (v∗(A − θE)s)/(v∗Ev).
Replacing (v∗(A − θE)s)/(v∗Ev) by µodd yields
s + v = µodd (A − θE)−1Ev.     (4.33)
Obviously, this is a multiple of the iterate obtained with one odd iteration of ARQI.
Using s ⊥ v, we find by multiplying (4.33) with v∗ that
µodd = (v∗v)/(v∗(A − θE)−1Ev).
Similar expressions can be obtained for the update vectors in the even iterations.
4.3.2 Computing dominant poles
The application of the alternating eigenvalue methods for dominant poles computation
appears to be novel and requires some further adjustments. If we want to compute
dominant poles of a SISO system (E, A, b, c), we also need approximate right and
left eigenvectors in each iteration in order to obtain approximate residues. Hence,
the eigenvalue selection in step 6 has to be adjusted appropriately. Let (So , To ) and
(Se , Te ) denote the reduced matrix pairs in the odd and even iterations, respectively.
One intuitive way to obtain the missing left (or right) Ritz vectors is to compute the left
eigenvectors of (So , To ) and (Se∗ , Te∗ ) in odd and even iterations in step 6 of Algorithm
4.6 as well. In particular, if the search space of an odd iteration is k-dimensional, the
eigenvalue decomposition of the reduced pair (So , To ) is of the form
Zo∗SoQo = Λ̃o = diag(λ̃o1, . . . , λ̃ok),   Zo∗ToQo = Ik,     (4.34)
where Qo , Zo ∈ Ck×k are the right and left eigenvector matrices. Since k n, the reduced
eigenproblem is of a low dimension so that full space methods can be applied, and the
additional computation of the left eigenvector matrix Zo introduces only a slight amount
of extra work. If V o ∈ Cn×k is the orthogonal basis matrix of the current search space, the
right and left Ritz vectors are then given by X̃o = V o Qo and Ỹo = V o Zo . The associated
approximate residues can be computed as
Ri = (c x̃oi)(ỹo∗i b),   i = 1, . . . , k.
The eigenvalue decomposition (4.34) can then be sorted in the scaled magnitude order
|Ri |/|λ̃oi | or with respect to any other dominance measurement ((2.13) – (2.15)). Afterwards, the Ritz pair (λ̃ j , v := x̃ j = V o qoj ), which corresponds to the most dominant
eigenvalue λ̃oj , is used as current right eigenpair approximation in step 7. Obviously,
the next iteration is an even one and the new search space is now k + 1-dimensional.
Denote its basis matrix by V e ∈ Cn×(k+1) . The new eigenvalue decomposition is now
given by
Ze∗Se∗Qe = Λ̃e = diag(λ̃e1, . . . , λ̃ek+1),   Ze∗Te∗Qe = Ik+1
with right and left eigenvector matrices Qe, Ze ∈ C(k+1)×(k+1). However, it is clear that Qe, Ze
are also the left and right eigenvector matrices of the conjugate transposed pair (Se , Te )
from which the usage of the right and left Ritz vectors X̃e = V e Ze and Ỹe = V e Qe appears
plausible. After the computation and ordering of the approximate residues R, the most
dominant left Ritz pair (λ̃ej , v = ỹej := V e qej ) is selected since we assumed that every even
iteration approximates a left eigenvector.
4.3.3 Deflation, restarts and inexact solution of the correction equations
Using the same implicit deflation strategy we embedded in SADPA and 2-JD in the two
previous Sections 4.1 and 4.2 of this chapter, AJD continues, e.g. in odd iterations, with
Ṽo = VoXo2:k after one eigentriplet has converged. To prevent a renewed computation
of the already found triplets, deflation operators have to be included in the correction
equations. An additional measure is to reorthogonalize the search space against the
found triplets.
Since every odd and even iteration of AJD can be considered as one single iteration
of the standard one-sided Jacobi-Davidson [7, 12, 48] applied to (A, E) and (A∗ , E∗ ),
respectively, similar deflation techniques can be involved here. As before, let X, Y ∈ Cn×p
with Y∗ EX = Ip be the matrices containing the already found eigenvectors. Then this
suggests to use the deflated correction equations
(I − Evv∗/(v∗Ev)) (I − EXY∗)(A − θE)(I − XY∗E)(I − vv∗) s = −r,
(I − E∗vv∗/(v∗E∗v)) (I − E∗YX∗)(A − θE)∗(I − YX∗E∗)(I − vv∗) s = −r,     (4.35)
where the upper one is for the odd and the lower one for the even iterations. It is advised
to reorthogonalize the search space against the found eigenvectors by computing
vj = ∏pl=1 (I − xl y∗l E/(y∗l Exl)) vj,   j = 1, . . . , k.
This way the search space is purged of any direction to already found eigenvectors that
had possibly entered it.
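This purging step is straightforward to implement; a sketch is given below, where X, Y hold the found right and left eigenvectors and V the current basis (a final orthonormalization of V is advisable but omitted).

import numpy as np

def purge_search_space(V, X, Y, E):
    # Apply prod_l (I - x_l y_l^* E / (y_l^* E x_l)) to every column of V.
    for l in range(X.shape[1]):
        x_l, y_l = X[:, l], Y[:, l]
        yE = (E.conj().T @ y_l).conj()             # row vector y_l^* E
        V = V - np.outer(x_l, yE @ V) / (yE @ x_l)
    return V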
If the search space reaches a maximum dimension kmax , a restart can be initiated using
the kmin < kmax most promising Ritz vectors.
As in 2-JD, solving the correction equations exactly is often neither feasible nor required.
Instead, inexact solutions involving iterative Krylov methods and preconditioners can
be embedded. Again, this is a straightforward application of the results for the one-sided
JD methods. Let K ≈ A − θE be a nonsingular preconditioner and let K∗ also be
used as preconditioner for (A − θE)∗. For its application it has to be projected accordingly:
Ko := (I − Evv∗/(v∗Ev)) (I − EXY∗) K (I − XY∗E) (I − vv∗),
Ke := (I − E∗vv∗/(v∗E∗v)) (I − E∗YX∗) K∗ (I − YX∗E∗) (I − vv∗),
where Ko refers to the projected preconditioners in the odd iterations and Ke for the even
iteration. Both preconditioners have to be applied to their dedicated correction equation
in (4.35). We remark that there is no obvious compact formulation of the deflated and
preconditioned correction equations, in contrast to the result obtained for 2-JD in (4.27).
The presented correction equations are in some sense not consistent with the original
idea behind AJD [20, Section 6] since we applied oblique projectors in order to preserve
the eigenvectors. This is necessary for the computation of dominant poles since left and
right (approximate) eigenvectors are required to construct the approximate residues.
The AJD presented in [20] uses orthogonal projectors which might be more stable from
a numerical point of view. In this case the deflated correction equation would have the
form
(I − vv∗ ) (I − XX∗ ) (A − θE) (I − XX∗ ) (I − vv∗ ) s = −r,
(I − vv∗ ) (I − YY∗ ) (A − θE)∗ (I − YY∗ ) (I − vv∗ ) s = −r,
where the upper equation is used in the odd and the lower one in the even iterations.
The matrices X, Y ∈ Cn×p would then, however, contain approximate right and left
Schur vectors and not eigenvectors.
5
Further improvements and generalizations
In this chapter we discuss some improvements of the methods presented in the previous
chapter. At first we investigate in the next section the use of a harmonic subspace
extraction to enhance the performance of the eigenvalue methods when computing
eigenvalues which lie in the interior of the spectrum.
The subsequent Section 5.2 addresses some generalizations of the eigenvalue methods
for the computation of dominant poles of multivariable transfer functions of MIMO
systems.
5.1 Harmonic subspace extraction
All of the subspace accelerated eigensolvers of Chapter 4 use at some point eigentriplet
approximations obtained from a projected eigenproblem of a small size. In the two-sided
methods, this small eigenproblem itself is obtained by imposing a two-sided (Petrov-)Galerkin condition on the right and left residuals corresponding to an approximate triplet
(θ, v, w):
rv := Av − θEv ⊥ W,
rw := A∗ w − θE∗ w ⊥ V.
The approximate eigenvectors are given by v = Vq ∈ V, w = Wz ∈ W with coefficient
vectors q, z ∈ Ck . The matrices V, W ∈ Cn×k contain the basis vectors for the search
spaces V and W. The easier one-sided counterpart is, for instance, a (Ritz)-Galerkin
condition for an approximate eigenpair (θ, v):
r := Av − θEv ⊥ V,
where v is defined exactly as in the two-sided case. With increasing search space dimension, the pairs (θ, v) and triplets (θ, v, w) can be good approximations to eigentriplets
(λ, x, y) when the eigenvalue λ lies well separated in the exterior of the spectrum
Λ(A, E), but they might have problems if interior eigenvalues are sought, e.g. if one is
interested in dominant poles of a system (E, A, b, c). Therefore we discuss in the following the so-called harmonic subspace extractions that can overcome these problems. A
good overview of one- and two-sided, standard and harmonic extraction approaches
for the eigenvalue problem Ax = λx can be found in [16]. For reasons of simplicity we
start with a recapitulation of the one-sided harmonic extraction.
5.1.1 One-sided harmonic subspace extraction
Assume we want to compute approximate interior eigenvalues of a matrix A ∈ Cn×n
near a target τ ∈ C\Λ(A). Since the standard Galerkin extraction often works well for
exterior eigenvalues, we apply it to the transformed problem:
(A − τI)−1 x = (λ − τ)−1 x.
Clearly, an interior eigenvalue λ close to τ of the original problem Ax = λx is an
exterior eigenvalue of the above transformed one. We look for an approximate pair
(θ, v) ≈ (λ, x) using a (low-dimensional) search space V. With v ∈ V this leads to
(A − τI)−1 v − (θ − τ)−1 v ⊥ X,
where X is a suitable test space such that the inverse (A − τI)−1 vanishes. The common
choice is
X := (A − τI)∗ (A − τI)V
resulting in the (Petrov-)Galerkin condition
(A − τI)v − (θ − τ)v ⊥ (A − τI)V =: W.
Equivalently, using a basis matrix V for the search space V, we get that the harmonic Ritz
pairs (θ, v) can be obtained from the eigenpairs (ξ, q) of the generalized eigenproblem
V ∗ (A − τI)∗ (A − τI)Vq = ξV ∗ (A − τI)∗ Vq
by θ = ξ + τ and v = Vq (cf. [16, Section 2]). This can be easily carried over to the
generalized eigenproblem [21, Section 2]:
V ∗ (A − τE)∗ (A − τE)Vq = ξV ∗ (A − τE)∗ EVq.
(5.1)
Note that for generalized eigenproblems, the approximations θ are sometimes called
harmonic Petrov values. Multiplying the reduced eigenproblem (5.1) from the left by q∗
and using the Cauchy-Schwarz inequality leads to
v∗(A − τE)∗(A − τE)v = ξ v∗(A − τE)∗Ev ≤ |ξ| ‖(A − τE)v‖2 ‖Ev‖2
and thus
‖(A − τE)v‖2 ≤ |ξ| ‖Ev‖2.     (5.2)
Hence, if θ = τ + ξ is an approximate harmonic Petrov value close to the target τ, the
residual of the associated harmonic Petrov vector v is small. The usage of this harmonic
extraction in the eigenvalue selection step of the (one-sided) Jacobi-Davidson methods
has already been discussed in the original publication [48, Section 5] on JD and later
in the JDQR, JDQZ algorithms. For more information and implementation details we
refer, for instance, to [12, Section 2] and [3, Algorithms 4.19 and 7.1.9].
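A sketch of the one-sided harmonic extraction (5.1) for a basis V and target τ is given below (dense small matrices are assumed; sorting by |ξ| reflects the bound (5.2)).

import numpy as np
from scipy.linalg import eig

def harmonic_ritz_pairs(A, E, V, tau):
    # Harmonic Petrov values theta = tau + xi and vectors v = V q for (A, E).
    AV = A @ V - tau * (E @ V)                     # (A - tau*E) V
    S = AV.conj().T @ AV                           # V^*(A - tau E)^*(A - tau E) V
    T = AV.conj().T @ (E @ V)                      # V^*(A - tau E)^* E V
    xi, Q = eig(S, T)
    order = np.argsort(np.abs(xi))                 # smallest |xi| first, cf. (5.2)
    return tau + xi[order], V @ Q[:, order]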
Since the AJD is very similar to the one-sided JD, alternately applied to (A, E) and
(A∗ , E∗ ), the usage of the one-sided harmonic extraction is possible. That is, in every odd
step harmonic Petrov pairs for (A, E) are obtained from the small reduced eigenproblem
(5.1) and in every even step for (A∗ , E∗ ) from
V ∗ (A − τE)(A − τE)∗ Vq = ξV ∗ (A − τE)E∗ Vq.
Note that this is not the conjugate transposed eigenproblem (5.1) of the odd step.
5.1.2 Two-sided harmonic subspace extraction
The two-sided Ritz- and Petrov-Galerkin extraction for exterior eigentriplets is well
known and used, for instance, in the two-sided eigenvalue methods SADPA, SA2RQI
and 2-JD we discussed in Section 4.1 and 4.2. Other methods that fit into this framework
are two-sided Lanczos [56, Chapter VIII] and Arnoldi [42] style eigensolvers but also
BiCG [43, Section 7.1.3] for the solution of nonsymmetric linear systems. However, a
two-sided harmonic extraction approach is only rarely found in the literature.
We follow the derivations in [16, Section 3.2] and, for reasons that will become clear
later, we start with the standard eigenproblems



Ax = λx,


A∗ y = λy.
(5.3)
Assume we want to compute approximations for an eigentriplet (λ, x, y) of (5.3) with
λ close to a target τ ∈ C\Λ(A). With the same motivation as in the one-sided case, we
consider the spectrally transformed problems
(A − τI)−1 x = (λ − τ)−1 x,     (A − τI)−∗ y = (λ̄ − τ̄)−1 y.     (5.4)
Again, we would like to have an extraction that yields an approximation (θ, v, w) ≈
(λ, x, y) without using the inverse of the shifted matrix. Let V, W be again two search
spaces and v ∈ V, w ∈ W. This leads to
(A − τI)−1 v − (θ − τ)−1 v ⊥ ((A − τI)∗)2 W,
(A − τI)−∗ w − (θ̄ − τ̄)−1 w ⊥ (A − τI)2 V,
or, equivalently,
(A − τI)v − (θ − τ)v ⊥ (A − τI)∗W,
(A − τI)∗w − (θ̄ − τ̄)w ⊥ (A − τI)V.     (5.5)
Using representations v = Vq and w = Wz of the vectors v and w in the search spaces V
and W, respectively, it follows from (5.5) that the harmonic Ritz triplets can be obtained
from the triplets (ξ, q, z) of the generalized eigenproblems
W∗(A − τI)2 Vq = ξ W∗(A − τI)Vq,
V∗((A − τI)∗)2 Wz = ξ̄ V∗(A − τI)∗Wz,     (5.6)
via (θ = ξ + τ, v = Vq, w = Wz). Left-multiplying the upper eigenproblem in (5.6) with
z∗ yields
w∗ (A − τI)2 v = ξw∗ (A − τI)v
which can, by using the law of cosines, be rearranged to
‖(A − τI)∗w‖2 ‖(A − τI)v‖2 |cos((A − τI)∗w, (A − τI)v)| = |ξ| ‖(A − τI)∗w‖2 ‖v‖2 |cos((A − τI)∗w, v)|.
For the lower equation in (5.6) we find after left-multiplying with q∗ a similar result.
By taking absolute values on both sides we get after some basic manipulations two
relations for the quality of the harmonic Ritz vectors v and w:
‖(A − τI)v‖2 / ‖v‖2 = |ξ| |cos(v, (A − τI)∗w)| / |cos((A − τI)v, (A − τI)∗w)|,
‖(A − τI)∗w‖2 / ‖w‖2 = |ξ| |cos((A − τI)v, w)| / |cos((A − τI)v, (A − τI)∗w)|.
This reveals that, unless (A − τI)v and (A − τI)∗ w are nearly orthogonal, if the two-sided
harmonic Ritz value θ is close to the target τ, the two-sided harmonic Ritz vectors v and
w are good approximate eigenvectors because of the small residual norms. Furthermore,
if v and w converge to the right and left eigenvector, respectively, both fractions with
the cosines tend to one [16].
Now we try to extend this formalism to generalized eigenproblems
Ax = λEx,     A∗y = λ̄E∗y.
To the author's knowledge, a two-sided harmonic extraction for generalized eigenproblems is not present in the literature. Hence, we propose in the sequel two possible
novel approaches. A straightforward application of the two-sided harmonic extraction
process yields a condition similar to (5.5)
(A − τE)v − (θ − τ)Ev ⊥ (A − τE)∗ W,
(A − τE)∗ w − (θ − τ)E∗ w ⊥ (A − τE)V.
Equivalently, ξ = θ − τ, q, and z are the solutions of
W∗(A − τE)2 Vq = ξ1 W∗(A − τE)EVq,
V∗((A − τE)∗)2 Wz = ξ2 V∗(A − τE)∗E∗Wz,     (5.7)
which reveals that one now has to deal also with two generalized eigenproblems which
are not conjugate transposed to each other, since the right hand side matrix T1 :=
W ∗ (A − τE)EV in the upper equation in (5.7) is not the conjugated transpose one of
T2 := V∗(A − τE)∗E∗W in the lower one. Therefore we have to distinguish between the
eigenvalues ξ1 = θ1 − τ and ξ2 = θ2 − τ of both problems. Using S := W∗(A − τE)2V,
(ξ1, q) is an eigenpair of (S, T1) and (ξ2, z) is one of (S∗, T2).
Hence, we have to choose which approximate eigenvalue, θ1 = τ + ξ1 or θ2 = τ + ξ2 , we
take into account. This is a remarkable difference to the two-sided harmonic extraction
(5.6) for the standard eigenvalue problem and thus it is not clear if it is justified to refer
to the resulting triplets (θ1 , v = Vq, w = Wz) and (θ2 , v = Vq, w = Wz) as two-sided
harmonic Petrov triplets of (A, E) with respect to the search spaces V, W and the test
spaces X := (A − τE)∗ W, Y := (A − τE)V. Nevertheless, we refer to this extraction in the
sequel as generalized two-sided harmonic extraction.
Another possible way to obtain harmonic triplets is to apply the one-sided harmonic
Petrov extraction (5.1) two times simultaneously, that is, once to (A, E) with the search
space V and the second time to (A∗ , E∗ ) using W as search space:
(A − τE)Vq − (θ − τ)EVq ⊥ (A − τE)V,
(A − τE)∗ Wz − (θ − τ)E∗ Wz ⊥ (A − τE)∗ W.
This will also produce two separated eigenproblems
S1 q = ξ1 T1 q,     S2 z = ξ2 T2 z,     (5.8)
with the following notations:
S1 := V∗(A − τE)∗(A − τE)V,   T1 := V∗(A − τE)∗EV,   ξ1 := θ1 − τ,
S2 := W∗(A − τE)(A − τE)∗W,   T2 := W∗(A − τE)E∗W,   ξ2 := θ2 − τ.
But now the bound (5.2) holds for the Petrov pair (ξ1 = θ1 − τ, v = Vq) and similarly
for (ξ2 = θ2 − τ, w = Wz). In the following, we will call this approach simply double
one-sided harmonic Petrov-Galerkin extraction.
Algorithm 5.1 (Λ̃, Q, Z) = SortHarm(S1, S2, T1, T2, b, c, V, W, τ, γ)
Input: Interaction matrices S1, S2, T1, T2 ∈ Ck×k, input and output vectors b ∈ Rn, c∗ ∈ Rn, basis matrices V, W ∈ Cn×k, target τ ∈ C, scaling factor γ ∈ [0, 1].
Output: Λ̃ ∈ Ck×k diagonal with poles θi close to τ and in Rγ order, Q, Z ∈ Ck×k corresponding eigenvector matrices.
1: Compute eigendecompositions of the pairs (S1, T1) and (S2, T2):
     S1Q = T1QΞ1,   S2Z = T2ZΞ2,
   with Ξj = diag(ξ1(j), . . . , ξk(j)) (j = 1, 2), Q = [q1, . . . , qk], Z = [z1, . . . , zk].
2: Compute approximate eigenvectors and residues of (A, E):
     X = VQ, Y = WZ, Ri = (cxi)(y∗i b).
3: Select Ξ and compute approximate eigenvalues, e.g. Λ̃ = Ξ1 + τIk.
4: Sort Λ̃, Q, Z decreasingly with respect to (2.13) (or any other dominance criterion) combined with (5.9):
     Rγ,i = γ|Ri| + (1 − γ)/|θi − τ|,   i = 1, . . . , k.
In both approaches one has, unfortunately, to select one of the two obtained values for
the eigenvalue approximation θ. We suggest to take the one with the smallest value of
|ξ|, since this is apparently the approximate eigenvalue closest to τ. However, since the
harmonic approaches usually give good approximate eigenvectors but the eigenvalue
itself might be of low quality, it is advised to take an additional generalized two-sided
Rayleigh quotient ρ(v, w, A, E) as eigenvalue approximation.
A complete theoretical investigation of the properties of both proposed extraction approaches for the generalized eigenproblem would be beyond the scope of this thesis.
Therefore we present only some numerical examples of the extraction processes included in 2-JD in Chapter 6.
When one is interested in dominant poles, the harmonic extractions enable the computation of the dominant poles which are close to a certain specific target τ ∈ C. There is no
literature known to the author concerning dominant pole computation in combination
with harmonic subspace extractions.
Using one of the harmonic extraction methods, the obtained approximate eigenvalues
θ are close to the target τ. But the closeness alone is no indicator for the dominance of
the approximate pole θ. Therefore, in the eigenvalue selection step one has to weight the
distance |θ − τ| (= |ξ|) with one of the dominance measurements, e.g. the magnitude
|R| of the approximate residues. We assume here that, in case of generalized eigenproblems, ξ has been chosen adequately from ξ1 and ξ2 . If θ is an approximate dominant
pole, the value of |R| is large, and if θ is additionally close to τ, the reciprocal distance
|θ − τ|−1 is obviously also large. Therefore, we propose to put both quantities together,
5.2. MIMO Systems
71
for instance, by constructing the convex combination
Rγ := γ|R| + (1 − γ)|θ − τ|−1,   τ ≠ θ,     (5.9)
for an appropriately chosen scaling value γ ∈ [0, 1]. In the dominant pole selection
step we then select the triplets with a large value of Rγ. Note that this approach
can also be used with the standard Petrov-Galerkin extraction by providing a target τ.
The complete harmonic selection for dominant poles is summarized in Algorithm 5.1
where the involved matrix pairs (S1 , T1 ) and (S2 , T2 ) refer to the interaction matrices of
the generalized two-sided harmonic extraction (5.7) (S2 = S∗ ) and the double one-sided
harmonic Petrov-Galerkin extraction (5.8), respectively. It is possible that using the
reciprocal distance |θ − τ|−1 will lead to the favored selection of eigenvalues close to the
target τ which might be less dominant. Thus, we advise to choose the parameter γ close
to one. We leave a better balancing of these two quantities for future work.
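The selection by (5.9) is easy to state in code; a small sketch is given below, where the approximate poles θi and residue magnitudes |Ri| are assumed to be given.

import numpy as np

def sort_by_Rgamma(thetas, residues, tau, gamma=0.9):
    # Order approximate poles by R_gamma = gamma*|R| + (1-gamma)/|theta - tau|.
    thetas = np.asarray(thetas)
    R_gamma = gamma * np.abs(residues) + (1.0 - gamma) / np.abs(thetas - tau)
    order = np.argsort(-R_gamma)                   # decreasing R_gamma
    return order, R_gamma[order]

Choosing γ close to one, as advised above, keeps the residue magnitude as the primary criterion.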
5.2 MIMO Systems
In Chapter 4 we used different eigensolvers for the computation of dominant poles of
scalar transfer functions of SISO systems (E, A, b, c, d). Since the number of inputs and
outputs is in practice often greater than one, we investigate in this section generalizations
of SADPA and 2-JD to MIMO systems (E, A, B, C, D).
5.2.1 Multivariable transfer functions
For a linear time invariant MIMO system (E, A, B, C, D) with E, A ∈ Rn×n , B ∈ Rn×m , C ∈
Rp×n and D ∈ Rp×m, the transfer function H(s) : C → Cp×m is given by
H(s) = C(sE − A)−1 B + D
(recall the derivation in Section 2.3). Since H(s) is a p × m matrix for any complex
number s ∈ C, the gain is not uniquely defined in contrast to the SISO case. The
common generalization of the gain concept is the usage of the smallest and largest
singular values of H(iω) for frequencies ω ∈ R+ , which is motivated by
σmin(H(iω)) ≤ ‖H(iω)u(iω)‖2 / ‖u(iω)‖2 ≤ σmax(H(iω))
which holds for a square transfer function (p = m). The vector u(iω) denotes the input
vector at the frequency ω. For non-square transfer functions (p ≠ m), only the upper
bound holds. The smallest and largest singular values are also called smallest and largest
principal gains and can be plotted against the frequency ω in a sigma plot.
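For illustration, a sigma plot can be generated by evaluating the principal gains on a frequency grid; the sketch below assumes dense matrices and D = 0, and for large n one would solve the shifted systems with sparse factorizations instead.

import numpy as np

def principal_gains(A, B, C, E, omegas):
    # Largest/smallest singular values of H(i*omega) = C (i*omega*E - A)^{-1} B.
    smax, smin = [], []
    for omega in omegas:
        H = C @ np.linalg.solve(1j * omega * E - A, B)
        sv = np.linalg.svd(H, compute_uv=False)
        smax.append(sv[0])
        smin.append(sv[-1])
    return np.array(smax), np.array(smin)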
Similar to the SISO case there exists an expression of H(s) of the form
H(s) = ∑rj=1 Rj / (s − λj) + D + R∞     (5.10)
with residue matrices R j = (Cx j )(y∗j B) over the r finite eigentriplets (λ j , x j , y j ) (y∗i Exk =
δik ). The term R∞ corresponds again to the constant contribution of the infinite poles.
This representation admits similar definitions of modal dominance as for scalar transfer
functions, for instance, a pole λj of H(s) is called (MIMO) dominant pole if σmax(Rj)
is large (see (2.17)–(2.19) in Definition 2.3). Since these dominance definitions do not
depend on D, we may assume without loss of generality in the following D = 0. In
fact, a nonzero D will only increase the dominance measures by an additive constant.
Dominant poles can be observed as peaks in the σmax -plot of H(s) at frequencies close
to the imaginary parts of these poles. Modal approximation for MIMO systems uses
then modal equivalents Hk (s) consisting of the k terms in (5.10) that correspond to the k
most dominant poles. Note that there are at least min(m, p) different poles necessary to
obtain a modal equivalent with a nonzero σmin -plot [27].
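Given k dominant eigentriplets, the MIMO modal equivalent can be assembled directly from (5.10); a sketch follows, in which D and the infinite-pole contribution R∞ are omitted.

import numpy as np

def mimo_modal_equivalent(lambdas, X, Y, B, C):
    # H_k(s) = sum_j (C x_j)(y_j^* B)/(s - lambda_j), returned as a callable in s.
    R = [np.outer(C @ X[:, j], Y[:, j].conj() @ B) for j in range(len(lambdas))]
    def Hk(s):
        return sum(Rj / (s - lj) for Rj, lj in zip(R, lambdas))
    return Hk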
5.2.2 The Subspace Accelerated MIMO Dominant Pole Algorithm
Recall the derivation of the original DPA as Newton scheme in Section 3.3. The approach is expanded to multivariable transfer functions in [38] and [36, Chapter 4]. We
follow this approach and start for simplicity with square transfer functions (m = p)
and E = I. For general MIMO transfer functions it holds that σmax(H(s)) → ∞ as s ∈ C
approaches a dominant pole of H(s). If H(s) is square, there is the equivalent statement
that λmin (H(s)−1 ) → 0 at a dominant pole s ∈ C. This motivates to apply a Newton
scheme to the objective function
f : C → C : s ↦ λmin(H(s)−1)     (5.11)
in order to find those values s ∈ C for which f is zero. Now denote by (µ(s), u(s), z(s))
an eigentriplet with z∗(s)u(s) = 1 of H(s)−1, that is,
H−1(s)u(s) = µ(s)u(s),     H−∗(s)z(s) = µ̄(s)z(s).     (5.12)
Assuming that H−1 (sk ) has distinct eigenvalues, the derivative of the parameter dependent eigenvalue µ(s) of H−1 (s) is, as stated in [25], given by
dµ(s)/ds = z∗(s) (dH−1(s)/ds) u(s).
And since
dH−1(s)/ds = −H−1(s) (dH(s)/ds) H−1(s),     dH(s)/ds = −C(sI − A)−2B,
this leads together with (5.12) to
dµ(s)/ds = z∗(s)H−1(s)C(sI − A)−2BH−1(s)u(s) = µ2(s)z∗(s)C(sI − A)−2Bu(s).
Now let (µmin , umin , zmin ) be the smallest eigentriplet of H−1 (s). A Newton step then
becomes
sk+1 = sk − f(sk)/f′(sk)
     = sk − µmin/(µ2min z∗C(skI − A)−2Bu)
     = sk − 1/(µmin z∗C(skI − A)−2Bu)
     = sk − 1/(µmin w∗v),     (5.13)
where v, w ∈ Cn are the solutions of
(skI − A)v = Bumin   and   (skI − A)∗w = C∗zmin,     (5.14)
which we recognize as modified versions of the linear systems of DPA (Algorithm 3.3)
for scalar transfer functions. An algorithm that computes a single dominant pole of
H(s) is the MIMO Dominant Pole Algorithm (MPD) [38, Algorithm 1], [27].
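One Newton step (5.13)–(5.14) might be sketched as follows for a square transfer function with E = I; dense matrices are assumed, and the eigentriplet of H−1(sk) is normalized such that z∗u = 1 as in (5.12).

import numpy as np
from scipy.linalg import eig

def mdp_newton_step(A, B, C, s_k):
    # Basic MIMO dominant pole iteration: update s_k via (5.13)-(5.14).
    n = A.shape[0]
    H = C @ np.linalg.solve(s_k * np.eye(n) - A, B)
    mu, Zl, U = eig(np.linalg.inv(H), left=True, right=True)
    j = np.argmin(np.abs(mu))                      # smallest eigenvalue of H^{-1}
    u, z = U[:, j], Zl[:, j]
    z = z / (z.conj() @ u)                         # normalize z^* u = 1, cf. (5.12)
    v = np.linalg.solve(s_k * np.eye(n) - A, B @ u)            # (s_k I - A) v = B u
    w = np.linalg.solve((s_k * np.eye(n) - A).conj().T, C.conj().T @ z)
    return s_k - 1.0 / (mu[j] * (w.conj() @ v)), v, w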
For non-square transfer functions, an appropriate objective function for the Newton
scheme is
f : C → R : s ↦ [σmax(H(s))]−1.
In this case the vectors u ∈ Rm , z ∈ Rp are the right and left singular vectors associated
to the largest singular value σmax of H(s). For more details on the non-square case we
refer to [38, Section IV.C] and [36, Section 4.4.3]. In both cases, computing u and z
requires the construction of the transfer function H(s) in each iteration. Note that the
above framework can easily be adapted to MIMO descriptor systems (E, A, B, C) by
replacing the identity I by E in most places [36, Section 4.4.1].
If the solutions v, w of (5.14) are kept in (bi-E-)orthogonal search spaces with V, W
as basis matrices, we can invoke subspace acceleration, so that instead of the Newton
update (5.13), a small eigenproblem with (S := W ∗ AV, T := W ∗ EV) is solved in every
iteration. However, compared to SADPA (Algorithm 4.1), the only significant change in
this step is that the approximate p×m residue matrices are computed instead of the scalar
quantities in the SISO case. The Petrov triplets are ordered, e.g., with respect to kR̂k2 =
σmax (R̂), and the most dominant one is selected as current eigentriplet approximation.
See Algorithm 5.2 for the complete eigenvalue extraction routine of SAMDP.
Algorithm 5.2 (Λ̃, Q, Z) = SortM(S, T, B, C, V, W)
Input: Interaction matrices S, T ∈ Ck×k, input and output mappings B ∈ Rn×m, C ∈ Rp×n, basis matrices V, W ∈ Cn×k.
Output: Λ̃ ∈ Ck×k diagonal with poles λ̃i in residue order, Q, Z ∈ Ck×k corresponding eigenvector matrices.
1: Compute eigendecomposition of the pair (S, T):
     SQ = TQΛ̃,   Z∗S = Λ̃Z∗T,
   with Λ̃ = diag(λ̃1, . . . , λ̃k), Q = [q1, . . . , qk], Z = [z1, . . . , zk].
2: Compute Petrov vectors of (A, E): X = VQ, Y = WZ.
3: Compute residue matrices Ri = (Cxi)(y∗i B).
4: Sort Λ̃, Q, Z decreasingly with respect to a MIMO system dominance measurement ((2.17)–(2.19)).
The other important strategies: deflation and restarts are included in almost the same
way as in SADPA. The resulting algorithm is called Subspace Accelerated MIMO Dominant
Pole Algorithm (SAMDP) [38, Algorithm 2].
To conclude, the following three main changes have to be made in SADPA for the
computation of MIMO dominant poles:
• Compute the smallest eigentriplet (µmin , u, z) of H−1 (sk ) if m = p or the largest
singular triplet (σmax, u, z) of H(sk) if m ≠ p at the current shift sk. In both cases,
inverse iteration can be applied since only one triplet is sought.
• Solve the modified linear systems (5.14) using the vectors u, z.
• Use Algorithm 5.2 to reveal the most dominant approximate triplet.
In the next section we discuss how these modified steps can be carried over to admit the
computation of MIMO dominant poles in 2-JD.
5.2.3 Computation of MIMO dominant poles with 2-JD
The application of 2-JD for dominant pole computations of SISO systems is already
discussed in [36, Section 3.6.]. However, a similar usage of 2-JD for the computation of
multivariable transfer function dominant poles is not covered in the existing literature.
From the previous Chapters 3 and 4 we know that the main difference of SADPA,
SA2RQI and 2-JD are the involved linear systems whose solutions are used to expand
the search spaces. The linear systems in SADPA have the input and output vectors b and
c as right hand sides. Likewise, SAMDP uses the input and output mappings B and C in
the right hand sides in (5.14). However, since the right hand sides of 2-JD do not depend
on b and c, but on the current eigenvector approximations v and w, this gives hope that
we can compute MIMO dominant poles with 2-JD without the (possibly expensive)
computations of the directions u, z in each iteration. The MPD linear systems (5.14)
might still be a practical way to generate initial vectors v0 , w0 for 2-JD, for which we
used v0 = (A − s0 E)−1 b and w0 = (A − s0 E)−∗ c∗ for a given initial shift s0 ∈ C in the SISO
case. In a 2-JD variant for MIMO dominant pole computation, it appears reasonable to
compute at first the right and left singular vectors u, z that correspond to the largest
singular value σmax (H(s0 )). Afterwards, the initial vectors are obtained from
v0 = (A − s0E)−1 Bu and w0 = (A − s0E)−∗ C∗ z.
(5.15)
In the eigenvalue selection step, one has to use Algorithm 5.2. Although the linear
systems needed for generating H(s0) = C(s0E − A)−1B and for the initial vectors (5.15)
have to be solved exactly, the correction equations can still be solved inexactly by iterative
methods.
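A sketch of this initialization for a MIMO 2-JD variant is given below; dense solves and D = 0 are assumed, and the function name is an assumption of the example.

import numpy as np

def mimo_2jd_initial_vectors(A, E, B, C, s0):
    # Initial vectors (5.15) from the singular vectors of sigma_max(H(s0)).
    H = C @ np.linalg.solve(s0 * E - A, B)
    U, _, Vh = np.linalg.svd(H)
    z, u = U[:, 0], Vh[0, :].conj()                # left/right singular vectors
    M = A - s0 * E
    v0 = np.linalg.solve(M, B @ u)                 # (A - s0 E)^{-1} B u
    w0 = np.linalg.solve(M.conj().T, C.conj().T @ z)
    return v0, w0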
It can be observed in numerical experiments that 2-JD with these two small changes is
indeed capable of computing MIMO dominant poles, although the accuracy of the resulting
modal equivalents is sometimes worse than the accuracy of the SAMDP reduced order models.
The same strategy for generating the initial vectors can, theoretically, also be applied
more than this single time. This way, additional valuable input and output information
can be inserted into the search spaces, for instance, after a dominant pole is detected
or after a restart has been invoked. The construction of this combination of 2-JD and
SAMDP might be designated for future considerations.
6
Numerical examples
After we have investigated some algorithms that can compute dominant poles of LTI
systems for our modal approximation approach, we finally run some numerical experiments with these methods. These experiments concern mainly numerical issues of
the eigenvalue computation as well as some actual modal approximation experiments.
All experiments have been carried out in MATLAB ® 7.10 on a machine with an
Intel ® Pentium ® 4 CPU with 3.2 GHz and 1 GB RAM.
Example 1 (Orthogonalization of the search spaces) In our first example we start
with a small experiment that shows the effects of the different orthogonalization possibilities for the search spaces. For this purpose we use the PEEC model [8] which is
a descriptor system (E, A, b, c) with singular E of order n = 480. The small order of
this system will enable us to compute the complete eigendecomposition with the QZ
method for comparison reasons (as executed in Example 2).
The minimum and maximum dimensions for the search spaces are kmin = 2 and kmax =
10, and the error tolerance is set to ε = 10−8. The starting value required by SADPA
(Algorithm 4.1) is s0 = 1i and the initial vectors for SA2RQI and 2-JD (Algorithm 4.5)
are generated via v0 = (A − s0 E)−1 b and w0 = (A − s0 E)−∗ c∗ . All occurring linear systems
are solved using LU decompositions and deflation is carried out via the input and
output vectors as in (4.4). Additionally, the search spaces are reorthogonalized against
the eigenvectors corresponding to the most recently detected eigentriplet. Extensive
numerical tests revealed that this seems to be the most cost efficient and stable deflation
strategy. We let the methods run 25 iterations and use the scaled residue magnitudes
(2.14) as dominance criterion.
In Figure 6.1 the convergence histories are displayed where the plotted lines represent
min(‖rv‖, ‖rw‖) and the horizontal black dotted line shows the error tolerance. Each time
a residual norm falls below this error tolerance line, a dominant eigentriplet is detected.
To the left, Figure 6.1a shows the results when the search spaces are kept bi-E-orthogonal.
[Figure 6.1: Convergence histories (norm of residual vs. number of iteration) for 2-JD, SA2RQI and SADPA for the PEEC model [8] (n = 480) with (a) bi-E-orthogonal (W∗EV = I) and (b) orthogonal (W∗W = V∗V = I) search spaces.]
Despite some minor deviations, all three methods behave equivalently until iteration
12, where a restart is initiated (see also Theorem 4.4). After that the deviations become
slightly larger and SADPA begins to stagnate after the fourth pole (pair of complex
conjugate eigentriplets) is detected in iteration 18. The situation with orthogonal search
spaces is illustrated in Figure 6.1b on the right side. The convergence histories of SADPA
and SA2RQI are almost the same as before until iteration 18 whereas 2-JD converges
slower and less regularly than in the bi-E-orthogonal case. Further experiments led
to the observation that this is quite typical behavior of orthogonal 2-JD, potentially
caused by the deflation and projection operators in the correction equations. Although
not shown here, the difference between bi-orthogonal and orthogonal search spaces is
less drastic for state-space systems (E = I).
This motivates the use of bi-E-orthogonal 2-JD (for descriptor systems), as we will do in the
sequel. From a numerical point of view this may appear risky since it may happen that
a matrix-vector product involving the singular matrix E is zero. This is in particular
problematic in the orthogonalization of the correction vectors from the linear systems.
To circumvent this problem, one can monitor the orthogonalization process and add a
random vector to the search space if indeed Es = 0 holds for a current correction s.
We remark that this was only rarely observed in our experiments, and that especially
the orthogonal variant of 2-JD often showed worse performance.
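To make this step concrete, here is a minimal MATLAB sketch of one bi-E-orthogonal expansion step with the random-vector fallback described above; the variable names V, W, E, s, t are assumptions and this is not the thesis implementation.

    % expand bi-E-orthogonal bases V, W (with W'*E*V = I) by corrections s and t
    s = s - V*(W'*(E*s));               % afterwards W'*E*s = 0
    t = t - W*(V'*(E'*t));              % afterwards V'*E'*t = 0
    if norm(E*s) < 1e-14                % guard against E*s = 0 for singular E
        s = randn(size(E,1), 1);        % fallback: random vector (re-orthogonalized in practice)
    end
    if norm(E'*t) < 1e-14
        t = randn(size(E,1), 1);
    end
    v = s/norm(s);   w = t/norm(t);
    d = w'*E*v;                         % rescale so that the new pair is bi-E-normalized
    V = [V, v];      W = [W, w/conj(d)];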
For the PEEC model, 2-JD seems to be able to compute more accurate eigentriplets than
the other two methods: if the error tolerance is lowered to ε = 10⁻¹⁰, SADPA
and SA2RQI have severe problems converging to the desired accuracy. Furthermore,
because of the fixed right hand sides in the linear systems in SADPA, the obtained
solutions v and w make a very small angle with the current search spaces and thus
there is the possibility that the iteration stagnates, as can be seen in Figure 6.1a. See also
the numerical example in [36, Section 3.5.2], where the author suggested using a few
additional steps of 2-RQI to overcome this stagnation. This might also be an adequate
strategy to speed up the convergence and avoid unnecessary stagnation in exact 2-JD.
Example 2 (Dominance definitions) We use again the PEEC model and investigate
the effects of the different criteria of modal dominance (2.13)-(2.15). To see which one
leads to the best modal approximation we compute at first the complete generalized
eigendecomposition using the QZ method and select the k = 80 most dominant poles
with respect to all three dominance definitions. These three partial eigendecompositions
are afterwards used to construct reduced order models H̃ j ( j = 1, 2, 3) where H̃1 denotes
the result when the dominance criterion (2.13) (|R|) was used, while H̃2 and H̃3 are the
results with (2.14) (|R|/| Re (λ)|) and (2.15) (|R|/|λ|), respectively.
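As an illustration, the following minimal MATLAB sketch ranks computed eigentriplets by the three criteria; the variable names lam, X, Y, b, c, E are assumptions, and the residue formula is the standard one for simple poles, which may differ from the exact normalization used in (2.13)–(2.15).

    R = zeros(length(lam), 1);
    for j = 1:length(lam)
        xj = X(:,j);  yj = Y(:,j);
        R(j) = (c*xj)*(yj'*b)/(yj'*E*xj);            % residue of the pole lam(j)
    end
    [~, i1] = sort(abs(R),                 'descend');   % (2.13): |R|
    [~, i2] = sort(abs(R)./abs(real(lam)), 'descend');   % (2.14): |R|/|Re(lambda)|
    [~, i3] = sort(abs(R)./abs(lam),       'descend');   % (2.15): |R|/|lambda|
    dominant = lam(i2(1:80));               % e.g. the 80 most dominant poles w.r.t. (2.14)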
We also let 2-JD compute k = 80 eigenvalues with respect to all three criteria using
s0 = 1i. The error tolerance is set to ε = 10⁻⁸ and other settings are kept as before. To
improve the performance and avoid stagnation we also used switching to 2-RQI and
eigenvalue tracking with ε_RQI = 10⁻⁶ and ε_tr = 10⁻³. The results are plotted in Figure 6.2
which shows the Bode plots and the relative errors |H(s) − H̃ j (s)|/|H(s)| ( j = 1, . . . , 3) of
the modal equivalents of order k = 80 obtained with all three dominance definitions and
both methods. The original transfer function shows high frequency oscillations with
numerous peaks in the frequency interval [10⁰, 10¹] which are caused by many poles
whose imaginary parts lie in this region. These poles are dominant with respect to all
dominance measurements. The results for the direct approach using the QZ method are
illustrated in Figure 6.2a and 6.2b. Apparently, the dominant poles of the highly oscillating area are selected by all measurements and hence the relative error in this frequency
interval is small. However, there is also a number of other selected poles which differ
between the criteria; for instance, the scaled residue criterion |R|/| Re (λ)|
favors poles with small real part. This leads to the selection of less
dominant, almost imaginary eigenvalues which do not have a significant contribution
to the frequency response. The associated modal equivalent H̃2 has the worst accuracy
compared to H̃1 and H̃3 which match the original model H by far better in the Bode plot,
as well as in terms of the relative error.
Another peak is located close to ω = 50 and caused by the dominant pole λ∗ ≈
−3.233 ± 49.995i which was not selected by any criterion. Observe that this particular peak is not as high as the rest of the transfer function at the frequencies before. It is
therefore less dominant than the poles in the oscillating region which explains why it is
not detected by the dominance criteria.
In the iterative eigenvalue methods, eigentriplets are selected with respect to the approximate residues in the subspace extraction step (cf. Algorithm 4.2). The chosen
dominance definition determines which approximate eigentriplet is selected and therefore basically steers the process.
[Figure 6.2: (a) Bode plot and (b) relative error of the original PEEC model and the modal equivalents H̃1 (|R|), H̃2 (|R|/|Re(λ)|), H̃3 (|R|/|λ|) of order k = 80 obtained directly with the QZ algorithm; (c) and (d) show the corresponding results obtained with 2-JD. Frequency in rad/sec, gain in dB.]
This has a noticeable impact on the convergence behavior, which is in line with the results illustrated in Figures 6.2c and 6.2d. Again,
using |R|/| Re (λ)| results in the worst approximation, but this time it is better than in the
direct approach which is clearly visible in the Bode and relative error plot. A possible
explanation for this somewhat surprising result is that some dominant poles might be
missed by the eigenvalue selection step and other, slightly less dominant ones are detected instead. However, these poles might induce a more accurate approximation at
other frequencies and therefore yield an overall better approximation. This was especially observed if single peaks in the Bode plot are caused by several poles which are located
relatively close to each other. If the method misses one of these clustered poles, it might
possibly detect another one instead which is responsible for a peak located somewhere
else.
The results for the two other measurements are very similar to the ones obtained with
the QZ method, with one glaring difference. With the residue magnitude |R| without
the scaling by | Re (λ)| or |λ|, the algorithm was able to detect the pole λ∗ . Therefore,
the corresponding modal equivalent H̃1 matches the original transfer function around
this peak much better than the other two approximations. Further similar experiments
revealed that this might indeed be the optimal dominance measure for this example.
However, this result has to be weighed against the required computational effort, because the different criteria also have a huge impact on the convergence
speed. For instance, 2-JD needed 621 iterations to detect the 80 eigenvalues for H̃1,
210 for H̃2 and 173 iterations for H̃3. Similar observations were made for SADPA and
SA2RQI, although for the PEEC example they had bigger problems converging to the
desired accuracy.
Numerous other comparable experiments led to the conclusion that, when using iterative eigenvalue methods, the choice of the optimal dominance criterion is often to
some degree problem dependent. Using the scaled residue magnitudes |R|/| Re (λ)| or
|R|/|λ| resulted in some cases in a good performance, namely when the transfer function has
several dominant poles located close to the imaginary axis or close to the origin and
with relatively small distances to each other. This is, for instance, the case for the PEEC
model in the frequency region [10⁰, 10¹]. However, if the Bode plot shows peaks at
higher frequencies as well, as is the case for the pole λ∗ of the PEEC model, the plain
residue magnitude |R| often worked better, since it emphasizes only the residues and
not additionally the magnitudes | Re (λ)| or |λ|.
Example 3 (Exact solves) Now we test SADPA, SA2RQI and 2-JD on the much larger
Brazilian Interconnected Power System¹ (BIPS) [39] of order n = 13,251. The BIPS is
a SISO descriptor system with singular E which is diagonal and has only the entries
0 or 1. We show that the equivalence of the three methods (cf. Theorem 4.4) holds
in practice only for a limited number of iterations. Motivated by the first example, we use bi-E-orthogonal
search spaces and work with the cheap deflation via the input and output vectors, and
reorthogonalize the search spaces only against the recently found eigentriplet. The
settings for kmin, kmax are the same as in the previous example, but the error tolerance is
ε = 10⁻¹⁰. The initial shift for SADPA is s0 = 1i and the initial vectors for SA2RQI and 2-JD are also generated as before. We let the methods compute 100 dominant eigentriplets,
again with respect to the scaled residue magnitude |R|/| Re (λ)|.
Figure 6.3 shows the convergence history of the first 50 iterations of all three methods.
Apart from some small deviations, the equivalence of the methods is apparent until the
first 10 eigenvalues (5 complex conjugate pairs) are detected.
¹ available at http://sites.google.com/site/rommes/software
[Figure 6.3: Convergence histories (norm of residual vs. number of iteration) for 2-JD, SA2RQI and SADPA for the BIPS model (n = 13,251). All linear systems were solved exactly.]
Pole                           scaled residues |R|/|Re(λj)|   found in iteration
                                                              SADPA    SA2RQI   2-JD
λ1    −0.0335  ± 1.0787i       2.7558 · 10⁻³                  4        4        4
λ2    −0.5208  ± 2.8814i       1.448  · 10⁻³                  13       13       13
λ3    −0.5567  ± 3.6097i       1.3427 · 10⁻³                  17       17       17
λ4    −2.9445  ± 4.8214i       1.0262 · 10⁻³                  41       (122)    (53)
λ5    −0.1151  ± 0.2397i       9.3647 · 10⁻⁴                  9        9        9
λ6    −6.4446  ± 0.0715i       8.0099 · 10⁻⁴                  (92)     −        −
λ7    −4.9203  ± 0.2321i       6.8812 · 10⁻⁴                  (104)    −        −
λ8    −7.5118  ± 11.001i       4.8257 · 10⁻⁴                  (86)     −        −
λ9    −2.3488  ± 1.1975i       4.3162 · 10⁻⁴                  (51)     −        −
λ10   −10.068  ± 10.771i       3.8148 · 10⁻⁴                  (81)     −        −
λ11   −1.4595                  3.6353 · 10⁻⁴                  32       28       28
λ12   −0.75884 ± 4.9367i       2.3467 · 10⁻⁴                  44       (100)    41

Table 6.1: Excerpt of the found poles and corresponding residues of the BIPS system for the three methods. Iteration numbers marked with brackets represent poles that were found after the first 50 iterations, while a minus sign indicates that the pole was not found by the particular method.
Afterwards, at iteration 23, a second restart is initiated and rounding errors begin to spoil the processes such
that the methods behave differently but still converge to a number of further eigenvalue
pairs. However, the pairs found by each method after the restart are now different.
Table 6.1 is a summary of the first few dominant poles detected in this experiment
and also shows the iteration number in which the pole was found. A minus sign
indicates that the methods did not compute the particular pole, while numbers in
brackets refer to poles that were found after the first 50 iterations shown in Figure 6.3.
[Figure 6.4: (a) Bode plot and (b) relative error of the original BIPS model and the reduced order models (r.o.m.) of order k = 100 obtained with 2-JD, SA2RQI and SADPA. Frequency in rad/sec, gain in dB.]
Note that the order in which the poles are found is not the order of the scaled residues.
Some dominant poles are missed by 2-JD and SA2RQI, e.g., both methods fail to detect
λ6 ≈ −6.445 ± 0.071i. Although some of the missed poles are possibly found during the
subsequent iterations, further similar experiments revealed that 2-JD and SA2RQI tend
to miss some dominant poles and to compute less dominant ones instead. The better
convergence of SADPA towards dominant poles can be explained by the input and
output vectors b and c, which serve as fixed right hand sides of the linear systems.
In 2-JD and SA2RQI, information from b and c enters the process only approximately
during the subspace extraction procedure. Moreover, as more poles are detected, it
could be observed that 2-JD begins to compute many less dominant poles with very
small or zero imaginary part. An explanation could be that, caused by the oblique
projectors in the correction equations, it suffers more severely from rounding errors than
the other methods. It is therefore advised to compute the eigentriplets more accurately,
e.g. by using ε = 10⁻¹², for which 2-JD showed a better performance. Likewise, with
ε = 10⁻⁸ it suffered even more severely from the possible numerical instabilities and the
convergence was much less regular. The orthogonal variant of 2-JD showed worse
performance in most experiments, and the detection of less dominant poles was still
present. Because of this, and because the linear systems of 2-JD are more expensive
to solve than the ones of SADPA and SA2RQI, we conclude that, if exact solves are
affordable, SADPA is the method of choice for this example.
We continued this experiment and used k = 100 detected eigentriplets to construct
reduced order models (r.o.m.) for all three algorithms. The original BIPS transfer
function, the three modal equivalents of order k = 100, and the corresponding relative
errors are plotted in Figure 6.4. It is no surprise that SADPA delivers the most accurate
modal equivalent since it computed the most dominant poles. With the fewest computed
truly dominant poles, the modal equivalent of 2-JD is the worst approximation in this
example. The relative error plot in Figure 6.4b shows that around the frequency range
where the protruding peaks in the Bode magnitude plot are located, the approximation
obtained with SA2RQI is also quite good, but less accurate in the higher frequency
region. Also in this area, the relative error of the 2-JD and SA2RQI reduced order
models is very small around two particular locations. One should not conclude a better
approximation there from this, since these two dips are located exactly
where the modal equivalents cross the original transfer function.
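For completeness, a minimal MATLAB sketch of how such a modal equivalent can be assembled from k detected eigentriplets and compared with the original transfer function; the variable names (X, Y holding the right and left eigenvectors as columns) are assumptions, this is not the thesis code, and the standard descriptor-system convention H(s) = c(sE − A)⁻¹b is assumed.

    Er = Y'*E*X;   Ar = Y'*A*X;        % projected pencil; (nearly) diagonal for exact triplets
    br = Y'*b;     cr = c*X;           % projected input and output vectors
    H  = @(s) c *((s*E  - A )\b );     % original transfer function
    Hk = @(s) cr*((s*Er - Ar)\br);     % modal equivalent of order k
    omega  = logspace(0, 1, 300);      % frequency grid for the error plot
    relerr = arrayfun(@(om) abs(H(1i*om) - Hk(1i*om))/abs(H(1i*om)), omega);
    loglog(omega, relerr);             % relative error as in Figure 6.4b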
Example 4 (Inexact solutions and preconditioning) In our next experiment we
solve the linear systems only approximately by applying 10 steps of GMRES with the
exact factorization LU = s0 E − A (s0 = 1i) as fixed preconditioner throughout the whole
process which is a common strategy mentioned in the literature [12]. All other settings
are kept as in the previous example. The convergence histories are shown in Figure
6.5a.
All methods detect the first dominant pole λ1 ≈ −0.0335 ± 1.079i after the first iterations
almost as fast as in the previous example with exact solves. Afterwards, SADPA
ends up in stagnation while 2-JD converges, although irregularly and slower, to λ5 ≈
−0.115 ± 0.239i in iteration 24. The subscripts indicate the position of the poles in
Table 6.1 from the previous example. SA2RQI also detects λ5 but needs 34 iterations.
An interesting observation is that the farther away the approximate pole is from the
initial value s0 = 1i, and therefore from the shift for the preconditioner, the more the
convergence seems to deteriorate. This suggests that the usage of a fixed preconditioner
is not always a good choice for the computation of dominant poles because they are
usually scattered in the whole complex plane. An update of the preconditioner might
be a necessary strategy.
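As an illustration, a minimal MATLAB sketch of such an inexact solve with the fixed preconditioner; rhs stands for the current right hand side of the respective method and theta for the current shift, both hypothetical placeholders, and this is not the thesis implementation.

    s0 = 1i;
    [L, U, P, Q] = lu(sparse(s0*E - A));       % exact sparse LU, computed only once
    prec = @(x) Q*(U\(L\(P*x)));               % applies (s0*E - A)\x as preconditioner
    x = gmres(A - theta*E, rhs, [], 1e-12, 10, prec);   % 10 GMRES steps, no restart
    % for the updated variant, L, U, P, Q are recomputed from theta*E - A each time
    % a pole has converged and after a restart, as described below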
Figure 6.5b shows the convergence histories when the preconditioner is updated each
time after a pole has converged, and after a restart. The new preconditioner is computed
as LU = θE − A with the current eigenvalue approximation θ. This accelerates the
convergence of all methods significantly; e.g., all methods detect the poles λ1 and
λ5 after 12 iterations. Afterwards, SADPA stagnates again for a while until it detects
two less dominant poles around iteration 43. SA2RQI and 2-JD find, at a different speed,
the poles λ2 ≈ −0.521 ± 2.881i and λ3 ≈ −0.557 ± 3.61i. Later on, 2-JD also converges
to λ12 ≈ −0.758 ± 4.937i and λ13 ≈ −0.693 ± 3.253i (not contained in Table 6.1) around
iteration 35. At first glance, updating the preconditioner appears promising, but it is
difficult to find an optimal update time during the iteration. This can be observed
around (and after) iteration 25, where the chosen shift is inappropriate and the methods
lose track for a period of time. The usage of multiple user-provided shifts, for instance
the approximate locations of the dominant poles, could help to prevent this issue. If the
original transfer function is known, for example if it can be identified by measurements, one could look at the frequencies ω where the peaks are located and choose iω as shifts.
[Figure 6.5: Convergence histories (norm of residual vs. number of iteration) for 2-JD, SA2RQI and SADPA for the BIPS system (n = 13,251), with all linear systems solved by 10 steps of GMRES: (a) fixed preconditioner LU = iE − A; (b) preconditioner LU = θE − A updated after a triplet has been detected and after a restart.]
Furthermore, other more cost-efficient preconditioners, e.g. incomplete LU decompositions, can be used to reduce the additional computational effort caused by the
updates. Unfortunately, it is difficult to obtain useful incomplete factorizations for the
generalized eigenvalue problem, that is, for pencils A − θE. For the BIPS model, for
instance, the drop tolerance for an incomplete LU decomposition had to be set very small,
to ≈ 10⁻⁶, in order to get nonsingular factors, so that one could just as well keep the exact factors. Experiments using the modified incomplete LU decomposition also yielded no significant
improvements.
Example 5 (Alternating eigenvalue methods) In this experiment we use the clamped
beam model² [8] of order n = 348 to demonstrate the performance of the alternating
eigenvalue algorithms which were introduced in Section 4.3. The following settings
are used: ε = 10⁻⁹, kmin = 4, kmax = 15, s0 = 1i. We inserted two initial vectors
v0 = (A − s0 E)−1 b and w0 = v1 = (A − s0 E)−∗ c∗, which are orthogonalized against each
other to form a two-dimensional initial subspace. Note that the intrinsic alternating
methods require only one initial vector, but with respect to dominant pole computation
the additional vector w0 appears reasonable since it brings additional information from
the output vector c into the initial search space. With the scaled residue magnitude
(2.14) we test SAARQI and AJD (Algorithm 4.6) for this system of rather moderate size.
² available at http://www.icm.tu-bs.de/NICONET/benchmodred.html
[Figure 6.6: Convergence histories (norm of residual vs. number of iteration) for SAARQI and AJD for the clamped beam model (n = 348). All linear systems were solved exactly using LU decompositions.]
[Figure 6.7: (a) Bode plot and (b) relative error of the original beam model and the k = 14 modal equivalents obtained with AJD and SAARQI.]
The convergence history of the first 50 iterations is plotted in Figure 6.6.
Although the proposed equivalence of both alternating methods (Theorem 4.9) is observable in the first iterations, it quickly vanishes after the first pole is detected and
deflated in iteration 18. Furthermore, SAARQI and AJD suffer from a loss of track
early in the process, around iteration 10. Unfortunately, this was often observed during
this experiment. A possible explanation is that, for instance in the odd iterations, the
computed approximate left eigenvectors are of low quality due to the usage of a single
search space (see Section 4.3.2). In the even iterations a similar argument holds for the
approximate right eigenvectors. This leads to approximate residues which might be
only of moderate or even low accuracy. Hence, it is very likely that the errors in the
residues lead to a misselection of the dominant eigentriplets. This prolongs the process
remarkably since the algorithms lose track of the iteration more often. The experiment
was continued until iteration 300 but both alternating methods managed to detect in
total only 14 dominant eigentriplets.
The obtained modal equivalents are nevertheless shown in Figure 6.7. The Bode magnitude plot in Figure 6.7a shows that the first outstanding peaks of the exact model are
matched accurately, but the subsequent peaks are not reproduced by either of the reduced order models obtained with the two alternating methods. Hence, the relative error in Figure
6.7b is small in the lower but large in the higher frequency region, resulting in a modal
equivalent of overall poor quality. This and the many other similar experiments with
different systems disqualify the alternating methods for the computation of dominant
poles. A more robust computation strategy for the approximate residues is needed here.
On the contrary, further experiments investigating intrinsic eigenvalue computations,
that is, without the selection of eigentriplets based on the approximate residues, revealed a much better performance of the alternating methods. Nevertheless, in most examples the
convergence was, for instance compared to 2-JD, rather slow.
Example 6 (Harmonic subspace extractions) In our next experiment we test the
two-sided harmonic extraction approaches which were introduced in Section 5.1. We
re-run Example 4 with the BIPS model and use 2-JD with the standard, the generalized
two-sided harmonic (5.7), and the double one-sided harmonic Petrov-Galerkin (5.8) extraction. The target is τ = s0 = 1i and the scaling factor for Algorithm 5.1 is γ = 0.95.
The target also serves as shift for the preconditioner LU = iE − A which is kept fixed
and the other parameters are set as in Example 4.
Figure 6.8 shows the convergence history for 100 iterations in all three cases. Apparently,
with the generalized two-sided harmonic approach 2-JD does not converge and stagnates
with min(‖rv‖2, ‖rw‖2) ≈ 10⁻², whereas with the double one-sided harmonic Petrov-Galerkin extraction it finds five poles and nearly converges to a sixth one in the end.
Using the standard extraction results in three detected eigentriplets, but the convergence
is remarkably slower and less regular. For both approaches that did find eigentriplets,
the found poles are summarized in Table 6.2.
The double one-sided harmonic Petrov-Galerkin extraction not only finds the most
poles; the found ones are also more dominant. However, the term dominant has to be
put in a relative context here since the associated scaled residue magnitudes are very
low. The extraction approach with γ = 0.95 seems still to emphasize the distance to the
target τ more than the actual dominance. We leave the construction of a possibly more
balanced weighting between distance and dominance for future work.
[Figure 6.8: Convergence histories (norm of residual vs. number of iteration) for 2-JD with the standard, the generalized two-sided harmonic (gen. 2-harm.) and the double one-sided harmonic Petrov-Galerkin (double 1-harm.) extraction for the BIPS model with τ = i and γ = 0.95. All linear systems were solved with 10 steps of GMRES and LU = τE − A.]
Pole                        scaled residues |R|/|Re(λj)|   found in iteration
                                                           double 1-harm.   std.
λ1    −0.0335 ± 1.0787i     2.7558 · 10⁻³                  5                5
λ2    −0.4137 ± 0.4825i     9.8394 · 10⁻⁶                  70               −
λ3    −0.4562 ± 0.5427i     9.0457 · 10⁻⁷                  57               −
λ4    −0.4709 ± 0.8273i     6.2224 · 10⁻⁸                  27               39
λ5    −0.4184 ± 0.7000i     1.4981 · 10⁻⁹                  32               50

Table 6.2: Summary of the found poles and corresponding residues of the BIPS system using different subspace extractions. A minus sign indicates that the pole was not found.
Example 7 (Computation of MIMO dominant poles) Now we experiment with the
methods from Section 5.2 and compute dominant poles of multivariable transfer functions. The International Space Station (ISS) model³ [8] is a small square MIMO system
of order n = 270 with three inputs and outputs. We compare SAMDP [38] and 2-JD with
our slight modifications, using s0 = 1i, ε = 10⁻¹⁰ and orthogonal search spaces, since
the ISS model is a state-space system (E = I). No switching to RQI or eigenvalue tracking is used, the dominance criterion is (2.17) (‖R‖2), and the occurring linear systems
are solved exactly. All other settings are kept as before. According to (5.15), the initial
vectors for 2-JD are now generated by using the maximum singular triplet of H(s0 ). To
save computational costs in SAMDP, the required search directions u and z associated to
σmax (H(s)) are updated only after a dominant pole has been detected and after a restart.
³ available at http://www.icm.tu-bs.de/NICONET/benchmodred.html
[Figure 6.9: Sigma plot of the complete ISS 3 × 3 transfer function and the k = 40 modal equivalent computed with 2-JD (σmax and σmin of the exact and reduced models and the error ‖H(iω) − Hk(iω)‖2; frequency in rad/sec, gain in dB).]
Figures 6.9 and 6.10 show the sigma plots of the complete model, the modal equivalents obtained with 2-JD and SAMDP, and the associated errors ‖H(iω) − Hk(iω)‖2. The
approximations were constructed with the first k = 40 eigentriplets (20 pairs / states)
computed by both algorithms. To compute these eigenvalues, 2-JD needed 78 iterations
while SAMDP terminated after 87 iterations. Both modal equivalents match the σmax
line almost perfectly, such that one can barely see a difference in the σmax plots between
both reduced order models. In the σmin plot, however, the dip around the frequency
10⁰ is not reproduced by the reduced order model computed with 2-JD. There, the neglected Newton updates (singular vectors u, z corresponding to σmax of H) have their
price: the error line in both figures shows that the SAMDP approximation is much more
accurate in the whole considered frequency region than the reduced order model by
2-JD. Apparently it computed some triplets which are less dominant compared to the
ones detected by SAMDP. The average norm of the residue matrices associated to the poles found by SAMDP is slightly larger: ‖R‖^SAMDP_avg ≈ 2.84 · 10⁻⁴ vs. ‖R‖^2-JD_avg ≈ 2.82 · 10⁻⁴.
This is similar to the effect we saw in the SISO case (cf. Example 3): apart from the
initial vectors, information from the input and output mappings B and C, respectively,
is only inserted approximately in the eigenvalue extraction step of 2-JD. Hence, the more
poles are detected, the more the method tends to compute less dominant or even
arbitrary ones.
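To make the comparison concrete, a minimal MATLAB sketch of the data behind such sigma and error plots; the variable names A, E, B, C are assumptions, and Hk is assumed to be a function handle returning the reduced transfer function matrix, e.g. assembled by projection as in the earlier sketch.

    omega = logspace(-1, 3, 200);            % frequency grid as in Figures 6.9 and 6.10
    smax = zeros(size(omega)); smin = smax; err = smax;
    for j = 1:numel(omega)
        Hj = C*((1i*omega(j)*E - A)\B);      % full transfer function matrix at i*omega(j)
        sv = svd(full(Hj));
        smax(j) = sv(1);   smin(j) = sv(end);
        err(j)  = norm(full(Hj - Hk(1i*omega(j))), 2);
    end
    semilogx(omega, 20*log10([smax; smin; err]));   % gains in dB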
[Figure 6.10: Sigma plot of the complete ISS 3 × 3 transfer function and the k = 40 modal equivalent computed with SAMDP.]
[Figure 6.11: Sigma plots of the complete BIPS 8 × 8 transfer function and the k = 251 modal equivalents computed with (a) 2-JD and (b) SAMDP.]
For the MIMO case, this drawback might have an even greater impact, as can be seen for larger multivariable transfer functions.
Figure 6.11 shows the result of an experiment that has been carried out similarly for
the BIPS 8 × 8 MIMO model⁴ [38] of order n = 13,309. All settings were kept as before,
except that we used switching to 2-RQI with ε_RQI = 10⁻⁸. The figures show again the
sigma plots of the exact model and the modal equivalents of order k = 251 computed
with 2-JD in Figure 6.11a and with SAMDP in Figure 6.11b. The σmax -plot of the 2-JD
model deviates from the exact model at the higher frequencies. The σmin -plot of the
modal equivalent does not match the one of the original model in the whole considered
frequency region. In contrast, both σmax - and σmin -plot are matched almost perfectly
by the SAMDP model. This is also displayed by the lower error for all frequencies and by the average norm of the residue matrices: ‖R‖^SAMDP_avg ≈ 3.84 · 10⁻² whereas ‖R‖^2-JD_avg ≈ 3.07 · 10⁻².
To improve the accuracy of the 2-JD reduced order models, the strategy for generating
the initial vectors could in principle also be applied after an (implicit) restart has been
initiated, in order to add extra basis vectors to the search spaces.
The generation of the singular vectors u, z corresponding to σmax (H) requires the exact
solution of m linear systems for the computation of the transfer function. Consequently,
the permanent use of inexact solutions of the occurring linear systems is obstructed.
However, first experiments showed no significant improvement and the convergence
of 2-JD was sometimes even slowed down remarkably. A more reliable combination of
2-JD and the Newton step of SAMDP might be an interesting topic for future research.
⁴ available at http://sites.google.com/site/rommes/software
7 Summary and Outlook
7.1 Conclusions
The approximation of large-scale linear time invariant systems has become a popular
research area in the last decades. In this thesis we discussed modal approximation as
one such model order reduction technique. Using a Petrov-Galerkin style projection, the
reduced order model is obtained by projecting the original system onto the right and left
eigenspaces corresponding to a certain subset of the eigenvalues of the system matrices.
Although this idea is relatively simple, by choosing appropriate eigenvalues, namely
the dominant poles of the system’s transfer function, it is possible to obtain accurate
reduced order models. The dominant poles have a large contribution in the partial
fraction representation of the transfer function and cause peaks in the Bode magnitude
plot. Consequently, the application of modal approximation requires the computation
of several dominant eigentriplets.
This motivated the investigation of certain two-sided eigenvalue methods in this thesis that
are able to compute dominant poles and the corresponding right and left eigenvectors
for SISO systems. These algorithms are subspace accelerated variants of basic Rayleigh
quotient style iterations we reviewed in Chapter 3. In Chapter 4, we discussed at first
SADPA [39, 36], which was intrinsically designed for the computation of dominant
poles. Afterwards, the two-sided Jacobi-Davidson method (2-JD) [20, 50] was described
in more detail. It is a two-sided modification of the standard Jacobi-Davidson algorithm
that is able to approximate left eigenvectors simultaneously with the right ones. We
showed that, similar to the one-sided JD variants [11, 47], 2-JD can also be derived as
a Newton scheme.
Another Jacobi-Davidson variant is the alternating Jacobi-Davidson method (AJD) [20],
where right and left eigenvectors are approximated alternately. We proved that this
method is, under some assumptions, equivalent to a subspace accelerated version of the
alternating Rayleigh quotient iteration [31]. We also introduced some new adjustments
to make these alternating methods capable of computing dominant poles.
For an efficient computation of several dominant poles, techniques such as deflation,
restarts, inexact solves and preconditioning have been considered for the investigated
eigenvalue methods.
Since dominant eigenvalues can be located in the interior of the spectrum, harmonic
subspace extractions have been investigated in Chapter 5.1. It turned out that it is
difficult to construct a two-sided harmonic extraction for the generalized eigenproblem
and we proposed two novel approaches for this purpose. A possible combination of harmonic subspace extraction with the computation of dominant poles has been
introduced as well.
The computation of dominant poles of multivariable transfer functions of MIMO systems is, compared with the SISO case, an even more challenging task which has been
analyzed in Chapter 5.2. We reviewed SAMDP [38], a modification of SADPA for the
computation of MIMO dominant poles, and examined if and how 2-JD can be upgraded similarly. The proposed modifications of 2-JD do not compromise the robustness
of the method with respect to inexact solves, although exact solutions of linear systems
might be required once to obtain good initial vectors.
The investigated eigenvalue algorithms have been tested in various numerical experiments with respect to different aspects and the results were presented in Chapter 6. It
turned out that, if exact solves of the encountered linear systems are available, SADPA
is often the most robust method for dominant pole computation. If only inexact solves
are affordable, then 2-JD is the method of choice. Notably, it was revealed that for the
computation of dominant eigentriplets, some of the usual strategies for this task have to
be altered. For instance, updating of the involved preconditioner can greatly improve
the performance for this purpose.
7.2 Future research perspectives
All the reviewed methods incorporate several parameters that influence various subprocesses in each iteration, for instance the minimum and maximum search space dimensions
kmin, kmax, the different normalization possibilities, and the error tolerance ε. For inexact solves there are also the number of inner iterations, the
applied iterative method, and the preconditioner. Finding an optimal tuning of all these
settings could be a formidable or almost impossible task. However, adapting the
number of inner iterations to the accuracy of the current approximate eigentriplets
within 2-JD appears to be within reach, since this was achieved for
the one-sided JD variants [19, 29, 51, 52].
We have seen in the numerical examples that updating the preconditioner, when the
linear systems are solved only approximately, can greatly improve the performance. Since
dominant poles are often scattered over the complex plane, it even seems advisable
to invoke such an update. However, it is not clear when and how to update. A naive
timing, for example after a triplet has been found and after a restart, can become very
inefficient. Finding a preconditioner that is easily recomputable appears to be difficult,
too, especially for the generalized eigenproblem.
Another observation was that SADPA sometimes detected more dominant poles in
comparison with the other methods, but showed the worst performance when the linear
systems are solved inexactly. Both effects can result from the fixed right hand sides in
the involved linear systems. Due to the projection onto the orthogonal complements of
the previous eigenvector approximations, 2-JD is more robust with respect to inexact
solves. A possible combination of both methods could be the insertion of one or more
vectors generated with DPA from time to time into the search spaces of 2-JD. This might
even be more crucial for the computation of dominant poles of MIMO systems with
2-JD using the proposed modifications.
Two-sided harmonic subspace extraction approaches for the generalized eigenvalue
problem are also almost absent from the literature. The two newly proposed versions require further theoretical examination before a practicable implementation in
a two-sided eigenvalue method such as 2-JD can unfold its full potential. If there
is such an extraction which works as reliably as the one-sided harmonic approaches
[16, 48], subspace accelerated algorithms for dominant pole computation could become
significantly more efficient. Other variations of subspace extractions [17, 21] and their
potential deployment for this purpose might also be worth examining.
Moreover, the new derivation of 2-JD as a Newton scheme might reveal more general
correction equations when different harmonic subspace extractions are involved, as is,
e.g., the case in the JDSVD method [18] for singular value problems.
The oblique projectors in 2-JD can cause numerical instabilities, especially when the
method is applied to generalized eigenproblems. In [45, 46], two-sided Jacobi-Davidson
variants for nonlinear eigenproblems are proposed that work with correction equations
where only orthogonal operators are involved. For the linear eigenproblems discussed
in this thesis, the correction equation would then be of the form
(I − ww∗)(A − θE)(I − vv∗) s = −(A − θE)v = −rv,   s ⊥ v,
(I − vv∗)(A − θE)∗(I − ww∗) t = −(A − θE)∗w = −rw,   t ⊥ w,
where (θ, v, w) denotes again an approximate eigentriplet of (A, E). Using these
correction equations could yield a more stable method.
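To indicate how such a correction equation could be handled, a minimal MATLAB sketch solving the first equation matrix-free with a few GMRES steps; the variable names v, w, theta, A, E are assumptions, and this is only a sketch, not an implementation from the cited references.

    Pw = @(x) x - w*(w'*x);                 % orthogonal projector I - w*w'
    Pv = @(x) x - v*(v'*x);                 % orthogonal projector I - v*v'
    op = @(x) Pw((A - theta*E)*Pv(x));      % projected operator of the correction equation
    rv = (A - theta*E)*v;                   % residual; w'*rv = 0 if theta is the two-sided
    s  = gmres(op, -rv, [], 1e-2, 10);      %   Rayleigh quotient, so the system is consistent
    s  = Pv(s);                             % enforce the constraint s ⊥ v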
In view of nonlinear eigenvalue problems, computing dominant poles of second-order
systems can also be achieved with modifications of the investigated eigenvalue solvers.
As derived in [36, 40], the transfer function of a second-order system is given by
H(s) = C(s²M + sL + K)⁻¹B + D = Σ_{j=1}^{2n} Rj/(s − λj) + D
with residues Rj = (Cxj)(yj∗B)λj corresponding to the eigentriplets (λj, xj, yj) of the associated quadratic eigenvalue problem. Modal equivalents of the second-order system
can again be obtained by collecting the eigentriplets with large (scaled) residues. A
DPA style algorithm for this task is the Subspace Accelerated Quadratic Dominant Pole
Algorithm (SAQDPA) [40]. Of course, two-sided Jacobi-Davidson style algorithms for
the quadratic eigenvalue problem [7, 20, 21, 49] can also be applied for this purpose.
However, in order to get even more accurate reduced order models, combinations of
modal approximation with other model order reduction techniques, such as Krylov
subspace or balanced truncation methods, are possible. For instance, one can combine
the retrieved right and left eigenspaces with the bases obtained by a rational Krylov
method [14] as it is examined in [36, Section 3.8].
A combination of modal truncation and balanced truncation [28] is proposed in [44],
where the dominant poles are used as shifts for the ADI method that solves the occurring
Lyapunov equations.
Bibliography
[1] L. A. Aguirre, Quantitative measure of modal dominance for continuous systems, Proceedings of the 32nd IEEE Conference on Decision and Control, 3 (1993), pp. 2405–2410.
[2] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems, SIAM, Philadelphia, PA, USA, 2005.
[3] Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. A. Van der Vorst, Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, Philadelphia, PA, USA, 2000.
[4] P. Benner, Control theory, in Handbook of Linear Algebra, L. Hogben, ed., Discrete Mathematics and Its Applications, Chapman & Hall/CRC, Boca Raton, Florida, 2006, ch. 57.
[5] P. Benner, Numerical linear algebra for model reduction in control and simulation, GAMM Mitteilungen, 29 (2006), pp. 275–296.
[6] T. Betcke and H. Voss, A Jacobi–Davidson-type projection method for nonlinear eigenvalue problems, Future Generation Computer Systems, 20 (2004), pp. 363–372.
[7] A. G. L. Booten, D. R. Fokkema, G. L. G. Sleijpen, and H. A. van der Vorst, Jacobi–Davidson type methods for generalized eigenproblems and polynomial eigenproblems, 36 (1996).
[8] Y. Chahlaoui and P. Van Dooren, A collection of benchmark examples for model reduction of linear time invariant dynamical systems, Tech. Rep. 2002–2, SLICOT Working Note, Feb. 2002. Available from www.slicot.org.
[9] B. N. Datta, Numerical methods for linear control systems: design and analysis, Elsevier Academic Press, 2004.
[10] E. R. Davidson, The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices, Journal of Computational Physics, 17 (1975), pp. 87–94.
[11] D. R. Fokkema, G. L. G. Sleijpen, and H. A. Van der Vorst, Accelerated inexact Newton schemes for large systems of nonlinear equations, SIAM Journal on Scientific Computing, 19 (1998), pp. 657–674.
[12] D. R. Fokkema, G. L. G. Sleijpen, and H. A. Van der Vorst, Jacobi–Davidson style QR and QZ algorithms for the reduction of matrix pencils, SIAM Journal on Scientific Computing, 20 (1998), pp. 94–125.
[13] G. H. Golub and C. F. Van Loan, Matrix Computations (3rd ed.), Johns Hopkins University Press, Baltimore, MD, USA, 1996.
[14] E. J. Grimme, Krylov Projection Methods For Model Reduction, PhD thesis, University of Illinois, 1997.
[15] M. Günther, U. Feldmann, and E. J. W. ter Maten, Modelling and discretization of circuit problems, in Numerical Analysis in Electromagnetics, Special Volume of Handbook of Numerical Analysis, vol. XIII, W. H. A. Schilders and E. J. W. ter Maten, eds., Elsevier Science BV, 2005, pp. 523–659.
[16] M. Hochbruck and M. E. Hochstenbach, Subspace extraction for matrix functions, Preprint, Department of Mathematics, Case Western Reserve University, Cleveland, Ohio, USA, September 2005. Submitted.
[17] M. E. Hochstenbach, Variations on harmonic Rayleigh–Ritz for standard and generalized eigenproblems, Preprint.
[18] M. E. Hochstenbach, A Jacobi–Davidson type SVD method, SIAM Journal on Scientific Computing, 23 (2001), pp. 606–628.
[19] M. E. Hochstenbach and Y. Notay, Controlling inner iterations in the Jacobi–Davidson method, SIAM Journal on Matrix Analysis and Applications, 31 (2009), pp. 460–477.
[20] M. E. Hochstenbach and G. L. G. Sleijpen, Two-sided and alternating Jacobi–Davidson, Linear Algebra and its Applications, 358(1-3) (2003), pp. 145–172.
[21] M. E. Hochstenbach and G. L. G. Sleijpen, Harmonic and refined extraction methods for the polynomial eigenvalue problem, Numerical Linear Algebra with Applications, 15 (2008), pp. 35–54.
[22] C. G. J. Jacobi, Über eine neue Auflösungsart der bei der Methode der kleinsten Quadrate vorkommende linearen Gleichungen, Astronomische Nachrichten, (1845), pp. 297–306.
[23] T. Kailath, Linear Systems, Englewood Cliffs, NJ, 1980.
[24] P. Kunkel and V. Mehrmann, Differential–Algebraic Equations - Analysis and Numerical Solution, Textbooks in Mathematics, European Mathematical Society, 2006.
[25] P. Lancaster, On eigenvalues of matrices dependent on a parameter, Numerische Mathematik, 6 (1964), pp. 377–387.
[26] N. Martins, L. Lima, and H. Pinto, Computing dominant poles of power system transfer functions, IEEE Transactions on Power Systems, 11 (1996), pp. 162–170.
[27] N. Martins and P. Quintao, Computing dominant poles of power system multivariable transfer functions, IEEE Transactions on Power Systems, 18 (2003), pp. 152–159.
[28] B. Moore, Principal component analysis in linear systems: Controllability, observability, and model reduction, IEEE Transactions on Automatic Control, AC-26 (1981), pp. 17–32.
[29] Y. Notay, Combination of Jacobi-Davidson and conjugate gradients for the partial symmetric eigenproblem, Numerical Linear Algebra with Applications, 9 (2000), pp. 21–44.
[30] A. M. Ostrowski, On the convergence of the Rayleigh quotient iteration for the computation of the characteristic roots and vectors. III (generalized Rayleigh quotient and characteristic roots with linear elementary divisors), Archive for Rational Mechanics and Analysis, 3 (1959), pp. 325–340.
[31] B. N. Parlett, The Rayleigh quotient iteration and some generalizations for nonnormal matrices, Mathematics of Computation, 28 (1974), pp. 679–693.
[32] B. N. Parlett, Reduction to tridiagonal form and minimal realizations, SIAM Journal on Matrix Analysis and Applications, 13 (1992), pp. 567–593.
[33] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, Philadelphia, PA, USA, 1998.
[34] G. Peters and J. H. Wilkinson, Inverse iteration, ill-conditioned equations and Newton’s method, SIAM Review, 21 (1979), pp. 339–360.
[35] T. Reis, Model reduction of electrical circuits, Sept. 2009. Casa Autumn School on Future Developments in Model Order Reduction, Terschelling, The Netherlands.
[36] J. Rommes, Methods for eigenvalue problems with applications in model order reduction, PhD thesis, Universiteit Utrecht, 2007.
[37] J. Rommes, Arnoldi and Jacobi-Davidson methods for generalized eigenvalue problems Ax = λBx with singular B, Mathematics of Computation, 77 (2008), pp. 995–1015.
[38] J. Rommes and N. Martins, Efficient computation of multivariable transfer function dominant poles using subspace acceleration, IEEE Transactions on Power Systems, 21 (2006), pp. 1471–1483.
[39] J. Rommes and N. Martins, Efficient computation of transfer function dominant poles using subspace acceleration, IEEE Transactions on Power Systems, 21 (2006), pp. 1218–1226.
[40] J. Rommes and N. Martins, Computing transfer function dominant poles of large-scale second-order dynamical systems, SIAM Journal on Scientific Computing, 30 (2008), pp. 2137–2157.
[41] J. Rommes and G. L. G. Sleijpen, Convergence of the dominant pole algorithm and Rayleigh quotient iteration, SIAM Journal on Matrix Analysis and Applications, 30 (2008), pp. 346–363.
[42] A. Ruhe, The two-sided Arnoldi algorithm for nonsymmetric eigenvalue problems, in Matrix Pencils, B. Kågström and A. Ruhe, eds., vol. 973, Springer Berlin / Heidelberg, 1983, pp. 104–120.
[43] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia, PA, USA, 2003.
[44] J. Saak, Efficient Numerical Solution of Large Scale Algebraic Matrix Equations in PDE Control and Model Order Reduction, PhD thesis, TU Chemnitz, July 2009.
[45] K. Schreiber, Nonlinear Eigenvalue Problems: Newton-type Methods and Nonlinear Rayleigh Functionals, PhD thesis, TU Berlin, 2008.
[46] K. Schreiber and H. Schwetlick, A primal-dual Jacobi-Davidson-like method for nonlinear eigenvalue problems, tech. rep., 2007.
[47] G. L. G. Sleijpen and H. A. Van der Vorst, The Jacobi–Davidson method for eigenvalue problems and its relation with accelerated inexact Newton scheme, BIT Numerical Mathematics, 36 (1996), pp. 595–633.
[48] G. L. G. Sleijpen and H. A. Van der Vorst, A Jacobi–Davidson iteration method for linear eigenvalue problems, SIAM Review, 42 (2000), pp. 267–293.
[49] G. L. G. Sleijpen, H. A. van der Vorst, and M. B. van Gijzen, Quadratic eigenproblems are no problem, SIAM News, 29 (1996), pp. 8–9.
[50] A. Stathopoulos, A case for a biorthogonal Jacobi–Davidson method: Restarting and correction equation, SIAM Journal on Matrix Analysis and Applications, 24 (2002), pp. 238–259.
[51] A. Stathopoulos, Nearly optimal preconditioned methods for Hermitian eigenproblems under limited memory. Part I: Seeking one eigenvalue, SIAM Journal on Scientific Computing, 29 (2007), pp. 481–514.
[52] A. Stathopoulos and J. R. McCombs, Nearly optimal preconditioned methods for Hermitian eigenproblems under limited memory. Part II: Seeking many eigenvalues, SIAM Journal on Scientific Computing, 29 (2007), pp. 2162–2188.
[53] G. W. Stewart, Matrix Algorithms, vol. II: Eigensystems, SIAM, Philadelphia, PA, USA, 2001.
[54] F. Tisseur and K. Meerbergen, The quadratic eigenvalue problem, SIAM Review, 43 (2001), pp. 235–286.
[55] P. Tsiotras, The relation between the 3-D Bode diagram and the root locus: insights into the connection between these classical methods, IEEE Control Systems Magazine, 25 (2005), pp. 88–96.
[56] H. A. Van der Vorst, Computational Methods for Large Eigenvalue Problems, vol. VIII, P. G. Ciarlet and J. L. Lions, eds., 2000.
[57] A. Varga, Enhanced modal approach for model reduction, Mathematical Modelling of Systems, 1 (1995), pp. 91–105.
[58] H. Voss, A new justification of the Jacobi-Davidson method for large eigenproblems, Linear Algebra and its Applications, 424 (2007), pp. 448–455.
[59] J. H. Wilkinson, ed., The Algebraic Eigenvalue Problem, Oxford University Press, Inc., New York, NY, USA, 1988.
Theses
1. In this Master’s thesis we discussed modal approximation as a model order reduction technique for linear time invariant dynamical control systems. Modal
approximation refers to the projection of the original system onto the left and
right eigenspaces corresponding to a certain set of eigenvalues.
2. Dominant poles are poles of the system’s transfer function which have a large
contribution in the frequency response. They usually form a small subset of the
spectrum and are hence designated for modal truncation.
3. Modal approximation based on dominant poles requires eigenvalue algorithms
that are capable of computing eigenvalues and the associated right and left eigenvectors of large and sparse matrices. Some basic iterations based on the Rayleigh
quotient can be used for this task but compute only one eigentriplet at a time.
4. Including subspace acceleration to these rather simple iterations leads to algorithms in which the original eigenvalue problem is transformed into a reduced
one of smaller size. The eigentriplets of this reduced eigenproblem serve as approximations for the eigentriplets of the original, large-scale problem.
5. We investigated the Subspace Accelerated Dominant Pole Algorithm and the
two-sided and alternating Jacobi-Davidson method which work in this way and
are capable of computing several dominant poles of a single-input-single-output
(SISO) system.
6. The two-sided Jacobi-Davidson method is, similar to the one-sided versions, also
an accelerated Newton scheme.
7. All these eigenvalue algorithms depend on the solution of at least one linear system
in each iteration. There, Jacobi-Davidson style methods come with the advantage
that they are most robust with respect to inexact solves of these systems.
8. Since the dominant poles can be located in the interior of the spectrum, we discussed the application of harmonic subspace extractions to get an improved convergence behavior towards eigenvalues close to a specified target. It turned out
that it is difficult to obtain a two-sided harmonic extraction process for the generalized eigenvalue problem. We proposed two novel approaches which were
also combined with the computation of dominant poles. A detailed theoretical
investigation is left for future research.
9. The computation of dominant poles of multi-input-multi-output systems is, compared to the SISO case, much more difficult and challenging. The subspace accelerated MIMO dominant pole algorithm is an existing modification of SADPA for
this purpose and we examined how the two-sided Jacobi-Davidson method can
be altered similarly.
10. We ran several numerical tests with the discussed methods. It turned out that, if
exact solutions of the involved linear systems are available, SADPA is often the
most robust method due to its better convergence towards dominant poles.
11. On the other hand, the two-sided Jacobi-Davidson method is superior when the
linear systems are solved inexactly, although the computation of dominant poles might require some
additional adjustments, such as updating of the applied preconditioner.
12. The numerical tests with the alternating eigenvalue methods that we modified for
dominant pole computation were not satisfying. In theory, these methods require
half the computational effort of the two-sided Jacobi-Davidson method.
Unfortunately, slow convergence and numerical instabilities make it a less reliable
method for the computation of dominant poles.
13. Computing dominant poles of multivariable transfer functions is possible with
the described methods, although the introduced new modification of 2-JD can
currently not compete with SAMDP with respect to the accuracy of the reduced
order models.
Declaration of Authorship
Hereby I certify that I have completed the present thesis independently. I have not
submitted it previously for examination purposes and have used no references other than
those stated. All consciously used excerpts, quotations and contents of other
authors have been properly marked as such.
Chemnitz, June 14, 2010
Selbstständigkeitserklärung (Declaration of Authorship)
I hereby declare that I have prepared the present work independently, that I have not
submitted it elsewhere for examination purposes, and that I have used no aids other than
those stated. All knowingly used text excerpts, quotations or contents of other
authors have been explicitly marked as such.
Chemnitz, June 14, 2010
Patrick Kürschner