LQ Optimal Control
• Stability and the Lyapunov equation
• Linear Quadratic Optimal Control
• Solution with completion of squares
• The algebraic Riccati equation
• Cheap control and asymptotic properties
• Robustness properties
Related Reading
[KK]: 9.1-9.2, 10 and [AM]: 4.4, 6.4
Lyapunov Functions for Linear Systems
We have analyzed asymptotic stability of the linear system
ẋ = Ax = f(x), A ∈ R^{n×n},
by a direct consideration of e^{At}. Using Lyapunov theory instead sheds new light on linear stability analysis and prepares for what follows.
Since the system is linear, let us try a (homogeneous) quadratic Lyapunov function V : R^n → R. Such functions are described by
V(x) = x^T P x with a symmetric matrix P ∈ R^{n×n}.
For applying the Lyapunov theorem (Lecture 2) we need to consider
∂V(x) f(x) = 2 x^T P A x = x^T [A^T P + P A] x.
Remark. If some of the formulas for derivatives are not familiar to you, you should verify them by arguing on the basis of those rules that you do know.
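As a quick numeric sanity check, here is a Python/NumPy sketch (with an arbitrarily chosen A, P and x, since the lecture's own examples use Matlab) confirming that along ẋ = Ax the derivative of V(x) = x^T P x equals x^T [A^T P + P A] x:

```python
import numpy as np

# Arbitrary example data (assumptions for illustration only)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
P = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric
x = np.array([1.0, -1.0])

lhs = 2.0 * x @ P @ (A @ x)              # gradient of V applied to f(x) = Ax
rhs = x @ (A.T @ P + P @ A) @ x
assert np.isclose(lhs, rhs)
```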
Recap: Simple Facts from Linear Algebra
Let Q ∈ R^{n×n} and R ∈ R^{m×m} be symmetric (Q = Q^T and R = R^T).
1. Q is positive semi-definite (Q ≽ 0) iff any of the following equivalent conditions holds:
- x^T Q x ≥ 0 for all x ∈ R^n;
- all eigenvalues of Q are non-negative;
- Q can be written as C^T C (with C of full row rank).
2. R is positive definite (R ≻ 0) iff any of the following equivalent conditions holds:
- u^T R u > 0 for all u ∈ R^m that are not zero;
- all eigenvalues of R are positive;
- R can be written as U^T U with a square and invertible U.
3. If a positive semi-definite matrix has a zero on the diagonal, then the corresponding row and column must be zero.
For any vector x ∈ R^n the Euclidean norm √(x^T x) is denoted by ‖x‖.
Lyapunov Conditions for Asymptotic Stability
Theorem 2-13 requires us to make sure that
x^T P x > 0 and x^T [A^T P + P A] x < 0 for all x ≠ 0.
We hence arrive at the following result.
Theorem 1 If there exists a P ≻ 0 such that A^T P + P A ≺ 0 then ẋ = Ax is (globally) asymptotically stable.
This result follows from general Lyapunov theory. On the next slide we provide a direct proof. In practice, the following recipe is often applied.
Theorem 2 For any Q = Q^T ≺ 0 (such as, for example, Q = −I) consider the following linear equation in P:
A^T P + P A = Q.
If it has a unique positive definite solution then A is Hurwitz. Otherwise A is not Hurwitz.
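The recipe of Theorem 2 can be sketched in Python with SciPy (solve_continuous_lyapunov(a, q) solves a X + X aᵀ = q, so we pass Aᵀ to obtain AᵀP + PA = Q):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Sketch of the Theorem 2 recipe with Q = -I and an assumed example A
A = np.array([[0.0, 1.0], [-2.0, -3.0]])     # eigenvalues -1, -2 (Hurwitz)
Q = -np.eye(2)
P = solve_continuous_lyapunov(A.T, Q)        # solves A^T P + P A = Q

assert np.all(np.linalg.eigvalsh(P) > 0)     # P > 0, hence A is Hurwitz
```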
Proof of Theorem 1
For some small α > 0 the matrix A^T P + P A + αP is still negative definite. Therefore x^T [A^T P + P A + αP] x ≤ 0 for all x ∈ R^n and hence
x^T [A^T P + P A] x ≤ −α x^T P x.  (★)
For ξ ∈ R^n we need to show that x(t) = e^{At} ξ → 0 for t → ∞. Define
v(t) = x(t)^T P x(t) ≥ 0.
We then infer with the help of (★) that
v̇(t) = (d/dt) x(t)^T P x(t) = x(t)^T [A^T P + P A] x(t) ≤ −α x(t)^T P x(t) = −α v(t).
Hence r(t) = v̇(t) + αv(t) ≤ 0. By the variation-of-constants formula
0 ≤ v(t) = v(0) e^{−αt} + ∫₀^t e^{−α(t−τ)} r(τ) dτ ≤ v(0) e^{−αt} → 0 for t → ∞.
Therefore lim_{t→∞} v(t) = 0. Since P is positive definite, it can be written as V^T V with V invertible. Then v(t) = x(t)^T V^T V x(t) = ‖V x(t)‖² → 0; hence V x(t) → 0 and thus V^{-1} V x(t) = x(t) → 0 for t → ∞.
Proof of Theorem 2
In what follows we present the algebraic analogue of the trajectory-oriented proof given on the previous slide.
Suppose that P ≻ 0 and A^T P + P A = Q ≺ 0. Let Ax = λx with x ∈ C^n \ {0}. We infer
0 > x* (A^T P + P A) x = λ̄ x* P x + λ x* P x = 2 Re(λ) x* P x.
Since x* P x > 0, this implies Re(λ) < 0.
The converse follows from Theorem 3.
Lyapunov Equation
Theorem 3 Let A ∈ R^{n×n} be Hurwitz.
• For every symmetric matrix Q ∈ R^{n×n} the Lyapunov equation
A^T P + P A = Q
has a unique symmetric solution P ∈ R^{n×n}.
• If Q is negative semi-definite then P is positive semi-definite.
• If Q ≼ 0 and (A, Q) is observable then P is positive definite.
The equation is well-studied also in the case that A is not Hurwitz. Then, for any symmetric and positive definite Q:
• either the Lyapunov equation has no solution;
• or there exists a solution but it is not unique;
• or there exists a unique solution but it is not positive definite.
Proof
Since e^{At} decays exponentially to zero for t → ∞, the matrix
P = −∫₀^∞ e^{A^T t} Q e^{At} dt
is well-defined. Moreover we have
A^T P + P A = −∫₀^∞ [A^T e^{A^T t} Q e^{At} + e^{A^T t} Q e^{At} A] dt = −∫₀^∞ (d/dt)[e^{A^T t} Q e^{At}] dt = −[e^{A^T t} Q e^{At}]_{t=0}^{t=∞} = Q.
Hence P solves the Lyapunov equation.
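The integral formula for P can be checked by crude quadrature; a Python sketch with an ad-hoc Hurwitz A, step size and horizon:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

# Numeric check (sketch) of P = -∫ e^{A^T t} Q e^{A t} dt by midpoint quadrature
A = np.array([[-1.0, 1.0], [0.0, -2.0]])     # Hurwitz (assumed example)
Q = -np.eye(2)

dt, T = 0.01, 20.0
ts = np.arange(dt / 2, T, dt)                # midpoint rule
P_int = -sum(expm(A.T * t) @ Q @ expm(A * t) for t in ts) * dt
P_lyap = solve_continuous_lyapunov(A.T, Q)   # direct Lyapunov solve

assert np.allclose(P_int, P_lyap, atol=1e-3)
```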
If P̃ is another solution we infer for ∆ = P̃ − P that A^T ∆ + ∆A = 0. If we define M(t) = e^{A^T t} ∆ e^{At} we have M(∞) := lim_{t→∞} M(t) = 0 and
Ṁ(t) = e^{A^T t} A^T ∆ e^{At} + e^{A^T t} ∆ A e^{At} = e^{A^T t} [A^T ∆ + ∆A] e^{At} = 0.
Hence M(·) is constant; thus ∆ = M(0) = M(∞) = 0; hence P = P̃.
Proof
If Q ≼ 0 we infer e^{A^T t} Q e^{At} ≼ 0 for all t ≥ 0 and thus
P = −∫₀^∞ e^{A^T t} Q e^{At} dt ≽ 0.
Now suppose that, in addition, (A, Q) is observable. If x ∈ N(P) then
0 = x^T (A^T P + P A − Q) x = −x^T Q x,
which implies Qx = 0. Hence
0 = (A^T P + P A − Q) x = P A x,
which leads to Ax ∈ N(P).
In summary, A N(P) ⊂ N(P) and N(P) ⊂ N(Q). By Theorem 4-7 we infer N(P) = {0}. This implies P ≻ 0.
Example
The command lyap(A,R) solves the equation AX + XA^T + R = 0:

A=[-2 3;1 1]; P=lyap(A',eye(2)); eig(P)
% ans = [-0.8090; 0.3090]   (P indefinite: A is not Hurwitz)
%%
As=[-2 3;1 1]-1.8*eye(2); P=lyap(As',eye(2)); eig(P)
% ans = [0.1089; 68.2607]   (P positive definite: As is Hurwitz)
%%
ev=eig(A);
As=A-ev(1)*eye(2);
P=lyap(As',eye(2))
% ??? Error using ==> lyap
% Solution does not exist or is not unique.
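A Python/SciPy counterpart (sketch) of the first two Matlab experiments above:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# solve_continuous_lyapunov(A.T, Q) solves A^T P + P A = Q
A = np.array([[-2.0, 3.0], [1.0, 1.0]])          # not Hurwitz
P = solve_continuous_lyapunov(A.T, -np.eye(2))
ev = np.linalg.eigvalsh(P)
assert ev.min() < 0 < ev.max()                   # indefinite -> A not Hurwitz

As = A - 1.8 * np.eye(2)                         # shifted matrix is Hurwitz
Ps = solve_continuous_lyapunov(As.T, -np.eye(2))
assert np.all(np.linalg.eigvalsh(Ps) > 0)        # positive definite
```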
LQ Optimal Control
There are many ways to find controls u(·) such that the state of
ẋ = Ax + Bu, x(0) = ξ ∈ R^n
converges to zero for t → ∞. Designing feedback gains by pole-placement is not simple, since it is difficult to balance the speed of convergence of x(·) against the “size” of the corresponding control action u(·).
This motivates quantifying the average distance of the state-trajectory from 0 and the effort involved in the control action as
∫₀^∞ x(t)^T Q x(t) dt and ∫₀^∞ u(t)^T R u(t) dt,
respectively, where Q and R are symmetric weighting matrices that are positive semi-definite and positive definite, respectively.
The weighting matrices allow us to put individual emphasis on the different components of the state- and control-trajectories.
LQ Optimal Control
Achieving fast state-convergence to zero with the least possible effort then amounts to minimizing the cost function
∫₀^∞ x(t)^T Q x(t) + u(t)^T R u(t) dt  (C)
over all trajectories satisfying
ẋ(t) = Ax(t) + Bu(t), x(0) = ξ and lim_{t→∞} x(t) = 0.  (S)
Definition 4 Solving the optimal control problem of minimizing the quadratic cost function (C) over all controls u(·) that satisfy (S) is the linear quadratic (LQ) optimal control problem (with stability).
We stress that other cost criteria might better reflect the desired objectives; these would result in more general problems of optimal control. The choice of a quadratic cost for linear systems is motivated by a beautiful problem solution and fast solution algorithms.
Choice of Weighting Matrices
Often Q = diag(q₁, ..., qₙ) and R = diag(r₁, ..., r_m) are taken to be diagonal and the cost then reads as
Σ_{k=1}^n ∫₀^∞ q_k x_k(t)² dt + Σ_{k=1}^m ∫₀^∞ r_k u_k(t)² dt.
The scalars q_k ≥ 0 and r_k > 0 allow us to balance the emphasis put on the state- and input-components:
• Large values of q_k or r_k penalize the component x_k(t) or u_k(t) more heavily. Therefore these components are expected to be pushed to smaller values by optimal controllers.
• Small values of q_k or r_k allow for larger deviations of x_k(t) from zero or for a larger action of u_k(t).
• With q_k = 0 no emphasis is put on x_k(t). For technical reasons r_k = 0 is not allowed: all control components have to be penalized.
Completion of Squares
For any symmetric matrix P and any state-trajectory with (S) we have
(d/dt) x(t)^T P x(t) = ẋ(t)^T P x(t) + x(t)^T P ẋ(t) =
= (Ax(t) + Bu(t))^T P x(t) + x(t)^T P (Ax(t) + Bu(t)) =
= x(t)^T (A^T P + P A) x(t) + x(t)^T P B u(t) + u(t)^T B^T P x(t).
Let us add x(t)^T Q x(t) and u(t)^T R u(t) on both sides. Now suppose R = U^T U with some square invertible U. We infer
x(t)^T P B u(t) + u(t)^T B^T P x(t) + u(t)^T R u(t) = −x(t)^T P B R^{-1} B^T P x(t) +
+ x(t)^T P B R^{-1} B^T P x(t) + x(t)^T P B u(t) + u(t)^T B^T P x(t) + u(t)^T R u(t) =
= −x(t)^T P B R^{-1} B^T P x(t) + ‖U u(t) + U^{-T} B^T P x(t)‖².
This latter step is called completion of the squares. Purpose?
Completion of Squares
We have derived the following key relation along any system trajectory:
(d/dt) x(t)^T P x(t) + x(t)^T Q x(t) + u(t)^T R u(t) =
= x(t)^T [A^T P + P A − P B R^{-1} B^T P + Q] x(t) + ‖U u(t) + U^{-T} B^T P x(t)‖².
This motivates choosing P = P^T as a solution of the following so-called algebraic Riccati equation (ARE):
A^T P + P A − P B R^{-1} B^T P + Q = 0.
If that were possible we could infer
(d/dt) x(t)^T P x(t) + x(t)^T Q x(t) + u(t)^T R u(t) = ‖U u(t) + U^{-T} B^T P x(t)‖².
Completion of Squares
If we integrate over [0, T] for T > 0 we finally arrive at
x(T)^T P x(T) + ∫₀^T x(t)^T Q x(t) + u(t)^T R u(t) dt = ξ^T P ξ + ∫₀^T ‖U u(t) + U^{-T} B^T P x(t)‖² dt,
where the last integral is non-negative.
• For any trajectory of (S) we have x(T) → 0 for T → ∞ and thus
∫₀^∞ x(t)^T Q x(t) + u(t)^T R u(t) dt ≥ ξ^T P ξ.
The cost is not smaller than ξ^T P ξ, no matter which stabilizing control function is chosen.
• Equality is achieved exactly when U u(t) + U^{-T} B^T P x(t) = 0, i.e.,
u(t) = −R^{-1} B^T P x(t) for all t ≥ 0.
Insights
• Any solution P = P^T of the ARE gives us a lower bound ξ^T P ξ on the cost function over all admissible control functions.
• The lower bound is attained if we can choose the control function to satisfy u(t) = −R^{-1} B^T P x(t). This could be assured as follows:
1. Solve ẋ = [A − B R^{-1} B^T P] x with x(0) = ξ for x(·).
2. Then define the control function by u*(t) = −R^{-1} B^T P x(t).
But we need to make sure that lim_{t→∞} x(t) = 0, which requires that A − B R^{-1} B^T P is Hurwitz.
If there exists a P as indicated then the constructed input u*(·) is indeed the unique optimal open-loop control function.
• Moreover, the optimal control function can actually be implemented by a feedback strategy u = −F x with gain F = R^{-1} B^T P.
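The whole construction can be sketched in Python: solve the ARE, form the gain, and confirm by a simple Euler simulation that the incurred cost matches ξ^T P ξ (the double-integrator data are an arbitrary choice for illustration):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (assumed example)
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

P = solve_continuous_are(A, B, Q, R)     # stabilizing ARE solution
F = np.linalg.solve(R, B.T @ P)          # optimal gain F = R^{-1} B^T P
assert np.all(np.linalg.eigvals(A - B @ F).real < 0)

xi = np.array([1.0, -1.0])
dt, x, cost = 1e-3, xi.copy(), 0.0
for _ in range(int(40.0 / dt)):          # Euler simulation of x' = (A - BF) x
    u = -F @ x
    cost += (x @ Q @ x + u @ R @ u) * dt
    x = x + dt * (A @ x + B @ u)

assert np.isclose(cost, xi @ P @ xi, rtol=1e-2)   # cost equals xi^T P xi
```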
Example
Consider the Harrier at vertical take-off ([AM] pp. 53, 141, 191), modeled as
m ẍ = F₁ cos(θ) − F₂ sin(θ) − c ẋ,
m ÿ = F₁ sin(θ) + F₂ cos(θ) − m g − c ẏ,
J θ̈ = r F₁.
Example
With state z = (x, y, θ, ẋ, ẏ, θ̇) and input u = (F₁, F₂), put the system into a first-order description and linearize at the equilibrium u_e = (0, mg) and z_e = (x_e, y_e, 0, 0, 0, 0). This leads to

[A B] =
[ 0  0   0     1      0    0 |  0     0  ]
[ 0  0   0     0      1    0 |  0     0  ]
[ 0  0   0     0      0    1 |  0     0  ]
[ 0  0  −g   −c/m     0    0 | 1/m    0  ]
[ 0  0   0     0    −c/m   0 |  0    1/m ]
[ 0  0   0     0      0    0 | r/J    0  ]

For a scale model choose the parameters
m = 4; J = 0.0475; r = 0.25; g = 9.81; c = 0.05.
For given Q and R we compute with
[F, P, E] = lqr(A, B, Q, R)
the LQ-gain F, the stabilizing ARE solution P and the closed-loop eigenvalues E = eig(A − BF).
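A Python sketch mirroring the lqr call above for the Harrier data (solve_continuous_are returns the stabilizing ARE solution):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Linearized Harrier model with the slide's parameter values
m, J, r, g, c = 4.0, 0.0475, 0.25, 9.81, 0.05
A = np.zeros((6, 6))
A[0:3, 3:6] = np.eye(3)           # position/attitude integrators
A[3, 2] = -g                      # gravity coupling via theta
A[3, 3] = A[4, 4] = -c / m        # aerodynamic damping
B = np.zeros((6, 2))
B[3, 0] = 1.0 / m
B[4, 1] = 1.0 / m
B[5, 0] = r / J

Q, R = np.eye(6), np.eye(2)
P = solve_continuous_are(A, B, Q, R)
F = np.linalg.solve(R, B.T @ P)   # LQ gain
E = np.linalg.eigvals(A - B @ F)
assert np.all(E.real < 0)         # closed loop is asymptotically stable
```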
Example
For Q = I, R = I, ξ = (1, −1, 0.35, 0, 0, 0) we get the closed-loop responses shown below.
[Figure: control inputs and first 3 states over 0–7 s]
The second state is very slow. Also the first should be somewhat faster. This motivates increasing the penalty (weight) on these states, e.g. to Q = diag(10, 100, 1, 1, 1, 1).
Example
The responses are faster, at the expense of a larger control action:
[Figure: control inputs and first 3 states over 0–7 s]
Let's now allow for an even larger control action by reducing the input weight to R = 0.1 I.
Example
This speeds up the responses further, but again at the expense of larger control actions:
[Figure: control inputs and first 3 states over 0–2 s]
By reducing ρ > 0 in R = ρI we put less weight on the control input. This typically comes along with high gains in the state-feedback matrix.
Riccati Theory
Definition 5 Let A ∈ R^{n×n}, B ∈ R^{n×m}, Q ∈ R^{n×n}, R ∈ R^{m×m} satisfy Q = Q^T and R ≻ 0. The quadratic matrix equation
A^T P + P A − P B R^{-1} B^T P + Q = 0  (ARE)
in the unknown P ∈ R^{n×n} is called the algebraic Riccati equation (ARE) (for the linear system described by (A, B) and the quadratic cost function defined with (Q, R)).
Any solution P of the ARE which also satisfies
eig(A − B R^{-1} B^T P) ⊂ C⁻
is said to be a stabilizing solution.
We are typically only interested in symmetric solutions P of the ARE. Can we characterize the existence of (stabilizing) solutions of the ARE? Can we compute them (efficiently) if they exist? Yes we can ...
Riccati
Jacopo Francesco Riccati (1676-1754)
The Hamiltonian Matrix
Definition 6 The Hamiltonian matrix of the ARE is defined as
H = [ A  −B R^{-1} B^T ; −Q  −A^T ] ∈ R^{2n×2n}.
Why is H relevant? If P solves the ARE we have
−Q − A^T P = P [A − B R^{-1} B^T P].
This leads to the relation
H [ I  0 ; P  I ] = [ I  0 ; P  I ] [ A − B R^{-1} B^T P   −B R^{-1} B^T ; 0   −[A − B R^{-1} B^T P]^T ]
and thus
[ I  0 ; P  I ]^{-1} H [ I  0 ; P  I ] = [ A − B R^{-1} B^T P   −B R^{-1} B^T ; 0   −[A − B R^{-1} B^T P]^T ].
Hence a solution P of the ARE allows us to transform H by similarity into a block-triangular form. Many insights can be extracted from here.
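The similarity relation can be verified numerically; a sketch with arbitrary double-integrator data:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Verify that a solution P of the ARE block-triangularizes H
A = np.array([[0.0, 1.0], [0.0, 0.0]])       # assumed example data
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
S = B @ np.linalg.solve(R, B.T)              # S = B R^{-1} B^T

P = solve_continuous_are(A, B, Q, R)
H = np.block([[A, -S], [-Q, -A.T]])
T = np.block([[np.eye(2), np.zeros((2, 2))], [P, np.eye(2)]])
M = np.linalg.solve(T, H @ T)                # T^{-1} H T

assert np.allclose(M[2:, :2], 0, atol=1e-10)   # lower-left block vanishes
assert np.allclose(M[:2, :2], A - S @ P)       # upper-left = A - B R^{-1} B^T P
```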
Riccati Theory: Main Result I
Let P be a stabilizing solution of the ARE. We then infer from
eig(H) = eig(A − B R^{-1} B^T P) ∪ eig(−[A − B R^{-1} B^T P]^T)
and since A − B R^{-1} B^T P is Hurwitz that
H has no eigenvalues on the imaginary axis.
Of course, we also conclude that (A, B) is stabilizable. This proves one direction of the following key result.
Theorem 7 (ARE) has a stabilizing solution iff (A, B) is stabilizable and the corresponding Hamiltonian matrix H has no eigenvalues on the imaginary axis.
The proof is constructive and allows us to determine “the” stabilizing solution of the ARE by solving an eigenvalue problem.
Let's first prove uniqueness.
Uniqueness
Let P be a stabilizing solution of the ARE. We then infer that
R[ I ; P ] = Σ_{λ∈C⁻} ker[(H − λI)^{2n}],
which is the complex generalized eigenspace of H related to its eigenvalues in C⁻ (and also often just called the stable subspace of H).
If P₁ and P₂ are two stabilizing solutions we hence conclude
R[ I ; P₁ ] = R[ I ; P₂ ]
and thus P₁ = P₂. This proves the following result.
Lemma 8 (ARE) has at most one stabilizing solution.
Remark. This holds for so-called indefinite AREs as well, which are defined with some non-singular and merely symmetric matrix R.
A Property of the Hamiltonian Matrix
The eigenvalues of the real matrix H are clearly located symmetrically with respect to the real axis in the complex plane. Due to the particular structure of H the same holds with respect to the imaginary axis.
Lemma 9 If λ is an eigenvalue of H with algebraic multiplicity k then −λ̄ is as well an eigenvalue of H with the same algebraic multiplicity.
Proof. Define the skew-symmetric (and orthogonal) matrix
J = [ 0  −I ; I  0 ].
One easily checks that JH is symmetric. This implies JH = (JH)^T = H^T J^T = −H^T J and thus
J H J^{-1} = −H^T.
Similarity of H and −H^T proves the statement.
Proof of ‘if’ in Theorem 7
Since H has no eigenvalue on the imaginary axis and by Lemma 9, H has n eigenvalues in the open left and n in the open right half-plane. Therefore there exists an invertible T ∈ C^{2n×2n} such that
T^{-1} H T = [ M  M₁₂ ; 0  M₂₂ ] with M ∈ C^{n×n} being Hurwitz.
Just choose T₁ ∈ C^{2n×n} as a basis of the stable subspace of H:
R(T₁) = Σ_{λ∈C⁻} N[(H − λI)^{2n}];
then extend with T₂ to a non-singular matrix T = (T₁ T₂). Since R(T₁) is H-invariant, the above structure follows.
Now partition T as H into four n×n blocks as
T = [ U  T₁₂ ; V  T₂₂ ], implying H Z = Z M for Z := [ U ; V ].
Proof of ‘if’ in Theorem 7
The key step is to show: U is invertible and V U^{-1} is real symmetric (even though T is computed over C to block-triangularize H).
Step 1. V* U = U* V.
HZ = ZM implies Z* J H Z = Z* J Z M. Since the left-hand side is a Hermitian matrix, so is the right-hand side. This implies (Z* J Z) M = M* (Z* J* Z) = −M* (Z* J Z) by J* = −J. Since M is Hurwitz, we infer from M* (Z* J Z) + (Z* J Z) M = 0 that Z* J Z = 0 (Theorem 3), which is indeed nothing but V* U − U* V = 0.
Step 2. U x = 0 ⇒ B^T V x = 0 ⇒ U M x = 0.
U x = 0 and the first row of Z M x = H Z x imply U M x = (A U − B R^{-1} B^T V) x = −B R^{-1} B^T V x and thus x* V* U M x = −x* V* B R^{-1} B^T V x. By Step 1 we get −x* V* B R^{-1} B^T V x = x* U* V M x = 0 and thus B^T V x = 0. From U M x = −B R^{-1} B^T V x we infer U M x = 0.
Proof of ‘if’ in Theorem 7
Step 3. U is invertible.
Suppose N(U) ≠ {0}. Since U x = 0 implies U M x = 0 (Step 2), N(U) is M-invariant. Since it is non-trivial, there exists an eigenvector of M in N(U), i.e., an x ≠ 0 with M x = λx and U x = 0. Now the second row of H Z = Z M yields (−Q U − A^T V) x = V M x and thus A^T V x = −λ V x. Since U x = 0, we have B^T V x = 0 (Step 2). Because (A, B) is stabilizable and Re(−λ) > 0, we infer V x = 0. Since U x = 0, this implies Z x = 0 and hence x = 0 because Z has full column rank. Contradiction!
Step 4. P := V U^{-1} is Hermitian.
V* U = U* V implies U^{-*} V* = V U^{-1} and hence (V U^{-1})* = V U^{-1}.
Proof of ‘if’ in Theorem 7
Step 5. P is a (and hence the) stabilizing solution of the ARE.
From H Z = Z M we infer H Z U^{-1} = Z U^{-1} (U M U^{-1}) and hence
[ A − B R^{-1} B^T P ; −Q − A^T P ] = H [ I ; P ] = [ I ; P ] (U M U^{-1}).
The first row implies A − B R^{-1} B^T P = U M U^{-1}, such that A − B R^{-1} B^T P is Hurwitz. The second row reads as −Q − A^T P = P (A − B R^{-1} B^T P), which just means that P satisfies the ARE.
Step 6. P is real.
Since the data matrices are real, conjugating the ARE gives
A^T P̄ + P̄ A − P̄ B R^{-1} B^T P̄ + Q = 0,
and A − B R^{-1} B^T P̄ is the conjugate of A − B R^{-1} B^T P. Hence P and P̄ are both stabilizing solutions of the ARE, which implies P = P̄ (Lemma 8).
How to Block-Triangularize the Hamiltonian?
Let us mention three possibilities to block-triangularize the Hamiltonian:
• Choose T which block-diagonalizes H. One can e.g. transform H into the (suitably ordered) real or complex Jordan canonical form and extract the first n columns of T. In practice H is often diagonalizable. Then these first n columns of T can be taken equal to n linearly independent eigenvectors of H that correspond to the eigenvalues of H in the open left half-plane.
• A numerically much more favorable way is to use the ordered Schur decomposition: recall that one can always compute a unitary T which achieves the required block-triangular form of H.
• Modern algorithms (for large matrices) construct T with symplectic transformations on H that preserve the Hamiltonian structure.
Example
Here is some (very naive) Matlab code that computes the stabilizing ARE solution (instead of using are or care):

% Check stabilizability of (A,B)
% Check whether or not H has eigenvalues on the imaginary axis
H=[A -B*inv(R)*B';-Q -A'];eig(H)
% Determine Z if H is diagonalizable
[n,n]=size(A);[T,D]=eig(H);Z=[];
for j=1:2*n;
  if real(D(j,j))<0;Z=[Z T(:,j)];end;
end;
% Compute P
if size(Z,2)==n;
  U=Z(1:n,:);V=Z(n+1:2*n,:);P=V*inv(U);
end;
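A Python version of the same (equally naive) eigenvector construction, checked against SciPy's ARE solver:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])     # assumed example data
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
n = A.shape[0]

H = np.block([[A, -B @ np.linalg.solve(R, B.T)], [-Q, -A.T]])
d, T = np.linalg.eig(H)
Z = T[:, d.real < 0]                       # basis of the stable subspace
assert Z.shape[1] == n                     # H has n stable eigenvalues
U, V = Z[:n, :], Z[n:, :]
P = np.real(V @ np.linalg.inv(U))          # P = V U^{-1}

assert np.allclose(P, solve_continuous_are(A, B, Q, R), atol=1e-6)
```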
Riccati Theory: Additional Fact I
Typically, the ARE has infinitely many solutions. Within the solution set of the ARE the stabilizing solution (if it exists) has a particularly nice location.
Theorem 10 The stabilizing solution of (ARE) is the largest among all solutions.
Remark. Here the partial ordering among symmetric matrices P₁, P₂ is defined through P₁ ≼ P₂ ⇔ 0 ≼ P₂ − P₁.
Proof. Let P be the stabilizing solution and X any other solution of the ARE. With Â = A − B R^{-1} B^T P and ∆ = X − P one easily checks, by subtracting the two AREs, that
Â^T ∆ + ∆ Â = ∆ B R^{-1} B^T ∆.
Since Â is Hurwitz and the right-hand side is positive semi-definite, Theorem 3 implies ∆ ≼ 0 and thus X ≼ P. This shows that P is the largest among all solutions.
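A scalar sanity check (sketch): for a = b = q = r = 1 the ARE reads 2p − p² + 1 = 0 with roots 1 ± √2, and the stabilizing solution is indeed the larger root:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Scalar ARE: 2*a*p - b^2*p^2/r + q = 0 with a = b = q = r = 1
a = b = q = r = 1.0
roots = np.roots([-1.0, 2 * a, q])            # both solutions: 1 +/- sqrt(2)
P = solve_continuous_are(np.array([[a]]), np.array([[b]]),
                         np.array([[q]]), np.array([[r]]))[0, 0]

assert np.isclose(P, 1 + np.sqrt(2))          # the larger root ...
assert a - b * P / r < 0                      # ... and it is stabilizing
```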
A Second Property of the Hamiltonian Matrix
The case Q ≽ 0 is particularly nice. Then the eigenvalues of H on the imaginary axis
C⁰ := {λ ∈ C | Re(λ) = 0} = {iω | ω ∈ R}
are determined by the uncontrollable modes of (A, B), denoted as
eig(A − sI  B) = {λ ∈ C | rk(A − λI  B) < n},
and the unobservable modes of (A, Q), denoted as
eig[ A − sI ; Q ] = {λ ∈ C | rk[ A − λI ; Q ] < n},
on the imaginary axis.
Theorem 11 If Q ≽ 0 then
eig(H) ∩ C⁰ = ( eig(A − sI  B) ∪ eig[ A − sI ; Q ] ) ∩ C⁰.
Proof
If H has the eigenvalue λ ∈ C⁰ we infer
[ A  −B R^{-1} B^T ; −Q  −A^T ] [ e₁ ; e₂ ] = λ [ e₁ ; e₂ ] for some [ e₁ ; e₂ ] ≠ 0.
With R = U^T U and Q = C^T C we get
A e₁ − B U^{-1} [B U^{-1}]^T e₂ = λ e₁ and −C^T C e₁ − A^T e₂ = λ e₂.  (†)
By left-multiplying with e₂* and e₁* we infer
e₂* A e₁ − ‖e₂* B U^{-1}‖² = λ e₂* e₁ and −‖C e₁‖² − e₁* A^T e₂ = λ e₁* e₂.
The conjugate of the latter is −‖C e₁‖² − e₂* A e₁ = λ̄ e₂* e₁. Adding it to the first and exploiting λ̄ + λ = 0 (since λ ∈ C⁰) implies ‖e₂* B U^{-1}‖² + ‖C e₁‖² = 0 and thus e₂* B = 0 and C e₁ = 0; therefore Q e₁ = 0. By (†) we hence have (A − λI) e₁ = 0 and (A^T − λ̄ I) e₂ = 0. Since either e₁ ≠ 0 or e₂ ≠ 0, λ is either an unobservable mode of (A, Q) or an uncontrollable mode of (A, B). The converse is shown by reversing the arguments.
Riccati Theory: Main Result II
Theorem 12 If Q ≽ 0, the ARE A^T P + P A − P B R^{-1} B^T P + Q = 0 has a stabilizing solution if and only if (A, B) is stabilizable and (A, Q) has no unobservable modes on the imaginary axis.
Proof. If the ARE has a stabilizing solution, Theorem 7 implies that (A, B) is stabilizable and H has no eigenvalues in C⁰; by Theorem 11 we infer that (A, Q) cannot have unobservable modes in C⁰.
If (A, B) is stabilizable and (A, Q) has no unobservable modes in C⁰, then H has no eigenvalues in C⁰ by Theorem 11; hence Theorem 7 shows the existence of the stabilizing solution of the ARE.
Remark. The unobservable modes of (A, Q) coincide with those of (A, C) in case that Q = C^T C; hence the existence of the stabilizing ARE solution is guaranteed if (A, B) is stabilizable and (A, C) is detectable.
Riccati Theory: Additional Fact II
Theorem 13 If Q ≽ 0, the stabilizing solution P of (ARE) (if it exists) is positive semi-definite. If, in addition, (A, Q) is observable, then P ≻ 0.
Proof. If P is any symmetric solution of the ARE we infer
(A − B R^{-1} B^T P)^T P + P (A − B R^{-1} B^T P) = −P B R^{-1} B^T P − Q.
With Â = A − B R^{-1} B^T P and Q̂ = −P B R^{-1} B^T P − Q we infer
Â^T P + P Â = Q̂.
If P is the stabilizing solution then Â is Hurwitz. Since Q ≽ 0, we have Q̂ ≼ 0 and thus P ≽ 0 by Theorem 3.
If (A, Q) is observable then so is (Â, Q̂): Â x = λx and Q̂ x = 0 imply x* Q̂ x = 0 and thus B^T P x = 0 and Q x = 0; the former implies A x = λx; since (A, Q) is observable we get, together with the latter, x = 0. Then Theorem 3 even allows us to conclude P ≻ 0.
Riccati Theory: Additional Fact III
Theorem 14 Suppose that P ≽ 0 satisfies (ARE). If (A, Q) is detectable then P is the stabilizing solution.
Proof. As above we rearrange the ARE to
(A − B R^{-1} B^T P)^T P + P (A − B R^{-1} B^T P) = −P B R^{-1} B^T P − Q.
Now suppose that (A − B R^{-1} B^T P) x = λx with x ≠ 0. Left-multiplication with x* and right-multiplication with x leads to
2 Re(λ) x* P x = −x* P B R^{-1} B^T P x − x* Q x.
If x* P x = 0 we infer P x = 0 and thus x* Q x = 0 and thus Q x = 0. Due to A x = λx we infer with the Hautus test that Re(λ) < 0.
If x* P x > 0 we directly get Re(λ) ≤ 0; the latter inequality is strict, since Re(λ) = 0 leads to a contradiction when following the arguments above.
Riccati Theory: Summary
Let us summarize all individual statements for the ARE
A^T P + P A − P B R^{-1} B^T P + C^T C = 0
under the hypothesis that (A, B) is stabilizable and (A, C) is detectable.
• The ARE has a unique stabilizing solution.
• The stabilizing solution is the largest among all solutions.
• The stabilizing solution is positive semi-definite.
• If P is positive semi-definite, it is the stabilizing solution.
These are the standard hypotheses as they are often formulated in the literature and used in applications.
Solution of the LQ-Problem: Summary
Suppose that (A, B) is stabilizable and that (A, Q) with Q ≽ 0 has no unobservable modes on the imaginary axis.
• Then one can compute the unique solution P ≽ 0 of the ARE
A^T P + P A − P B R^{-1} B^T P + Q = 0
for which A − B R^{-1} B^T P is Hurwitz.
• The LQ-optimal control problem has a unique solution.
• The optimal value is ξ^T P ξ and the optimal control strategy can be implemented as a static state-feedback controller:
u = −R^{-1} B^T P x.
The closed-loop eigenvalues are equal to those eigenvalues of the Hamiltonian that are contained in the open left half-plane.
This fundamental result follows directly from the completion-of-squares argument (Insights) and Theorem 12. In Matlab the solution is made available with the command lqr.
Example: Segway
With the data of [AM] p. 189 and the linearization in the upright position (zero input), we designed a static state-feedback controller by pole-placement in Lecture 3 (blue). With R = 0.1, Q = diag(100, 1, 1, 1) the LQ-responses (green) are improved:
[Figure: control input, position p and angle θ over 0–20 s]
Recap: Schur-Complement
For a block-matrix with invertible D we have
[ I  −B D^{-1} ; 0  I ] [ A  B ; C  D ] = [ A − B D^{-1} C   0 ; C   D ].
Schur-determinant formula: if D is invertible then
det [ A  B ; C  D ] = det(A − B D^{-1} C) det(D).
If A = A^T, C = B^T and D = D^T is invertible then
[ I  −B D^{-1} ; 0  I ] [ A  B ; B^T  D ] [ I  0 ; −D^{-1} B^T  I ] = [ A − B D^{-1} B^T   0 ; 0   D ].
Schur-complement lemma:
[ A  B ; B^T  D ] ≻ 0 ⇐⇒ D ≻ 0 and A − B D^{-1} B^T ≻ 0.
Different variants are easily proved similarly.
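A numeric spot check (sketch) of the Schur-complement lemma on one arbitrarily chosen example:

```python
import numpy as np

# Check [[A, B], [B^T, D]] > 0  iff  D > 0 and A - B D^{-1} B^T > 0
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0], [0.5]])
D = np.array([[2.0]])

M = np.block([[A, B], [B.T, D]])
schur = A - B @ np.linalg.solve(D, B.T)      # Schur complement of D in M

pos = lambda X: np.all(np.linalg.eigvalsh(X) > 0)
assert pos(M) == (pos(D) and pos(schur))
assert pos(M)                                # this example is positive definite
```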
Varying Input Weight
The closed-loop eigenvalues for the LQ-optimal gain are equal to the eigenvalues of the Hamiltonian in the open left half-plane.
With some fixed positive definite matrix R₀, suppose that we choose R = ρR₀ for some scalar ρ ∈ (0, ∞) to get
H = [ A  −(1/ρ) B R₀^{-1} B^T ; −Q  −A^T ].
For large ρ we try to keep the control effort small. Since −(1/ρ) B R₀^{-1} B^T approaches 0 for ρ → ∞, the limiting closed-loop eigenvalues are equal to the stable eigenvalues of
H = [ A  0 ; −Q  −A^T ].
Hence they equal the stable eigenvalues of A (open-loop eigenvalues) and of −A^T (open-loop eigenvalues mirrored at the imaginary axis).
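The expensive-control limit can be observed numerically; a sketch with A chosen to have one stable and one unstable eigenvalue (at finite ρ the match is only approximate):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# For large rho, closed-loop eigenvalues approach the stable eigenvalues
# of A together with the mirror images of the unstable ones.
A = np.diag([-1.0, 2.0])                 # assumed example: eigenvalues -1, 2
B = np.array([[1.0], [1.0]])
Q = np.eye(2)

rho = 1e6
P = solve_continuous_are(A, B, Q, rho * np.eye(1))
F = B.T @ P / rho                        # F = R^{-1} B^T P with R = rho*I
E = np.sort(np.linalg.eigvals(A - B @ F).real)

assert np.allclose(E, [-2.0, -1.0], atol=1e-2)   # {-1} kept, {2} mirrored to {-2}
```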
Cheap Control
For small ρ we allow for a large control effort (i.e. control is “cheap”). Let us use
Q = C^T C, R₀^{-1} = U₀ U₀^T (U₀ invertible), G(s) = C (sI − A)^{-1} B U₀.
With the Schur-determinant formula we get
det(sI − H) = det(sI − A) det(sI + A^T − Q (sI − A)^{-1} B R₀^{-1} B^T / ρ)
= det(sI − A) det(sI + A^T) det(I − (sI + A^T)^{-1} C^T G(s) U₀^T B^T / ρ)
= det(sI − A) det(sI + A^T) det(I − U₀^T B^T (sI + A^T)^{-1} C^T G(s) / ρ)
= det(sI − A) det(sI + A^T) det(I + (1/ρ) G(−s)^T G(s)).
In general the zeros of this polynomial are not easy to analyze for ρ → 0. One can show that some zeros move off to ∞, and others move to the zeros of det(G(−s)^T G(s)) if this polynomial does not vanish identically.
Cheap Control - Butterworth Pattern
If G(s) is SISO, define d(s) = det(sI − A) with zeros p₁, ..., pₙ and n(s) = d(s) G(s) with zeros z₁, ..., z_m. We need to analyze the zeros of
d(−s) d(s) + (1/ρ) n(−s) n(s) = 0.  (⋆)
For ρ → 0 the following holds (Kwakernaak, Sivan, 1972):
• 2m zeros of (⋆) approach ±z₁, ..., ±z_m.
• 2(n − m) zeros move to ∞ asymptotically along straight lines through the origin with the following angles to the positive real axis:
kπ/(n − m), k = 0, 1, ..., 2n − 2m − 1, for n − m odd;
(k + ½)π/(n − m), k = 0, 1, ..., 2n − 2m − 1, for n − m even.
Those in the open left half-plane are the closed-loop eigenvalues.
Example
Segway with Q = C^T C and C = [1 1 0 0] as well as R₀ = 1.
[Figure: root locus in the complex plane; magenta: zeros of d(s), green: zeros of n(s), eigenvalues plotted for ρ ∈ (10⁻⁶, 100)]
A Version of Barbalat’s Lemma
It is convenient to use L_2^n := L_2([0, ∞), R^n). Moreover, let L_{2,loc}^n be the set of all measurable x : [0, ∞) → R^n with x(·) ∈ L_2([0, T], R^n) for every T > 0.
Lemma 15 Suppose that x ∈ L_2^n is locally absolutely continuous and that ẋ ∈ L_2^n. Then x(t) → 0 for t → ∞.
Proof. By partial integration we have
2 ∫₀^t x(τ)^T ẋ(τ) dτ = ‖x(t)‖² − ‖x(0)‖².
Since x(·)^T ẋ(·) ∈ L_1[0, ∞), the left-hand side has a limit for t → ∞. This implies that there exists α ≥ 0 with ‖x(t)‖ → α for t → ∞.
If α > 0, there exists T > 0 such that
‖x(t)‖² ≥ α²/2 for t ∈ [T, ∞),
which clearly contradicts the fact that x(·) has a finite L_2-norm. Hence α = 0 and thus x(t) → 0 for t → ∞.
Young’s Inequality for Convolutions
Lemma 16 For 1 ≤ p ≤ q ≤ ∞ choose the unique a ∈ [1, ∞] with
1/p + 1/a = 1 + 1/q.
If M ∈ L_a([0, ∞), R^{k×m}) and u ∈ L_p([0, ∞), R^m), define
y(t) = ∫₀^t M(t − τ) u(τ) dτ on [0, ∞).
Then y ∈ L_q([0, ∞), R^k) and
‖y‖_q ≤ ‖M‖_a ‖u‖_p.
With the spectral norm ‖·‖ for matrices we use
‖M‖_a := ( ∫₀^∞ ‖M(t)‖^a dt )^{1/a} for a < ∞ and ‖M‖_∞ := ess sup_{t∈[0,∞)} ‖M(t)‖,
with the corresponding specializations to vector-valued functions.
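A discrete sketch of the inequality for p = q = 2, a = 1 (discrete convolutions satisfy the same bound, so the assertion holds exactly; kernel and input are arbitrary choices):

```python
import numpy as np

# Discrete Young's inequality: ||y||_2 <= ||M||_1 ||u||_2 for y = M * u
dt = 0.01
t = np.arange(0, 20, dt)
M = np.exp(-t)                          # scalar kernel in L1
rng = np.random.default_rng(0)
u = rng.standard_normal(t.size)         # arbitrary square-integrable input

y = np.convolve(M, u)[: t.size] * dt    # y(t) = int_0^t M(t - s) u(s) ds
norm_y = np.sqrt(np.sum(y**2) * dt)
norm_u = np.sqrt(np.sum(u**2) * dt)
norm_M = np.sum(np.abs(M)) * dt

assert norm_y <= norm_M * norm_u
```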
A Useful Auxiliary Result
Lemma 17 Suppose that ẋ = Ax + Bu, y = Cx is detectable. If (x(·), u(·), y(·)) is any trajectory such that u ∈ L_2^m and y ∈ L_2^k, then x(t) → 0 for t → ∞.
Proof. Choose L such that A − LC is Hurwitz. Then the trajectories also satisfy the relations
ẋ(t) = (A − LC) x(t) + v(t) for v(·) := L y(·) + B u(·).
Note that v ∈ L_2^n. Since A − LC is Hurwitz, we certainly have e^{(A−LC)•} ∈ L_1([0, ∞), R^{n×n}) ∩ L_2([0, ∞), R^{n×n}).
The variation-of-constants formula and Young's inequality imply that x(·) ∈ L_2^n, and the differential equation then immediately reveals that ẋ(·) ∈ L_2^n as well. By applying our version of Barbalat's lemma, the claim is proved.
Robustness Properties
A perfect implementation of a state-feedback controller leads to the standard feedback loop: the state x of ẋ = Ax + Bu is fed back through the gain F with negative sign.
[Block diagram: plant ẋ = Ax + Bu with state x fed back through F]
In a non-ideal situation, the signal sent to the system might get distorted. This is modeled by a filter ∆, which is just another dynamical system inserted between the controller output and the plant input.
[Block diagram: the same loop with ∆ between the controller output −Fx and the plant input]
Typical examples: actuator or transmission-channel dynamics, delays.
Robustness Properties
If ∆ equals the static gain matrix I ∈ R^{m×m}, we know that the system is stable in the sense that lim_{t→∞} x(t) = 0 for all ξ ∈ R^n.
Question: How much can ∆ deviate from I without losing stability?
For example, one could consider static gain perturbations ∆ ∈ R^{m×m}.
[Block diagram: loop with controller output z = −Fx, perturbation u = ∆z entering the plant]
The interconnection is then compactly described as
[ ẋ ; z ] = [ A  B ; −F  0 ] [ x ; u ], x(0) = ξ, u = ∆z.
Robustness Properties
Let P satisfy (ARE) and set F = R^{-1} B^T P. We infer
0 = [ A^T P + P A − P B R^{-1} B^T P + Q   0 ; 0   0 ]
= [ A^T P + P A + Q − F^T R F   P B − F^T R ; B^T P − R F   0 ]
= [ A^T P + P A   P B ; B^T P   0 ] + [ Q  0 ; 0  0 ] − [ F^T R F   F^T R ; R F   0 ]
≽ [ A^T P + P A   P B ; B^T P   0 ] − [ F^T R F   F^T R ; R F   0 ]
= [ I  0 ; A  B ]^T [ 0  P ; P  0 ] [ I  0 ; A  B ] + [ −F  0 ; 0  I ]^T [ −R  R ; R  0 ] [ −F  0 ; 0  I ].
Why did we derive this inequality?
It leads to a crucial inequality that allows us to specify a very large class of ∆'s which do not destroy stability of the loop.
Robustness Properties
For any system trajectory and for all t ≥ 0 we easily conclude:
0 ≥ [ x(t) ; u(t) ]^T ( [ I  0 ; A  B ]^T [ 0  P ; P  0 ] [ I  0 ; A  B ] + [ −F  0 ; 0  I ]^T [ −R  R ; R  0 ] [ −F  0 ; 0  I ] ) [ x(t) ; u(t) ]
= [ x(t) ; ẋ(t) ]^T [ 0  P ; P  0 ] [ x(t) ; ẋ(t) ] + [ z(t) ; u(t) ]^T [ −R  R ; R  0 ] [ z(t) ; u(t) ]
= (d/dt) x(t)^T P x(t) + [ z(t) ; u(t) ]^T [ −R  R ; R  0 ] [ z(t) ; u(t) ].
For any system trajectory and T > 0 the following inequality hence holds:
x(T)^T P x(T) + ∫₀^T [ z(t) ; u(t) ]^T [ −R  R ; R  0 ] [ z(t) ; u(t) ] dt ≤ ξ^T P ξ.  (1)
Robustness Properties: Main Result III
Let ∆ : L_{2,loc}^m → L_{2,loc}^m have the following properties: there exist real constants γ and ε > 0 such that for all z ∈ L_{2,loc}^m and for all T > 0:
∫₀^T ‖∆(z)(t)‖² dt ≤ γ² ∫₀^T ‖z(t)‖² dt  (2)
and
∫₀^T [ z(t) ; ∆(z)(t) ]^T [ −R  R ; R  0 ] [ z(t) ; ∆(z)(t) ] dt ≥ ε ∫₀^T ‖z(t)‖² dt.  (3)
Theorem 18 Let P ≽ 0 be the stabilizing solution of (ARE) for Q ≽ 0. With any ∆ as above and any ξ ∈ R^n, all responses x ∈ L_{2,loc}^n of the interconnection
[ ẋ ; z ] = [ A  B ; −F  0 ] [ x ; u ], x(0) = ξ, u = ∆(z)  (4)
satisfy lim_{t→∞} x(t) = 0.
Proof
Consider any response x ∈ L_{2,loc}^n of the interconnection. We infer that z(·) = −F x(·) ∈ L_{2,loc}^m and thus u(·) = ∆(z(·)) ∈ L_{2,loc}^m. We can hence merge (1) and (3) to infer
ε ∫₀^T ‖z(t)‖² dt ≤ ξ^T P ξ
for all T > 0. This implies z(·) ∈ L_2^m!
Then (2) leads to
∫₀^T ‖u(t)‖² dt ≤ γ² ∫₀^T ‖z(t)‖² dt ≤ γ² ∫₀^∞ ‖z(t)‖² dt < ∞
for all T > 0 and hence u(·) ∈ L_2^m.
Since A − BF is Hurwitz, (A, −F) is detectable. Then the statement follows from Lemma 17.
Example: Static Gains
Let D ∈ R^{m×m} be a static gain matrix such that
[ I ; D ]^T [ −R  R ; R  0 ] [ I ; D ] ≻ 0.
Then ∆ defined through ∆(z)(t) := D z(t) for t ≥ 0 satisfies all required hypotheses. Hence (4), described in this case through
ẋ = (A − B D F) x, x(0) = ξ,
satisfies, for any initial condition, x(t) → 0 for t → ∞.
The inequality characterizing allowed D's can be interpreted in various fashions. Here is a simple one: we can allow for D = dI with d ∈ (½, ∞); classically this means that the system has an impressive gain margin.
If F is the optimal LQ-gain, it can be changed to dF for d ∈ (½, ∞) without endangering stability of the closed-loop system.
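The gain margin can be probed numerically; a sketch with double-integrator data (an arbitrary choice for illustration):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# With the optimal gain F, the perturbed closed loop A - d*B*F stays
# Hurwitz for every tested d > 1/2.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

P = solve_continuous_are(A, B, Q, R)
F = np.linalg.solve(R, B.T @ P)

for d in [0.51, 0.6, 1.0, 5.0, 100.0]:
    assert np.all(np.linalg.eigvals(A - d * B @ F).real < 0)
```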
Example: Static Nonlinearities
Let N : R^m → R^m be Lipschitz-continuous and suppose there exist γ ∈ R and ε > 0 such that for all z ∈ R^m:
‖N(z)‖ ≤ γ‖z‖ and [ z ; N(z) ]^T [ −R  R ; R  0 ] [ z ; N(z) ] ≥ ε‖z‖².
Then ∆ defined through ∆(z)(t) := N(z(t)) for t ≥ 0 satisfies all required hypotheses. Hence (4), described in this case through
ẋ = Ax − B N(F x), x(0) = ξ,
satisfies, for any initial condition, x(t) → 0 for t → ∞.
• Lipschitz-continuity of N(·) implies that the initial value problem has a unique solution. The conditions then also guarantee that there is no finite escape time.
• Our result covers a much larger class of ∆'s, which can be generated by finite- or infinite-dimensional dynamical systems. We have just scratched the surface of an area which is called robust control.
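A simulation sketch with an assumed sector-bounded nonlinearity N acting like a gain in (0.7, 1], which satisfies the two conditions above for R = I:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])       # assumed example data
B = np.array([[0.0], [1.0]])
P = solve_continuous_are(A, B, np.eye(2), np.eye(1))
F = B.T @ P                                  # optimal gain for R = I

def N(z):
    # Lipschitz, ||N(z)|| <= ||z|| and 2 z^T N(z) - ||z||^2 >= 0.4 ||z||^2
    return z * (0.7 + 0.3 * np.exp(-(z @ z)))

x, dt = np.array([1.0, -1.0]), 1e-3
for _ in range(int(60.0 / dt)):              # explicit Euler simulation
    x = x + dt * (A @ x - B @ N(F @ x))

assert np.linalg.norm(x) < 1e-3              # state is driven to zero
```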
Example: Segway
If compared to the pole-placement controller (blue), the LQ-controller (green) leads to better robustness properties.
[Figure: responses of control, position p and angle θ over 0–20 s for gain uncertainties d = 1, 0.8, 0.7]
Example: Segway
The margin d = 0.5 is tight, as seen from the next simulations for the LQ-controller (green) only.
[Figure: responses of control, position p and angle θ over 0–50 s for gain uncertainties d = 0.6, 0.5, 0.48]