Discrete-time LQR and system discretization

© Copyright F.L. Lewis 2008
All rights reserved
Updated: Monday, December 24, 2012
These notes are based on the book
F.L. Lewis, D. Vrabie, and V.L. Syrmos, Optimal Control, 3rd edition, John Wiley 2013.
More details and examples are found in that book. Practical applications are given in
F.L. Lewis, Applied Optimal Control and Estimation:
Digital Design and
Implementation, Prentice-Hall, New Jersey, TI Series, Feb. 1992.
Feedback Control for Discrete-Time Systems
Discrete-time design for feedback controls yields Digital Controllers that can be implemented as
difference equations on a digital computer.
A discrete-time system is given by
xk 1  Axk  Buk
with xk  R n , uk  R m . The initial condition is x0 . We seek to find a state-variable feedback
(SVFB) control
uk   Kxk  vk
that gives desirable closed-loop properties. The closed-loop system using this control becomes
xk 1  ( A  BK ) xk  Bvk  Ac xk  Bvk
with Ac the closed-loop plant matrix and vk the new command input.
The ideas behind feedback control are the same for CT systems and DT systems, though
some of the formulae change.
The equivalence of pole placement and reachability still holds, as well as the definition of
stabilizability, where the requirement is only to stabilize the system using SVFB, not place all
the poles. Ackermann’s formula works for both CT and DT systems.
Deadbeat Control
There is one notion that is unique to DT systems, that of deadbeat control. In deadbeat
control, one selects the SVFB to place ALL POLES AT THE ORIGIN z=0. Then the closedloop matrix Ac has the characteristic polynomial
c ( z)  z n
and according to the Cayley-Hamilton theorem, ones has
 c ( A)  An  0
Under this circumstance the closed-loop state trajectories are given by
2.1
k 1
xk  Ack x0   Ack i 1 Bvi
i 0
Then, if the input vk =0, deadbeat control yields
xn  An x0  0 .
That is, the state is driven to zero after n time steps.
Deadbeat control allows one to obtain behavior in sampled systems that cannot be
matched using CT design. Specifically, proper selection of the sample period T and deadbeat
design can allow one to achieve very fast settling times with NO OVERSHOOT. In effect, there
is no tradeoff between settling time and POV using deadbeat control.
Discrete-Time Linear Quadratic Regulator (DT LQR) State Feedback Design
Given the discrete-time system
xk 1  Axk  Buk
we now seek to find a state-variable feedback (SVFB) control
uk   Kxk
that minimizes the DT performance index
J ( xk ) 

1
2
 ( x Qx u
T
i
i k
i
T
i
Rui )
(1)
with design weighting matrices Q  QT  0, R  RT  0 . Note that this cost function also
depends on all the future control inputs uk , uk 1 ,  which we do not show in its argument.
This is known as the DT linear quadratic regulator problem (DT LQR), since the system
is linear and the cost is quadratic.
Substituting the SVFB control into this yields
J ( xk ) 

1
2
x
i k
T
i
(Q  K T RK ) xi
The closed-loop system using SVFB becomes
xk 1  ( A  BK ) xk  Ac xk
A difference equation equivalent to (1) is

J ( xk ) 
1
2
 x Qx
 ukT Ruk  
J ( xk ) 
1
2
 x Qx
 ukT Ruk  J ( xk 1 )
or
T
k
T
k
k
k
 ( x Qx u
i  k 1
T
i
i
T
i

Rui )
(2)
2.2
where one requires the boundary condition J ( xk  0)  0 . That is, if one can solve (2) given a
control input sequence, it is the same as finding J ( xk ) for the given current state xk by
evaluating the infinite sum in (1).
Now ASSUME that the optimal cost, i.e. the minimum cost, is given for all k in the form
J * ( xk )  xkT Pxk
(3)
That is, suppose the optimal cost is quadratic in terms of the current state and in terms of some
unknown kernel matrix P. If we can find the optimal feedback in terms of this assumption, then
the assumption shall turn out to be valid.
Note that J * ( xk ) only depends on the initial state xk, not the future inputs, since the
optimal cost is defined by selecting all future feedback controls as
uk   Kxk
with K the optimal SVFB gain.
To find the optimal SVFB and the optimal cost kernel P, substitute (3) into (2) to obtain
xkT Pxk 
1
2
 x Qx
T
k
k

 ukT Ruk  xkT1 Pxk 1
Note that both J ( xk ), J ( xk 1 ) are expressed in terms of the same kernel P, according to the
assumption (3).
Put xk 1  Axk  Buk into this equation to obtain
J ( xk )  xkT Pxk 
1
2
 x Qx
T
k
k

 ukT Ruk  ( Axk  Buk )T P( Axk  Buk )
(4)
The optimal control problem is now to minimize this with respect to uk. To do this, differentiate
to get

0
J ( xk )  xkT Pxk  Ruk  BT P( Axk  Buk )
uk
This yields the optimal control as
( R  BT PB)uk   BT PAxk
or
uk  ( R  BT PB) 1 BT PAxk
This defines the optimal gain as
K  ( R  BT PB) 1 BT PA
(5)
To find P, put uk   Kxk into (4) to get
xkT  ( A  BK )T P ( A  BK )  P  Q  K T RK  xk  0
Since this must hold for all current states xk, one has the matrix equation
( A  BK )T P( A  BK )  P  Q  K T RK  0
Note that this is a Lyapunov equation in terms of the closed-loop system matrix
Ac  A  BK
Now put (5) into this to obtain
2.3
[ A  B( R  BT PB) 1 BT PA]T P[ A  B( R  BT PB) 1 BT PA]  P  Q
 [( R  BT PB ) 1 BT PA]T R[( R  BT PB) 1 BT PA]  0
whence some simplification and determination yields
AT PA  P  Q  AT PB( R  BT PB) 1 BT PA  0
This equation is of prime importance in modern control theory. It is known as the DT algebraic
Riccati equation (DT ARE).
At this point we have succeeded in solving the DT LQR problem, therefore all our
assumptions are justified.
The design procedure for finding the LQR feedback K is:

Select design parameter weighting matrices Q  QT  0, R  RT  0

Solve the DT algebraic Riccati equation for P
AT PA  P  Q  AT PB( R  BT PB) 1 BT PA  0

Find the SVFB using K  ( R  BT PB) 1 BT PA

Verify performance. If not suitable try again for different weights Q, R.
There are very good numerical procedures for solving the DT ARE. The MATLAB routine that
performs this is named dlqr(A,B,Q,R).
In keeping with modern design techniques, one solves a matrix quadratic equation for the
auxiliary matrix P given (A,B,Q,R). Then, the optimal SVFB gain is given by (5). The minimal
value of the PI using this gain is given by
J * ( xk )  xkT Pxk
(which only depends on the current initial condition. This mean that the cost of using the
optimal SVFB can be computed from the current initial conditions before the control is ever
applied to the system.
Compare the DT ARE and optimal gain to the CT ARE and gain
AT P  PA  Q  PBR 1 B T P  0
K  R 1 B T P
Evidently, design equations are more complex for DT systems.
The LQR design procedure, either Ct or DT, is guaranteed to produce a feedback that
stabilizes the system as long as some basic properties hold:
LQR Theorem. Let the system (A,B) be reachable. Let R be positive definite and Q be
positive definite. Then the closed loop system (A-BK) is asymptotically stable.
2.4
Note that this holds regardless of the stability of the open-loop system. Recall that reachability
can be verified by checking that the reachability matrix U  [ B AB A 2 B  A n1 B ] has full rank
n.
The following milder version of this result also holds in DT.
LQR Theorem 2. Let the system (A,B) be stabilizable. Let R be positive definite, Q be
positive semi definite, and ( A, Q ) be observable. Then the closed loop system (A-BK)
is asymptotically stable.
2.5
DISCRETIZATION OF CONTINUOUS SYSTEMS
In this section we shall discuss converting a continuous-time system into a discrete-time
system. We shall explicitly include the ZOH required to convert the discrete control samples uk into
the plant control input u(t).
Sampling The Plant
In this subsection we shall show how to convert a continuous-time state-variable description
of a plant into a discrete-time state-variable description.
Suppose a continuous time-invariant plant is given by
.
x= Ax + Bu
y= Cx + Du
(i)
(ii)
with x(t)ε Rn, measured output y(t)ε Rp, and control input u(t)ε Rm. For digital control purposes it
is desired to define a discrete time index k such that
t= kT
(iii)
with T the sampling period. Then, the discrete control input uk is to be switched at times kT, k=
0,1,...,N-1 by the microprocessor.
The usual procedure for controlling the plant is to hold the control input u(t) constant
between control switchings. This may be achieved by adding a ZOH before the plant. See the
figure.. Then, the continuous plant input u(t) is given in terms of the discrete control uk by
u(t)= uk , kT t< (k+1)T.
(iv)
Note that u(t) is switched at the times kT so that it is continuous from the right.
Also shown in the figure is a sampler with period T added to the output channel of the plant.
This A/D device generates the samples
yk= y(kT)
(v)
of the output. Let us also define the samples
u(t)
y(t)
●
● ● ●
●
0 1
Tuk
2 3
●
● yk
u(t)
ZOH ● ●
4 hold
5 6 7 8
Sample y(t)
●
x  Ax  Bu
t= kT
k
2.6
x(t)
uk
●
●
●
●
●
T
C
y(t)
T●
●
●
0 1 2 3 4 sample
5 6 7 8
Hold uk
Sampling a continuous-time system
yk
t= kT
k
xk= x(kT)
(vi)
of the state vector.
It is now required to determine a dynamical relation between uk and xk such that
xk+1= Asxk + Bsuk.
(vii)
That is, we need to determine the sampled equivalents As, Bs of A and B.
To achieve this, first write the solution of (i), which is
x (t )  e
A ( t  t0 )
t
x (t0 )   e A( t  ) Bu ( ) d
(viii)
t0
Setting t0= kT and t= (k+1)T yields
( k 1)T
x (( k  1)T )  e x( kT ) 
AT

e A (( k 1)T  ) Bu ( ) d
kT
Since u(t) has the constant value of uk over the sample period due to the ZOH, we may extract it
from the integrand. Then, changing variables to λ= τ-kT yields
T
xk 1  e xk   e A(T  ) B d  uk
AT
0
Changing variables again to τ= T-λ yields finally
T
xk 1  e xk   e A B d uk
AT
0
2.7
By comparison with (i) we may now identify the discretized plant matrices as
As  e AT
(ix)
T
B   e A B d
s
(x)
0
It is important to notice that the discretized plant matrix As is always nonsingular.
Since the output equation is a nondynamical relation, we may simply write
yk= Cxk + Duk.
(xi)
That is, the C and D matrices are unchanged on discretization.
The design of digital controls proceeds as follows. First, a continuous-time state-variable
model of the plant is derived. Then, As and Bs are determined using the above equations. Finally,
the techniques to be presented are used to design a discrete control sequence uk. During the
implementation phase when the control is actually applied to the plant, u(t) is manufactured by
passing uk through a ZOH. Commercial digital signal processors (DSPs) will usually have a ZOH
built in.
It is important to clearly realize that this approach to digital controls design explicitly
includes the effects of the ZOH, since u(t) was assumed constant over the sample period in deriving
Bs. Thus, the resulting controller takes into account the properties of the hold and sampling
processes, guaranteeing exact behavior at the sample instants.
DISCRETIZATION OF PERFORMANCE INDEX
The LQR performance index for the continuous time system (7.1.1) is

J
1 T
x Qx  u T Ru dt

20
The corresponding performance index for the discrete-time system (7.1.7) is
JS 

1
2
 (x Q
ik
T
i
S
xi uiT R S ui )
with
Q S  QT , R S  R / T .
Note that the weighting matrices Q, R must also be discretized according to these formulas.
2.8
Example- Discrete-Time LQR Design
The inverted pendulum is notoriously difficult to stabilize using classical techniques. Here we
will use MATLAB to design a discrete-time LQR for the inverted pendulum.
Open-Loop Analysis
Taking the state as x  [ p p  ] T , with p(t) the cart position and (t) the rod angle, a
representative inverted pendulum is described by the continuous-time dynamics
Apend =
0
0
0
0
1 0
0 -1
0 0
0 9
0
0
1
0
Bpend =
0
0.1000
0
-0.1000
Cpend =
0
0
1
0
Dpend =
0
2.9
We select a sampling period of T=0.1 sec. This is smaller than the system time constant of 1/3 sec.
Using ZOH discretization with MATLAB routine c2d, the sampled data dynamics are given by
>> [ad,bd]=c2d(Apend,Bpend,0.1)
ad =
1.0000 0.1000 -0.0050 -0.0002
0 1.0000 -0.1015 -0.0050
0
0 1.0453 0.1015
0
0 0.9136 1.0453
bd =
0.0005
0.0100
-0.0005
-0.0102
LQR Design #1
Select the continuous-time PI weighting matrices as
Q=
1
0
0
0
0 0 0
1 0 0
0 10 0
0 0 10
R=
0.1000
Then, the sampled LQR weighting matrices are
Q S  QT , R S  R / T .
with T= 0.1 sec.
Now, the DT ARE is solve using MATLAB routine using
2.10
[K,P] dlqr(ad,bd,Q*T,R/T)
Now MATLAB routine dlsim(.) can be used to simulate the closed-loop discrete-time system. The
resulting response is indistinguishable from the continuous-time LQR result shown before. The
result is shown below, where the solid line is pk and the dotted line is k
0.3
0.25
0.2
0.15
0.1
0.05
0
-0.05
0
2
4
6
8
10
2.11
12
14
16
18
20

Download Report

Discrete-time LQR and system discretization

Paperzz.com

Your Paperzz