Derivation of HJI Constrained – Hong Kong Version

Derivation of HJI Constrained – Hong Kong Version
Revised: Friday, March 12, 2004
Bounded L2 Gain Solution for Input-Constrained Systems
For the system
x  f ( x)  g ( x)u  k ( x)d
yx
z
z   ( x, u )
y
where z  hT h  u , one desires to find
2
2
d
yx
u
z   ( x, u )
a control u(t) such that, for a prescribed ,
when x(0)= 0 and for all disturbances
d (t )  L2 one has

x  f ( x)  g ( x)u  k ( x) d
u  l ( y)


2
z (t ) dt

0

 d (t )
2
 (h
T
h  u ) dt
2
  2,
0

 d (t )
dt
0
2
dt
0
i.e. L2-gain less than or equal to .
Game Theory Saddle Point
Define the value functional


J (u, d )   h T h  u   2 d
2
2
 dt
0
and consider the zero sum differential game where player u(t) (the optimal control) seeks to
minimize J and player d(t) (the worst case disturbance) seeks to maximize J. If the resulting
solution satisfies J<0 for all d(t), then the L2-gain is less than or equal to . This optimal control
problem has a unique solution if a game theoretic saddle point exists, i.e. if
max min J ( x0 , u, d )  min max J ( x0 , u, d )
(1)
d
u
u
d
with x(0)=x0 the initial condition.
To solve the optimal control problem, introduce a Lagrange multiplier p(t) and define the
Hamiltonian
H ( x, p, u, d )  p T  f  gu  kd   h T h  u
2
 2 d .
2
Since H is separable in u, d, necessary conditions for a stationary point are given by
H
H
0
, 0
.
u
d
A unique saddle point H ( x, p, u*, d *) for H(x,p,u,d) exists at (u*,d*) if
1
max min H ( x, p, u, d )  min max H ( x, p, u, d )
u
d
u
d
(2)
which is equivalent to
H ( x, p, u*, d )  H ( x, p, u*, d *)  H ( x, p, u, d *)
or
2H
2H

0
,
 0.
u 2
d 2
Though (1) implies (2), the converse is not true. When (2) holds and (1) does not hold,
one solves the minimax problem to obtain the practically useful solution
min max J ( x0 , u , d ) .
u
d
This is accomplished by first finding the worst case disturbance d(t), then holding it fixed while
finding the optimal control u(t).
Constrained Controls
Define the norm d
2
 d T d . To make sure the control u(t) is constrained with prescribed
saturation function (.), define the object
u
u
2
q
 2  T ( )d
0
where 
T
(u)u  0 . This is a quasi-norm, i.e.
0 x0
1.
x
2.
x y
3.
x
q
q
q
 xq y
 x
q
q
Note that the third property of symmetry is weaker than the norm property of homogeneity.
Now one has the Hamiltonian
H ( x, p, u, d )  p
T
f
u
 gu  kd   h h  2  T ( )d   2 d T d
T
0
and the stationarity conditions
H
0
 g T p  2 1 (u )
u
H
0
 k T p  2 2 d
d
so the optimal inputs are

u*   12  g T ( x) p
d* 
1
2
2

k T ( x) p .
Note that the control is guaranteed to be saturated with saturation function (.).
2
One can show that
u

H ( x, p, u, d )  H ( x, p, u*, d *)   d  d *  2   T ( )d   T (u*)(u  u*)
u*

2
2
2
where the last term is positive definite if  1 (u) is monotonically increasing. This implies
H ( x, p, u*, d )  H ( x, p, u*, d *)  H ( x, p, u, d *)
which guarantees a unique saddle point (u*,d*) for the Hamiltonian.
Bounded L2 Gain
Note that for any C1 function V ( x) : R n  R one has, along the system trajectories,
dV V V
V


x 
( f  gu  kd ) ,
dt
t
x
x
T
T
so that
u
dV
 h T h  2  T ( )d   2 d T d  H ( x,Vx , u, d ) .
dt
0
Suppose now there exists a C1 function V ( x) : R n  R , with V(0)=0, whose gradient
satisfies
V
H ( x,Vx , u*, d *) 
x
T
f
u*
 gu * kd *  h h  2   T ( )d   2 d *T d * = 0
T
(3)
0
for all x. Then one has
u



H ( x,Vx , u, d )  H ( x,Vx , u*, d *)   d  d *  2    T ( )d   T (u*)(u  u*) 
u *



2
u

H ( x,Vx , u, d )   d  d *  2   T ( )d   T (u*)(u  u*)
u*

Therefore,
2
2
2
2
2
u
 u T

dV
2
2
T
T
2 T
 h h  2  ( )d   d d    d  d *  2   ( )d   T (u*)(u  u*)
dt
0
u*

Selecting u(t)=u*(t) yields
2
u*
dV
 h T h  2   T ( )d   2 d T d  0 .
dt
0
(4)
Integrating this equation yields
T
u*


V ( x(T ))  V ( x(0))    h T h  2   T ( )d   2 d T d  dt  0
0
0

(5)
Select x(0)=0 and assume the system is asymptotically stable so that lim x(T )  0 . Noting that
T 
V(0)=0 one has
3
u*

 T

T
2


h
h

2

(

)
d

dt


(d T d ) dt
0 
0


0


so the L2 gain is less than .
Value Functions
Note that, for any prescribed u(t), d(t) one may set the Hamilton to zero,
0  H ( x,Vx , u, d ) 
V
x
T
f
u
 gu  kd   h T h  2  T ( )d   2 d T d
0
u

dV
 h T h  2  T ( )d   2 d T d
dt
0
(6)
This provides an infinitesimal equivalent to J(u,d). In fact, the value function or cost to go for
the selected u(t), d(t) is defined as


V ( x(t ))   h T h  u   2 d
2
2
 dt
t
whence Leibniz’s formula reveals the infinitesimal equivalent


dV V V
2
2


x   h T h  u   2 d .
dt
t
x
Therefore, solving (6) gives the cost to go for any given u(t), d(t). The optimal cost to go is
given by the value function which solves (3).
T
Stability of Closed-Loop System
So far V(x) only needs to satisfy the HJI equation. If in fact there is a solution V(x) to the HJI
with V(x)>0, and the system is zero-state observable through z(t), then the closed-loop system
using u(t)= u*(t) is locally asymptotically stable. In fact, letting d (t )  0 in (4) one has
u*
dV
 h T h  2   T ( )d   z (t )
dt
0
2
However, ZS observable means z (t )  0  x(t )  0 . The converse x(t )  0  z (t )  0 shows
that V  0 , so that the system is locally AS with Lyapunov function V(x).
Note further that if V(x)>0 then, for any d(t), the closed-loop system has bounded output
z(t). In fact if V(x)> 0, then according to (5) one has
u*
 T

T
2 T

h
h

2
0 
0  ( )d   d d  dt  V ( x(T ))  0
T
and the system has bounded L2 gain.
Determining the Optimal Controls
To determine the optimal controls one must solve the equation
4
H ( x,Vx , u*, d *) 
V
x
T
f
u*
 gu * kd *  h T h  2   T ( )d   2 d *T d *  0
0
with the optimal control and worst case disturbance

u*   12  g T ( x)V x
d* 
1
2
2

k T ( x)V x .
Substituting these into the optimal Hamiltonian yields the Hamilton-Jacobi-Isaacs equation
 1 dV 
  g T

dx 
2
dV 
1 dV
dV
 1 T dV   T
1
kk T
 0.
   h h  2   (v)dv  2
 f  g   g
dx 
dx  
4

dx
dx
2
0
This equation is generally impossible to solve analytically for most nonlinear systems.
T
T
5