Derivation of HJI Constrained – Hong Kong Version Revised: Friday, March 12, 2004 Bounded L2 Gain Solution for Input-Constrained Systems For the system x f ( x) g ( x)u k ( x)d yx z z ( x, u ) y where z hT h u , one desires to find 2 2 d yx u z ( x, u ) a control u(t) such that, for a prescribed , when x(0)= 0 and for all disturbances d (t ) L2 one has x f ( x) g ( x)u k ( x) d u l ( y) 2 z (t ) dt 0 d (t ) 2 (h T h u ) dt 2 2, 0 d (t ) dt 0 2 dt 0 i.e. L2-gain less than or equal to . Game Theory Saddle Point Define the value functional J (u, d ) h T h u 2 d 2 2 dt 0 and consider the zero sum differential game where player u(t) (the optimal control) seeks to minimize J and player d(t) (the worst case disturbance) seeks to maximize J. If the resulting solution satisfies J<0 for all d(t), then the L2-gain is less than or equal to . This optimal control problem has a unique solution if a game theoretic saddle point exists, i.e. if max min J ( x0 , u, d ) min max J ( x0 , u, d ) (1) d u u d with x(0)=x0 the initial condition. To solve the optimal control problem, introduce a Lagrange multiplier p(t) and define the Hamiltonian H ( x, p, u, d ) p T f gu kd h T h u 2 2 d . 2 Since H is separable in u, d, necessary conditions for a stationary point are given by H H 0 , 0 . u d A unique saddle point H ( x, p, u*, d *) for H(x,p,u,d) exists at (u*,d*) if 1 max min H ( x, p, u, d ) min max H ( x, p, u, d ) u d u d (2) which is equivalent to H ( x, p, u*, d ) H ( x, p, u*, d *) H ( x, p, u, d *) or 2H 2H 0 , 0. u 2 d 2 Though (1) implies (2), the converse is not true. When (2) holds and (1) does not hold, one solves the minimax problem to obtain the practically useful solution min max J ( x0 , u , d ) . u d This is accomplished by first finding the worst case disturbance d(t), then holding it fixed while finding the optimal control u(t). Constrained Controls Define the norm d 2 d T d . To make sure the control u(t) is constrained with prescribed saturation function (.), define the object u u 2 q 2 T ( )d 0 where T (u)u 0 . This is a quasi-norm, i.e. 0 x0 1. x 2. x y 3. x q q q xq y x q q Note that the third property of symmetry is weaker than the norm property of homogeneity. Now one has the Hamiltonian H ( x, p, u, d ) p T f u gu kd h h 2 T ( )d 2 d T d T 0 and the stationarity conditions H 0 g T p 2 1 (u ) u H 0 k T p 2 2 d d so the optimal inputs are u* 12 g T ( x) p d* 1 2 2 k T ( x) p . Note that the control is guaranteed to be saturated with saturation function (.). 2 One can show that u H ( x, p, u, d ) H ( x, p, u*, d *) d d * 2 T ( )d T (u*)(u u*) u* 2 2 2 where the last term is positive definite if 1 (u) is monotonically increasing. This implies H ( x, p, u*, d ) H ( x, p, u*, d *) H ( x, p, u, d *) which guarantees a unique saddle point (u*,d*) for the Hamiltonian. Bounded L2 Gain Note that for any C1 function V ( x) : R n R one has, along the system trajectories, dV V V V x ( f gu kd ) , dt t x x T T so that u dV h T h 2 T ( )d 2 d T d H ( x,Vx , u, d ) . dt 0 Suppose now there exists a C1 function V ( x) : R n R , with V(0)=0, whose gradient satisfies V H ( x,Vx , u*, d *) x T f u* gu * kd * h h 2 T ( )d 2 d *T d * = 0 T (3) 0 for all x. Then one has u H ( x,Vx , u, d ) H ( x,Vx , u*, d *) d d * 2 T ( )d T (u*)(u u*) u * 2 u H ( x,Vx , u, d ) d d * 2 T ( )d T (u*)(u u*) u* Therefore, 2 2 2 2 2 u u T dV 2 2 T T 2 T h h 2 ( )d d d d d * 2 ( )d T (u*)(u u*) dt 0 u* Selecting u(t)=u*(t) yields 2 u* dV h T h 2 T ( )d 2 d T d 0 . dt 0 (4) Integrating this equation yields T u* V ( x(T )) V ( x(0)) h T h 2 T ( )d 2 d T d dt 0 0 0 (5) Select x(0)=0 and assume the system is asymptotically stable so that lim x(T ) 0 . Noting that T V(0)=0 one has 3 u* T T 2 h h 2 ( ) d dt (d T d ) dt 0 0 0 so the L2 gain is less than . Value Functions Note that, for any prescribed u(t), d(t) one may set the Hamilton to zero, 0 H ( x,Vx , u, d ) V x T f u gu kd h T h 2 T ( )d 2 d T d 0 u dV h T h 2 T ( )d 2 d T d dt 0 (6) This provides an infinitesimal equivalent to J(u,d). In fact, the value function or cost to go for the selected u(t), d(t) is defined as V ( x(t )) h T h u 2 d 2 2 dt t whence Leibniz’s formula reveals the infinitesimal equivalent dV V V 2 2 x h T h u 2 d . dt t x Therefore, solving (6) gives the cost to go for any given u(t), d(t). The optimal cost to go is given by the value function which solves (3). T Stability of Closed-Loop System So far V(x) only needs to satisfy the HJI equation. If in fact there is a solution V(x) to the HJI with V(x)>0, and the system is zero-state observable through z(t), then the closed-loop system using u(t)= u*(t) is locally asymptotically stable. In fact, letting d (t ) 0 in (4) one has u* dV h T h 2 T ( )d z (t ) dt 0 2 However, ZS observable means z (t ) 0 x(t ) 0 . The converse x(t ) 0 z (t ) 0 shows that V 0 , so that the system is locally AS with Lyapunov function V(x). Note further that if V(x)>0 then, for any d(t), the closed-loop system has bounded output z(t). In fact if V(x)> 0, then according to (5) one has u* T T 2 T h h 2 0 0 ( )d d d dt V ( x(T )) 0 T and the system has bounded L2 gain. Determining the Optimal Controls To determine the optimal controls one must solve the equation 4 H ( x,Vx , u*, d *) V x T f u* gu * kd * h T h 2 T ( )d 2 d *T d * 0 0 with the optimal control and worst case disturbance u* 12 g T ( x)V x d* 1 2 2 k T ( x)V x . Substituting these into the optimal Hamiltonian yields the Hamilton-Jacobi-Isaacs equation 1 dV g T dx 2 dV 1 dV dV 1 T dV T 1 kk T 0. h h 2 (v)dv 2 f g g dx dx 4 dx dx 2 0 This equation is generally impossible to solve analytically for most nonlinear systems. T T 5
© Copyright 2026 Paperzz