
AN ADAPTIVE LEARNING SYSTEM YIELDING UNBIASED
PARAMETER ESTIMATES
DANIEL W. REPPERGER

Air Force Research Laboratory, AFRL/HECP, WPAFB, Ohio 45433, USA, [email protected]
Abstract. A learning system involving model reference adaptive control (MRAC) algorithms is studied in which
the Lyapunov function and its associated time derivative are simultaneously quadratic functions of both the position
tracking error and the parameter estimation error. An implementation method is described which expands results
from [5]. Initially it appears that the condition of persistent excitation (PE) need not be explicitly satisfied; however, this condition is actually implicit in the requirements for a solution.
Key Words: Adaptive Control, Learning System
1. INTRODUCTION
This paper will address a specific class of learning
systems or MRAC algorithms which have the
interesting property that both the Lyapunov function
and its associated time derivative along the motion
trajectory are quadratically dependent on both the
tracking error and the parameter estimation error. If
this type of algorithm can be successfully
implemented, then unbiased parameter estimates can
be obtained. The study of model reference adaptive
control was already well established at the time of the survey [1], when over 1,500 papers had been reported and numerous experimental results obtained. The notation used herein and a brief description of the conventional MRAC method are first discussed.
2. THE DIRECT METHOD OF THE STANDARD MRAC PROBLEM
Using the nomenclature of [2], the scalar tracking
error e0(t)= yp(t)-ym(t) in Figure 1 represents the
difference between the plant’s output yp and the
reference model output ym. The unknown parameter vector is θ, which is m x 1, and its adjustment mechanism only allows knowledge of yp(t), e0(t), and possibly their respective derivatives. A standard solution to this problem is given by Lemma 1 [3]:
Lemma 1
Let a state-space description (x ∈ R^n, e0 ∈ R^1, v ∈ R^m, θ ∈ R^m) of Figure 1 be of the form:

\dot{x} = A\,x + b\,[\,k\,\tilde{\theta}^T v(t)\,]     (1)

e_0 = c^T x     (2)

The remaining matrices are of appropriate dimensions, with A being Hurwitz; the scalar k is unknown (except for its sign), the pair (A, b) is completely controllable with b known, and v(t) is a measured variable to be defined in the sequel. For the parameter vector, θ* represents the true value, θ̂ is the estimate, and θ̃ = θ̂ - θ* is the parameter error. With some abuse of notation (mixing the Laplace transform variable p with the time domain variables), the error resulting from equation (2) admits the form:

e_0(t) = H(p)\,[\,k\,\tilde{\theta}^T v(t)\,]     (3)

The transfer function H(p) is strictly positive real (SPR) (stable, minimum phase, and of relative degree no greater than unity). The adaptation law is given by:

\dot{\hat{\theta}}(t) = -\,\mathrm{sgn}(k)\,\gamma\,e_0(t)\,v(t)     (4)

where γ > 0. Then e_0(t) and θ̃(t) are globally bounded and, if v(t) is bounded, e_0(t) → 0 as t → ∞.
Associated with this problem is a Lyapunov function (ξ = (x, θ̃)):

V(\xi) = V(x,\tilde{\theta}) = x^T P\,x + \frac{|k|}{\gamma}\,\tilde{\theta}^T\tilde{\theta}     (5)

and γ > 0 controls the rate of parameter adaptation, such that the parameters change more slowly than the effects they induce on the error signal e_0(t). It can be shown that:

(i) As ||ξ|| → ∞, V(ξ) → ∞ (radial unboundedness condition).

(ii) V(ξ) > 0 if ξ ≠ 0 (positive definite property).

(iii) V̇(ξ) ≤ 0 (negative semidefinite property of V̇): since A is Hurwitz, the Kalman-Yakubovich lemma applies (there exist positive definite matrices P and Q such that A^T P + P A = -Q and P b = c), and with the fact that H(p) is SPR it follows that:

\dot{V} = -\,x^T Q\,x \le 0     (6)
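For the reader's convenience, the computation behind (6) can be written out explicitly; it is a worked step that uses only (1), (2), (4), (5) and the Kalman-Yakubovich relations quoted above. Differentiating (5) along (1) gives

\[
\dot{V} = x^T\,(A^T P + P A)\,x + 2\,k\,\tilde{\theta}^T v\,(b^T P x)
        + \frac{2\,|k|}{\gamma}\,\tilde{\theta}^T \dot{\tilde{\theta}} .
\]

Using A^T P + P A = -Q and P b = c (so that b^T P x = c^T x = e_0 by (2)), together with \dot{\tilde{\theta}} = \dot{\hat{\theta}} = -\mathrm{sgn}(k)\,\gamma\,e_0\,v from (4),

\[
\dot{V} = -\,x^T Q\,x + 2\,k\,e_0\,\tilde{\theta}^T v - 2\,|k|\,\mathrm{sgn}(k)\,e_0\,\tilde{\theta}^T v
        = -\,x^T Q\,x ,
\]

since |k| sgn(k) = k, which is equation (6).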
It is noted that only e_0(t) → 0 as t → ∞ is guaranteed; the requirement that θ̃ → 0 may not occur, due to the fact that V̇ does not depend explicitly on θ̃. The persistent excitation condition can mitigate this situation [3,4].
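To make this remark concrete, the following minimal numerical sketch simulates the error model (1)-(2) under the adaptation law (4). The scalar example (A = -1, b = c = 1, k = 2, γ = 1) and the deliberately non-exciting constant regressor v(t) = [1, 1]^T are assumptions chosen only for illustration; they are not taken from the paper.

import numpy as np

# Minimal illustration of Lemma 1 with a regressor v(t) that is NOT
# persistently exciting.  All numerical values are assumptions for this example.
A, b, c, k, gamma = -1.0, 1.0, 1.0, 2.0, 1.0    # H(p) = 1/(p+1) is SPR
v = np.array([1.0, 1.0])                        # constant direction: not PE
theta_tilde = np.array([1.0, -0.5])             # initial parameter error
x = 0.0                                         # error state, e0 = c*x
dt, steps = 1e-3, 200_000

for _ in range(steps):
    e0 = c * x
    x_dot = A * x + b * (k * (theta_tilde @ v))                     # equation (1)
    theta_tilde = theta_tilde - np.sign(k) * gamma * e0 * v * dt    # equation (4)
    x = x + x_dot * dt

print("final |e0|            :", abs(c * x))                   # tends to zero
print("final ||theta_tilde|| :", np.linalg.norm(theta_tilde))  # remains away from zero

With this v(t), the output error is driven to zero, but the component of the parameter error that v never excites is left uncorrected; this is exactly the bias that the persistent excitation condition of the next section is intended to rule out.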
3. THE PERSISTENT EXCITATION CONDITION

If the reference model in Figure 1 satisfies:

\ddot{y}_m + \alpha_1\,\dot{y}_m + \alpha_2\,y_m = \alpha_2\,r(t)     (7)

where r(t) is the input forcing function to the reference model, the persistent excitation condition would normally require, on v = [r, e_0]^T and through the choice of r(t) in Figure 1, that:

\int_t^{\,t+T} v\,v^T\,d\tau \ge \varepsilon_1\,I     (8)

where ε1 > 0, I is the identity matrix, and T > 0, for any t > 0. With this condition in place, the biased parameter estimation problem can be mitigated.
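Condition (8) can be checked numerically by sliding a window of length T over the signal and examining the smallest eigenvalue of the resulting Gramian. The test signal, window length and threshold ε1 below are assumptions used only to exercise the check.

import numpy as np

# Numerical check of the PE condition (8):
#   integral over [t, t+T] of v v^T dtau  >=  eps1 * I   for all t.
dt = 1e-3
t = np.arange(0.0, 40.0, dt)
v = np.stack([np.sin(t), np.cos(t)])       # assumed two-component test signal, shape (2, N)

T_window, eps1 = 2.0 * np.pi, 0.1          # assumed window length and threshold
n_win = int(T_window / dt)

min_eig = np.inf
for i in range(0, v.shape[1] - n_win, 100):          # slide the window over t
    seg = v[:, i:i + n_win]
    gram = (seg @ seg.T) * dt                        # approximation of the integral in (8)
    min_eig = min(min_eig, np.linalg.eigvalsh(gram).min())

print("smallest windowed eigenvalue:", min_eig)
print("condition (8) holds with eps1 =", eps1, ":", bool(min_eig >= eps1))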
An alternative approach to this problem is now provided.

4. A NEW ADAPTATION ALGORITHM [5]

For this problem, first define a scalar sliding state variable s(t) as follows:

s(t) := \dot{\tilde{x}}(t) + \beta\,\tilde{x}(t)     (9)

where the term:

\tilde{x} \equiv x(t) - x_m(t) = y_p - y_m     (10)

will be used to represent the tracking error. The choice is now made of the following Lyapunov function:

V_1 = \tfrac{1}{2}\,\theta^*\,s(t)^2 + \tfrac{1}{2}\,\lambda_1\,\tilde{\theta}^2 + \tfrac{1}{2}\,\lambda_2\,\dot{\tilde{\theta}}^2     (11)

where s(t) is the tracking error variable of equation (9) and the positive constants λ1 and λ2 will be specified later. The time derivative of V1 along the motion trajectory is required to satisfy:

\dot{V}_1 = -\,\beta\,\theta^*\,s^2 - \tfrac{1}{2}\,\lambda_3\,\tilde{\theta}^2 - \tfrac{1}{2}\,\lambda_4\,\dot{\tilde{\theta}}^2     (12)

where β is identical to the constant used in (9), and the positive constants λ3 and λ4 will also be specified later. The following assumptions are implicit in what is to follow:

4.1 Assumptions

(1) The true parameter θ* is constant.

(2) s(t) satisfies the following relationship [5]:

\dot{s}(t) + \beta\,s(t) = \frac{\tilde{\theta}}{\theta^*}\,v(t)     (13)

where the measured position variable v(t) is specified via:

v(t) = \ddot{x}_m - 2\,\beta\,\dot{\tilde{x}} - \beta^2\,\tilde{x}     (14)

Hence it is required to measure the variable x_m and its first two derivatives, as well as to obtain measurements of x̃ and its first derivative.
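For concreteness, the two signals entering the algorithm, s(t) from (9) and v(t) from (14), can be formed from the measured quantities as follows; this is only a sketch, and the variable names and the value of β are illustrative assumptions.

def sliding_error(x_tilde, x_tilde_dot, beta):
    # Equation (9): s(t) = x_tilde_dot + beta * x_tilde
    return x_tilde_dot + beta * x_tilde

def regressor(x_m_ddot, x_tilde, x_tilde_dot, beta):
    # Equation (14): v(t) = x_m_ddot - 2*beta*x_tilde_dot - beta^2 * x_tilde
    return x_m_ddot - 2.0 * beta * x_tilde_dot - beta ** 2 * x_tilde

# Example call with assumed measurement values:
beta = 1.0
s = sliding_error(x_tilde=0.10, x_tilde_dot=-0.05, beta=beta)
v = regressor(x_m_ddot=1.0, x_tilde=0.10, x_tilde_dot=-0.05, beta=beta)
print("s(t) =", s, "  v(t) =", v)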
The algorithm now follows.

4.2 Derivation of the Algorithm

The goal is to simultaneously satisfy (11) and (12). Differentiating V1 of (11) yields:

\dot{V}_1 = \theta^*\,\dot{s}(t)\,s(t) + \lambda_1\,\tilde{\theta}\,\dot{\tilde{\theta}} + \lambda_2\,\dot{\tilde{\theta}}\,\ddot{\tilde{\theta}}     (15)

Using the relationship in equation (13), the following results:

\dot{V}_1 = s\,[\,-\,\beta\,\theta^*\,s + \tilde{\theta}\,v(t)\,] + \lambda_1\,\tilde{\theta}\,\dot{\tilde{\theta}} + \lambda_2\,\dot{\tilde{\theta}}\,\ddot{\tilde{\theta}}     (16)

which is required to equal V̇1 of equation (12). This will occur if the following relationship holds:

\tilde{\theta}\,v(t)\,s(t) + \lambda_1\,\tilde{\theta}\,\dot{\tilde{\theta}} + \lambda_2\,\dot{\tilde{\theta}}\,\ddot{\tilde{\theta}} = -\,\tfrac{1}{2}\,\lambda_3\,\tilde{\theta}^2 - \tfrac{1}{2}\,\lambda_4\,\dot{\tilde{\theta}}^2     (17)

This equation will now be simplified and various solutions examined. This extends the results of [5] to a larger class of solutions.

5. THE NONLINEAR EQUATION TO BE SATISFIED
For notational simplicity, denote Z(t) = θ̃ = θ̂ - θ*; since θ* is constant,

\dot{Z}(t) = \dot{\hat{\theta}}     (18)

and the adaptation law can then be specified independently of knowledge of the true parameter θ*. Also, for brevity, the variable φ(t) = v(t) s(t) is introduced; it is known from measured quantities (cf. equations (9) and (14)). Equation (17) now simplifies to the form:

\lambda_2\,\ddot{Z}\,\dot{Z} + \lambda_1\,\dot{Z}\,Z + \tfrac{1}{2}\,\lambda_4\,\dot{Z}^2 = Z\,[\,-\,\phi(t) - \tfrac{1}{2}\,\lambda_3\,Z\,]     (19)

The goal is to provide solutions of (19) which are stable and not trivial.
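For completeness, the step from (17) to (19) can be written out; it uses only the substitutions Z = θ̃ and φ(t) = v(t) s(t) introduced above. Substituting \tilde{\theta} = Z, \dot{\tilde{\theta}} = \dot{Z}, \ddot{\tilde{\theta}} = \ddot{Z} and \phi(t) = v(t)\,s(t) into (17) gives

\[
Z\,\phi(t) + \lambda_1\,Z\,\dot{Z} + \lambda_2\,\dot{Z}\,\ddot{Z}
  = -\,\tfrac{1}{2}\,\lambda_3\,Z^2 - \tfrac{1}{2}\,\lambda_4\,\dot{Z}^2 ,
\]

and moving the \dot{Z}^2 term to the left-hand side while factoring Z on the right-hand side yields

\[
\lambda_2\,\ddot{Z}\,\dot{Z} + \lambda_1\,\dot{Z}\,Z + \tfrac{1}{2}\,\lambda_4\,\dot{Z}^2
  = Z\,\bigl[\,-\,\phi(t) - \tfrac{1}{2}\,\lambda_3\,Z\,\bigr],
\]

which is equation (19).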
6. SOME ALTERNATIVE SOLUTIONS OF (19)
If the right-hand side of equation (19) is set to zero, the left-hand side becomes linear, since the common factor Ż can then be cancelled (the trivial solution Ż ≡ 0 being discarded). If, as a sufficient condition,

-\,\phi(t) - \tfrac{1}{2}\,\lambda_3\,Z = 0     (20)

then the following linear equation in Z has to be solved:

\lambda_2\,\ddot{Z} + \tfrac{1}{2}\,\lambda_4\,\dot{Z} + \lambda_1\,Z = 0     (21)

Since all λi > 0, i = 1, ..., 4, equation (21) is Hurwitz for properly selected λi.
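As a quick numerical aid for this selection (the particular values below are assumptions chosen only for illustration), the roots of the characteristic polynomial of (21), which set the decay rates of Z(t) in the solution (22) below, can be inspected directly:

import numpy as np

# Characteristic polynomial of (21):  lambda2 * r^2 + 0.5 * lambda4 * r + lambda1 = 0.
# lambda3 does not enter (21); it appears later through (24) and (26).
lam1, lam2, lam4 = 1.0, 1.0, 6.0            # assumed values, all positive

roots = np.roots([lam2, 0.5 * lam4, lam1])
print("characteristic roots     :", roots)      # both real parts are negative
print("decay exponents mu3, mu4 :", -roots)     # Z(t) = a1*exp(-mu3*t) + a2*exp(-mu4*t)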
This leads to the following methodology for the selection of the λi in equations (11)-(12):

7. METHOD OF SELECTION OF THE λi AND IMPLEMENTATION OF THE ALGORITHM

A three-step procedure will implement this algorithm:

Step 1: Pick the λi such that equation (21) is strictly Hurwitz, i.e. the solution of (21) is of the form:

Z(t) = a_1\,e^{-\mu_3 t} + a_2\,e^{-\mu_4 t}     (22)

where -μ3 and -μ4 are the roots of the characteristic equation of (21) and the real parts of μ3 and μ4 are both positive. Then lim_{t→∞} Z(t) → 0 and the parameters are unbiased, since Z(t) = θ̂ - θ*. The corresponding initial condition is:

Z(0) = a_1 + a_2 = \tilde{\theta}(0)     (23)

Step 2: From (20) this also implies that φ(t) must be of exponential order, since:

\phi(t) = -\,\tfrac{1}{2}\,\lambda_3\,Z = v(t)\,s(t)     (24)

which would now satisfy lim_{t→∞} φ(t) → 0. This means that both tracking error variables (containing s(t) in equation (9) and v(t) in equation (14)) would have to converge to zero. Thus both tracking error and parameter error convergence are established simultaneously.

Step 3: There still exists a caveat in the procedure so far. What is unknown is Z(0), since:

\tilde{\theta}(0) = \hat{\theta}(0) - \theta^* = -\,\theta^*     (25)

if the estimate is initialized as θ̂(0) = 0, which is usually the case. This implies that the true parameter θ* would have to be known in order to know Z(0). To circumvent this difficulty, the procedure is modified to determine Ż(t) rather than Z(t) via the following sequence of events:

(a) Solve equation (24) for Z(t) and substitute the result into (21). This yields:

\lambda_2\,\ddot{Z} + \tfrac{1}{2}\,\lambda_4\,\dot{Z} = 2\,\frac{\lambda_1}{\lambda_3}\,\phi(t)     (26)

(b) Now define a new variable:

Y(t) = \dot{Z} = \dot{\hat{\theta}}(t)     (27)

Then Y(t) is governed by a stable (Hurwitz) first-order equation and satisfies:

Y(0) = 0     (28)

\dot{Y}(t) + \tfrac{1}{2}\,\frac{\lambda_4}{\lambda_2}\,Y(t) = f(t)     (29)

where f(t) is of exponential order, since:

f(t) = \frac{2\,\lambda_1}{\lambda_2\,\lambda_3}\,\phi(t)     (30)

and φ(t) is of exponential order from equation (24).

(c) Thus the adaptation algorithm is to calculate Y(t) via (28)-(29) and then set:

\hat{\theta}(0) = 0     (31)

and

\dot{\hat{\theta}}(t) = Y(t)     (32)
Numerical simulations of examples are presented in Figures 2, 3 and 4, which will be discussed at the conference during this paper's presentation.
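Since the numerical examples themselves are deferred to the presentation, the following sketch shows how the final adaptation law could be wired up in code. It implements only equations (28)-(32); the measured quantity φ(t) = v(t) s(t), which in closed loop would be formed from (9) and (14), is replaced here by a synthetic exponentially decaying stand-in, and all numerical values are assumptions.

import numpy as np

# Sketch of Step 3 (a)-(c): integrate (29)-(30) for Y(t) and recover
# theta_hat(t) from (31)-(32).  The driving signal phi(t) = v(t)*s(t) would be
# measured in closed loop via (9) and (14); here a synthetic, exponentially
# decaying stand-in is used (an assumption, consistent with (24)).
lam1, lam2, lam3, lam4 = 1.0, 1.0, 1.0, 6.0   # assumed positive constants
dt, steps = 1e-3, 20_000

def phi_measured(t):
    return 0.5 * np.exp(-0.8 * t)             # stand-in for phi(t) = v(t) s(t)

Y = 0.0              # (28): Y(0) = 0
theta_hat = 0.0      # (31): theta_hat(0) = 0

for k in range(steps):
    t = k * dt
    f = 2.0 * lam1 / (lam2 * lam3) * phi_measured(t)   # (30)
    Y_dot = -0.5 * (lam4 / lam2) * Y + f               # (29)
    theta_hat += Y * dt                                # (32): d(theta_hat)/dt = Y
    Y += Y_dot * dt

print("Y at final time        :", Y)           # decays toward zero
print("theta_hat at final time:", theta_hat)   # settles to a constant value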
8. CONCLUSIONS AND DISCUSSION
A simple method of providing both parameter error
convergence and tracking error convergence is
demonstrated by taking a special case solution of a
nonlinear equation, which describes potential
adaptation algorithms. For this special case solution of the nonlinear equation, both the tracking error dynamics and the parameter estimation error dynamics can be guaranteed to be Hurwitz.
9. REFERENCES
[1] K. J. Astrom, “Theory and Applications of
Adaptive Control – A Survey,” Automatica, Vol. 19,
No. 5, pp. 471-486, 1983.
[2] S. Sastry and M. Bodson, Adaptive Control,
Stability, Convergence, and Robustness, Prentice
Hall, 1989.
[3] J.-J. E. Slotine and W. Li, Applied Nonlinear Control, Prentice-Hall Inc., 1991.
[4] K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall Inc., 1989.
[5] D. W. Repperger and J. H. Lilly, “A Study on a
Class of MRAC Algorithms,” Proceedings of the
1999 IEEE International Conference on Decision and
Control, December 1999, Phoenix, Arizona.