Stable adaptive control with recurrent networks

Automatica 36 (2000) 5–22

Grzegorz J. Kulawski*,1, Mietek A. Brdyś

Shell International Exploration and Production B.V., Research and Technical Services, Volmerlaan 8, P.O. Box 60, 2280 AB Rijswijk, The Netherlands
School of Electronic and Electrical Engineering, The University of Birmingham, Edgbaston, Birmingham B15 2TT, UK

Received 4 November 1997; revised 9 December 1998; received in final form 6 March 1999

0005-1098/00/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0005-1098(99)00092-8
The paper describes an adaptive control scheme for uncertain nonlinear plants with unmeasurable state, based on dynamic neural networks. Theoretical stability analysis and simulation examples are presented.
Abstract

An adaptive control technique for nonlinear stable plants with unmeasurable state is presented. It is based on a recurrent neural network employed as a dynamical model of the plant. Using this dynamical model, a feedback linearizing control is computed and applied to the plant. The parameters of the model are updated on-line to allow for a partially unknown and time-varying plant. The stability of the scheme is shown theoretically, and its performance and the limitations of the assumptions are illustrated in simulations. It is argued that appropriately structured recurrent neural networks can provide conveniently parameterized dynamic models for many nonlinear systems for use in adaptive control. © 1999 Elsevier Science Ltd. All rights reserved.

Keywords: Dynamic neural networks; Nonlinear adaptive control; Learning
1. Introduction
From the point of view of algorithm design, the universal approximation properties of static networks, shown for example in Funahashi (1989), Cybenko (1989) and Hornik, Stinchcombe and White (1989), make them a potentially useful tool for nonlinear control and identification problems. In addition, their massively parallel structure, combined with the simplicity of single neurons, brings implementation benefits such as speed of computation and fault tolerance due to natural redundancy. Realisation of these facts recently sparked a great deal of research activity in the control community, for which the early paper (Narendra & Parthasarathy, 1990) takes much of the credit. A number of different control structures utilising neural networks have subsequently been proposed, with multilayer perceptron (MLP) and radial basis function (RBF) networks becoming the most popular neural architectures.

This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor A.E. de Barros Ruano under the direction of Editor C.C. Hang.
* Corresponding author. Tel.: +31-70-311-6089; fax: +31-70-311-2521.
E-mail addresses: [email protected] (G.J. Kulawski), [email protected] (M.A. Brdyś).
1 The paper was written during the author's stay at The University of Birmingham.
After the early enthusiasm and wealth of ideas, the challenge was to put neural networks more firmly into the control engineering context and rigorously address such essential issues as guarantees of performance, robustness and stability. Control algorithms in which neural networks are trained off-line prior to their application in on-line operation, where no further weight adjustments are made, are relatively simple from the viewpoint of mathematical analysis. This approach was taken, for example, in two papers (Levin & Narendra, 1993, 1996), where a number of control methods using multilayer perceptrons trained off-line are presented and supported by solid mathematical treatment.
The design of provably stable adaptive neural controllers posed more difficult theoretical questions. One of the early adaptive schemes with local stability results was presented in Chen and Khalil (1991). A globally stable nonlinear adaptive controller using RBF networks was proposed in Sanner and Slotine (1992). The fact that the output of an RBF network is linear in the adjustable parameters proved important for the design of a stable parameter update law. Finally, global stability has also
been shown for an adaptive controller using a single hidden layer MLP network, in which both output and hidden weights are tuned, first in Lewis, Yesildirek and Liu (1993), and later further developed in Chang, Fu and Yang (1996). These developments are very much based on results from the theory of adaptive control of linear plants (see, for example, Narendra & Annaswamy, 1989). In the case of linearly parameterised RBF networks, the existing adaptive schemes could be applied in a relatively straightforward manner, while the nonlinear parameterisation of MLP networks was a significant obstacle. It was then realised that, due to the specific properties of sigmoid activation functions, control laws with approximations of unknown functions realised by MLP networks can be fitted into a framework which allows the use of methods from adaptive control for linear systems.
While significant progress has been achieved in applications of static neural networks to control, these algorithms are limited to the case when the full state of the controlled plant is available for measurement. Output feedback control (when the full state is not measurable), and especially adaptive output feedback control for nonlinear systems, remains one of the outstanding problems which are, at present, very actively researched. Systematic design techniques have only been developed for specific classes of nonlinear systems, e.g. Marino and Tomei (1995) and Krstić, Kanellakopoulos and Kokotović (1995).
Lack of state measurements makes the problem particularly difficult, as some kind of dynamic model of the system needs to be used for control law synthesis. For linear systems with known parameters, owing to the separation principle, the output feedback control design can be conveniently split into the independent design of a stable Luenberger observer and a stable state feedback law. However, there is no similar separation principle in the adaptive context, even for linear systems, let alone nonlinear ones. In the case of adaptive control of linear plants, special parameterisations of linear dynamic systems, nonminimal in terms of state dimension, have been developed. These special models allowed the formulation of stable adaptive control techniques; see e.g. Narendra and Annaswamy (1989) and Sastry and Bodson (1989).
No such general framework exists for nonlinear systems, and there is a need for some kind of nonlinear dynamic model which allows the synthesis of the control input for the plant to be performed and is at the same time "manageable" enough, allowing analytical treatment and the incorporation of mechanisms ensuring stability. Already in Narendra and Parthasarathy (1990) the possibility of using dynamic neural models, combining nonlinearities with dynamics, for the identification of nonlinear plants was demonstrated in simulations. Still, however, relatively little has been done as far as the application of such models for control, and particularly for adaptive control, is concerned. Most papers present heuristic approaches (e.g. Gupta, Rao & Nikiforuk, 1993; Sastry, Santharam & Unnikrishnan, 1994; Parlos, Chong & Atiya, 1994), and a comprehensive convergence analysis of adaptive schemes is still missing.
Stability of dynamic networks in control, in a nonadaptive context, has been rigorously addressed in recent publications (Suykens, De Moor & Vandewalle, 1995a,b; Suykens, Vandewalle & De Moor, 1997; Verrelst, Van Acker, Suykens, Motmans, De Moor & Vandewalle, 1997) using an extension of $H_\infty$ theory. While the use of $H_\infty$ techniques raises questions about the conservativeness of the stability results, this is certainly an important contribution.
Models of dynamic systems can be constructed using neural networks in different ways, for example by closing a feedback loop around a static MLP network. The modelling capabilities of such dynamic structures derive from the approximation properties, for static functions, of static MLP networks. The ability of Hopfield-type recurrent neural networks to approximate dynamic systems, in continuous and discrete time, has been shown respectively in Funahashi and Nakamura (1993) and Jin, Nikiforuk and Gupta (1995). It is our belief that recurrent neural networks of this form have the potential to provide useful models for adaptive control of many nonlinear systems. Recurrent networks of the Hopfield type have a relatively simple structure, and the hyperbolic tangent function $\tanh(\cdot)$ has advantageous properties such as smoothness, boundedness and monotonicity. Loosely speaking, it is a kind of "mild" nonlinearity. The usefulness of these features will become more apparent and more precisely formulated in the following sections, where they are exploited in the stability analysis.
A model of the controlled system based on a recurrent network can be interpreted as a state-space model, usually with a nonminimal state dimension. Use of this type of model for adaptive control of nonlinear systems seems quite natural. Firstly, there is no general equivalence, for nonlinear systems, between state-space and input-output models, while for most physical systems state-space models based on first principles are a natural way of describing dynamics. Secondly, even when an input-output model exists, like for example a SISO system

$y^{(d)} = F(y^{(d-1)}, \ldots, y) + G(y^{(d-1)}, \ldots, y)\,u,$

practical utilisation of such a model, when only the output $y$ but not its derivatives $y^{(d-1)}, \ldots, \dot y$ are measured, still requires the construction of a dynamical state-space model. If a dynamic neural model can be trained so that in some region its input-output behaviour is close to that of the unknown system, then we should be able to obtain some kind of equivalent information about the state of the unknown plant from the state of the neural network.
Thus, if a controller based on such a model can be constructed, it should be able to achieve good dynamic performance due to the information about the state of the actual plant, without the explicit use of a state and parameter observer.

In the method presented here, of which a detailed description is given in the following section, a linearizing feedback control law is computed analytically for the network and applied to the plant, while the parameters of the network are updated on-line. Such an approach can be classified as indirect adaptive control, as the parameters of the plant model are adapted and the control is computed based on the current model, rather than directly adapting the controller parameters. The reason for this choice is that, as opposed to linear systems, the question of the existence of direct dynamic controllers for nonlinear systems is a very difficult one, and the use of a heuristically chosen direct adaptive dynamic controller, as for example in Ku and Lee (1992), renders the overall scheme extremely difficult for analytical treatment.

A similar general motivation appears to underlie the recent schemes of Jin, Nikiforuk and Gupta (1993, 1994) and Delgado, Kambhapati and Warwick (1995), in which continuous or discrete-time recurrent networks, respectively, are used as a model of the unknown system and the control law utilises the state of the network.

The scheme presented here was first proposed in Kulawski and Brdyś (1994) and Brdyś, Kulawski and Quevedo (1996), and a convergence result for the case of a constant (or slowly varying) reference output was presented in Brdyś, Kulawski and Quevedo (1998) using the singular perturbation methodology. In this paper, the stability analysis is extended to the case of a general reference signal, employing a completely different approach. The sufficient conditions for stability are derived for exponentially stable nonlinear plants. In order to validate the limitations of the assumptions made during the stability analysis, among which the assumption about perfect parameterisation is the most severe one, a comprehensive simulation study is performed. The simulation examples are selected accordingly, starting from a simple academic one and ending with an induction motor case study.
2. Control algorithm
We seek to design a control law for a continuous-time nonlinear system, which is at this point formulated quite generally as

$\dot q = f(q, u), \quad y = h(q),$  (1)

where $q \in \mathbb{R}^s$ is a state vector, $u \in \mathbb{R}^m$ is an input vector, and $y \in \mathbb{R}^m$ is an output vector. The objective is to make the system outputs track a vector of specified reference trajectories $y_{\rm ref} \in \mathbb{R}^m$.

Fig. 1. Diagonal dynamic network.
Fig. 2. A single dynamic neuron.
A recurrent neural network is used as a dynamic model of the system (1), based on which the control law is synthesised. The neural model is defined by

$\dot{\hat x} = D\hat x + \hat A T(\hat x) + Bu, \quad \hat y = C\hat x,$  (2)

where $\hat x \in \mathbb{R}^n$ is a state vector, $u \in \mathbb{R}^m$ is an input vector, and $\hat y \in \mathbb{R}^m$ is an output vector. The nonlinear operator $T(\cdot)$ is defined as

$T(\hat x) = [\tanh(\hat x_1), \ldots, \tanh(\hat x_n)]^{\rm T},$

$D = {\rm diag}(d_1, \ldots, d_n)$ is an $n \times n$ diagonal matrix with negative entries, $d_i < 0$, $\hat A = {\rm diag}(\hat a_1, \ldots, \hat a_n)$ is an $n \times n$ diagonal matrix, $B \in \mathbb{R}^{n \times m}$ and $C \in \mathbb{R}^{m \times n}$. An example of such a network with two inputs and two outputs is shown in Fig. 1, and a single neuron in Fig. 2. The network can be regarded as a parsimonious version of the Hopfield-type network. It has a diagonal structure, that is, there is no interaction between the dynamics of different neurons (the absent interconnections are shown as dashed lines in both figures).

From now on, $\hat a$ will denote the vector of parameters containing the diagonal elements of $\hat A$:

$\hat a = [\hat a_1, \ldots, \hat a_n]^{\rm T}.$
The following assumption, guaranteeing exponential stability of the neural model (as shown in Lemma 3 in Brdyś et al., 1998), is required.

Assumption 1. $\hat a_i < -d_i$, $i = 1, \ldots, n$.
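As an illustration, the model (2) under Assumption 1 can be simulated with a simple Euler scheme. The sketch below uses plain NumPy with hypothetical two-neuron parameter values (not taken from the paper) to show the decoupled per-neuron dynamics and the decay of the zero-input state.

```python
import numpy as np

def simulate_network(d, a_hat, B, C, u_seq, x0, dt=1e-3):
    """Euler-integrate the diagonal recurrent model (2):
    x_hat' = D x_hat + A_hat T(x_hat) + B u,  y_hat = C x_hat."""
    x = np.array(x0, dtype=float)
    ys = []
    for u in u_seq:
        # Diagonal D and A_hat: each neuron evolves independently.
        x = x + dt * (d * x + a_hat * np.tanh(x) + B @ np.atleast_1d(u))
        ys.append(C @ x)
    return np.array(ys)

# Hypothetical SISO network with n = 2 neurons satisfying
# Assumption 1 (a_hat_i < -d_i), hence exponentially stable.
d = np.array([-2.0, -3.0])
a_hat = np.array([1.5, 2.0])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
y = simulate_network(d, a_hat, B, C, np.zeros(10000), x0=[1.0, -1.0])
```

Under Assumption 1 the zero-input state decays to the origin, so the output settles near zero after a few seconds of simulated time.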
2.1. Control synthesis

The evolution of the network output can be expressed as

$\dot{\hat y} = C\dot{\hat x} = CD\hat x + C\hat A T(\hat x) + CBu.$  (3)

An assumption is made:

Assumption 2. $CB$ is invertible.

The above assumption implies that the relative degree of the neural model is equal to one. The feedback linearizing control can be calculated as

$u = (CB)^{-1}(-CD\hat x - C\hat A T(\hat x) + v),$  (4)

which, applied to the network, renders it decoupled and linear with respect to the new input vector $v$:

$\dot{\hat y} = v.$  (5)

The auxiliary control input $v$ is designed as a simple linear pole placement

$v = \dot y_{\rm ref} - a(\hat y - y_{\rm ref}),$  (6)

where $a$ is a positive constant, $a > 0$. The control input, as defined by (4) and (6), is applied to both the plant and the neural model. The error between the output vector of the model and the reference output vector, $\hat e = \hat y - y_{\rm ref}$, obeys

$\dot{\hat e} + a\hat e = 0$

and converges exponentially to the origin with rate $a$. Thus, we can assume that, after some initial transient, $\hat e$ converges to zero, and so the plant tracking error $e_t = y_{\rm ref} - y$ is equivalent to the modelling error $e_m = \hat y - y$:

$e_t = y_{\rm ref} - y = \hat y - y = e_m.$  (7)

From now onwards, for simplicity of exposition, a single-input single-output plant is considered.
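The control computation in Eqs. (4) and (6) is purely algebraic given the model state. A minimal NumPy sketch follows, with `alpha` playing the role of the constant $a$ in (6) and all parameter values illustrative:

```python
import numpy as np

def linearizing_control(x_hat, d, a_hat, B, C, y_ref, y_ref_dot, alpha):
    """Feedback linearizing input, Eqs. (4) and (6):
    u = (CB)^(-1) (-C D x_hat - C A_hat T(x_hat) + v),
    v = y_ref' - alpha (y_hat - y_ref)."""
    CB = C @ B                                   # invertible by Assumption 2
    v = y_ref_dot - alpha * (C @ x_hat - y_ref)  # pole placement, Eq. (6)
    rhs = -C @ (d * x_hat) - C @ (a_hat * np.tanh(x_hat)) + v
    return np.linalg.solve(CB, rhs)              # Eq. (4)
```

Applied to the model itself, this input yields exactly $\dot{\hat y} = v$, which is easy to verify by substituting $u$ back into Eq. (3).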
2.2. On-line parameter adaptation

On-line updates of the parameter vector $\hat a$ are performed at discrete-time instants, with a period $T$, in the direction opposite to the gradient of the following error criterion:

$E = \frac{1}{2}\int_{kT}^{kT+T} (\hat y - y)^2 \, {\rm d}\tau,$  (8)

$\hat a_i(kT+T) = \hat a_i(kT) - \lambda \frac{\partial E}{\partial \hat a_i},$  (9)

where $\lambda > 0$ is the learning rate. The discrete-time instants when updates take place are indexed by the positive integer $k$. Calculating the error function derivative as

$\frac{\partial E}{\partial \hat a_i} = \int_{kT}^{(k+1)T} (\hat y - y)\frac{\partial \hat y}{\partial \hat a_i}\, {\rm d}\tau,$  (10)

this derivative, $\partial E/\partial \hat a_i$, is obtained by integrating over the time $T$ the following two differential equations:

$\frac{{\rm d}}{{\rm d}t}\frac{\partial E}{\partial \hat a_i} = (\hat y - y)\frac{\partial \hat y}{\partial \hat a_i},$  (11)

$\frac{\partial \hat y}{\partial \hat a_i} = c_i \frac{\partial \hat x_i}{\partial \hat a_i},$  (12)

$\frac{{\rm d}}{{\rm d}t}\frac{\partial \hat x_i}{\partial \hat a_i} = (d_i + \hat a_i \tanh'(\hat x_i))\frac{\partial \hat x_i}{\partial \hat a_i} + \tanh(\hat x_i),$  (13)

starting with zero initial conditions:

$\frac{\partial E}{\partial \hat a_i}(kT) = 0, \quad \frac{\partial \hat x_i}{\partial \hat a_i}(kT) = 0.$

Once the parameter updates are done, the initial conditions in Eqs. (11)-(13) are reset to zero and the cycle is repeated. Differential equations (11)-(13) need to be integrated in real time. Eq. (13) is obtained by applying the partial derivative $\partial/\partial \hat a_i$ to both sides of the differential state equation (2) (under standard smoothness conditions we can assume that ${\rm d}/{\rm d}t$ and $\partial/\partial \hat a_i$ commute). Because of the diagonal structure of the network state equation, we have $\partial \hat x_j/\partial \hat a_i = 0$ for $j \neq i$. Eq. (13) describes the so-called sensitivity model, as in e.g. Narendra and Parthasarathy (1991). Its derivation relies on the assumption that $\hat a_i$ is constant. As an alternative to Eq. (9), parameter updates could be performed continuously with a very small learning rate, so that this assumption would be "almost" true (see Narendra & Parthasarathy, 1991). We chose the periodic update method as it is conceptually clearer and makes the overall scheme more tractable analytically. In the adaptive literature, such schemes are referred to as hybrid adaptive control. It is pointed out in Narendra and Annaswamy (1989) that they exhibit better robustness with respect to disturbances compared with continuous parameter adaptation. Although the results in Narendra and Annaswamy (1989) refer to the control of linear systems, similar properties can be expected here, as the additional nonlinearity of the tracking error due to nonlinear plant dynamics should not change the underlying argument.
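One hybrid update cycle, Eqs. (8)-(13), can be sketched in plain NumPy with Euler integration. The plant output is supplied as a function `y_fn`; all parameter values are illustrative, not from the paper, and $\tanh'(x) = 1/\cosh^2(x)$ is used for the sensitivity dynamics.

```python
import numpy as np

def hybrid_update(x_hat0, a_hat, d, B, C, u_fn, y_fn, T, lam, dt=1e-3):
    """Integrate the model (2) together with the sensitivity model (13)
    over one period T, accumulate dE/da_hat_i via Eqs. (10)-(12),
    and return the updated parameters, Eq. (9)."""
    x = np.array(x_hat0, dtype=float)
    s = np.zeros_like(x)        # s_i = d x_hat_i / d a_hat_i, zero at kT
    grad = np.zeros_like(x)     # accumulates dE/da_hat_i
    c = C[0]                    # SISO output row: y_hat = c . x_hat
    for k in range(int(round(T / dt))):
        t = k * dt
        e_m = c @ x - y_fn(t)                        # modelling error y_hat - y
        grad += dt * e_m * c * s                     # Eqs. (10)-(12)
        s_dot = (d + a_hat / np.cosh(x) ** 2) * s + np.tanh(x)  # Eq. (13)
        x_dot = d * x + a_hat * np.tanh(x) + B @ np.atleast_1d(u_fn(t))
        x, s = x + dt * x_dot, s + dt * s_dot
    return a_hat - lam * grad                        # Eq. (9)
```

In a full controller this would run concurrently with the control law, with the returned parameters projected (Eq. (14)) before the next period starts.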
The updating process has to be constrained in such a way that Assumption 1 is satisfied at any point in time. The "raw" values of the weights $\hat a_i$ resulting from Eq. (9) are projected onto the nearest point of the set

$P_i \triangleq \{\hat a_i : \hat a_i \le -d_i - \varepsilon\},$  (14)
Fig. 3. Adaptive control using recurrent network.
where $\varepsilon$ is a small positive number used to construct a closed set, which is needed for a well-defined projection.
A diagram of the overall control scheme is shown in
Fig. 3.
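Since the set in (14) is a product of half-lines, the projection reduces to a componentwise clipping; a short sketch (NumPy, with an illustrative $\varepsilon$):

```python
import numpy as np

def project(a_raw, d, eps=0.05):
    """Project raw weights onto P_i = {a_i : a_i <= -d_i - eps}, Eq. (14),
    so that Assumption 1 holds after every update."""
    return np.minimum(np.asarray(a_raw, dtype=float),
                      -np.asarray(d, dtype=float) - eps)
```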
3. Stability
To proceed with the convergence analysis it is assumed that there exists a choice of network parameters $\hat a = a$ such that the recurrent neural network is capable of exactly modelling the plant. In other words, the dynamical system

$\dot x = Dx + AT(x) + Bu, \quad y = Cx$  (15)

is a realisation of the same input-output mapping $u(t) \to y(t)$ as Eq. (1). System (15) has a state vector of different size than Eq. (1), always larger, even if $D$ and $A$ were full matrices (see the results in Funahashi and Nakamura, 1993). Hence, even if $\hat a = a$, the vector $x$ is not the same as $q$. The increase in the state dimension is the price to be paid for the structure of Eq. (15). This can be seen as an analogy with nonminimal parameterisations of linear dynamic systems for the purpose of adaptive control, as mentioned in the Introduction. We require that the plant (15) satisfies Assumption 1, implying exponential stability of the original plant (1). The input-output mapping realised by Eqs. (1) and (15) is certainly parameterised by the initial conditions of their respective state vectors. However, since in this analysis the plant is assumed to be stable, the influence of the initial conditions should die out, and Eq. (15) will be considered a valid description of the controlled plant. From now on, the vector $x$ will denote the state of the plant.

The ability of Hopfield-type networks to approximate dynamic systems has already been mentioned. The simpler network structure is chosen here to allow analytical tractability and insight into the convergence issues. As a trade-off, it may not have the full approximation capabilities of the Hopfield network, although this could probably be compensated to some extent by a further increase of the state vector dimension. The simulations reported here seem to justify this conjecture. In any case, whether with the diagonal network (2) or a fully interconnected one, some residual structural error will remain. By the structural error it is meant that there is in fact no ideal set of parameters of Eq. (15) which would give a perfect model of Eq. (1). Although the assumption about perfect modelling capabilities made here is strong, it is not completely unreasonable. For example, many results in adaptive control assume a linear plant, although a perfectly linear plant is hardly ever the case. The stability analysis which follows provides understanding of the general mechanisms governing the behaviour of such neural adaptive systems and explains the interactions between plant and model dynamics on the one hand and the parameter adaptation on the other. Incorporation of the structural error into the analysis certainly needs to be the next step in this development. To enable the controller to handle significant structural errors, modifications of the adaptation laws will most likely be necessary.

The following assumptions are required:

Assumption 3. The matrix $M = D - (1/CB)BCD - (a/CB)BC$ is Hurwitz (i.e. all its eigenvalues have negative real parts).
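Assumption 3 can be checked numerically for a given model. Below is a sketch for the SISO case ($CB$ a nonzero scalar), reusing the same hypothetical two-neuron parameters as before; the values are not from the paper.

```python
import numpy as np

def closed_loop_matrix(d, B, C, alpha):
    """Build M = D - (1/CB) B C D - (alpha/CB) B C from Assumption 3
    (SISO case: CB is a scalar)."""
    D = np.diag(d)
    CB = (C @ B).item()
    return D - (B @ C @ D) / CB - alpha * (B @ C) / CB

d = np.array([-2.0, -3.0])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
M = closed_loop_matrix(d, B, C, alpha=2.0)
is_hurwitz = bool(np.all(np.linalg.eigvals(M).real < 0))
```

For these particular values $M$ turns out diagonal with eigenvalues $-2$ and $-3$, so Assumption 3 holds.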
Assumption 4. The reference output signal $y_{\rm ref}$ and its derivative $\dot y_{\rm ref}$ are bounded, i.e. there exist positive constants $d_1, d_2$ such that $|\dot y_{\rm ref}| < d_1$, $|y_{\rm ref}| < d_2$.

Although Assumption 4 prevents theoretical step inputs from being considered, it does not pose significant restrictions in practice, where pre-filtering of sharp changes of the reference inputs is a common approach.

In addition to Assumptions 1-4, an extra Assumption 5 is needed, and two conditions expressed by inequalities (27) and (29) must hold. These will be discussed later in the proof, where it can be done in a much more natural way. The main result of the paper can now be stated as follows.
Theorem. If the initial parameter mismatch $\|\hat a - a\|$ is small enough and $T$ is long enough, control (4) achieves ultimate boundedness of all signals in both the plant (15) and the model (2). There exists a positive number $\bar\lambda$ such that for $0 < \lambda \le \bar\lambda$ the updating scheme (8)-(10) results in the bound on the tracking error $e_t = y_{\rm ref} - y$ decreasing monotonically.
Proof. Part 1. Ultimate boundedness of $\|\hat x\|$, $|u|$, $\|x\|$: Because the control input (4) utilises the state of the network, the network dynamics in the closed loop are modified to

$\dot{\hat x} = D\hat x + \hat A T(\hat x) + B\frac{1}{CB}(-CD\hat x - C\hat A T(\hat x) + \dot y_{\rm ref} - aC\hat x + ay_{\rm ref})$
$= \left(D - \frac{1}{CB}BCD - \frac{a}{CB}BC\right)\hat x + \left(I - \frac{1}{CB}BC\right)\hat A T(\hat x) + \frac{1}{CB}B\dot y_{\rm ref} + \frac{a}{CB}By_{\rm ref}.$  (16)

The closed-loop neural model (16) is driven only by $y_{\rm ref}$ and $\dot y_{\rm ref}$, which are bounded. There is no direct feedback from the state of the plant into Eq. (16). A feedback from the plant output into Eq. (16) is applied indirectly via the parameter updates at discrete moments of time. Therefore, we can first show the ultimate boundedness of $\|\hat x\|$ in Eq. (16), and then proceed with the boundedness of $\|x\|$.

We denote

$M = D - \frac{1}{CB}BCD - \frac{a}{CB}BC, \quad N = I - \frac{1}{CB}BC.$

Consider the Lyapunov function for Eq. (16):

$\hat V = \hat x^{\rm T}P\hat x,$

where $P$ is a symmetric positive-definite matrix satisfying the Lyapunov equation

$M^{\rm T}P + PM = -I.$

The existence of such a matrix is assured by Assumption 3. Taking the time derivative of the Lyapunov function gives

$\dot{\hat V} = \dot{\hat x}^{\rm T}P\hat x + \hat x^{\rm T}P\dot{\hat x}$
$= \hat x^{\rm T}(M^{\rm T}P + PM)\hat x + 2\hat x^{\rm T}PN\hat A T(\hat x) + 2\hat x^{\rm T}PB\frac{1}{CB}\dot y_{\rm ref} + 2\hat x^{\rm T}PB\frac{a}{CB}y_{\rm ref}$
$= -\|\hat x\|^2 + 2\hat x^{\rm T}P\left(N\hat A T(\hat x) + B\frac{1}{CB}\dot y_{\rm ref} + B\frac{a}{CB}y_{\rm ref}\right).$

We have

$2\hat x^{\rm T}P\left(N\hat A T(\hat x) + B\frac{1}{CB}\dot y_{\rm ref} + B\frac{a}{CB}y_{\rm ref}\right) \le 2\|P\hat x\|\left\|N\hat A T(\hat x) + B\frac{1}{CB}\dot y_{\rm ref} + B\frac{a}{CB}y_{\rm ref}\right\|$
$\le 2\|P\|\,\|\hat x\|\left(\|N\hat A T(\hat x)\| + \frac{1}{CB}\|B\|d_1 + \frac{a}{CB}\|B\|d_2\right) \le \bar m\|\hat x\|.$

This gives

$\dot{\hat V} \le -\|\hat x\|^2 + \bar m\|\hat x\|,$  (17)

where

$\bar m = 2\|P\|\left(\|N\|\,\|\hat a\| + \frac{1}{CB}\|B\|d_1 + \frac{a}{CB}\|B\|d_2\right).$  (18)

The above expression for $\bar m$ is obtained by noticing that

$\hat A T(\hat x) = [\hat a_1\tanh(\hat x_1), \ldots, \hat a_n\tanh(\hat x_n)]^{\rm T}$

and, due to the fact that $|\tanh(x)| < 1$,

$\|\hat A T(\hat x)\| \le \|\hat a\|.$

Furthermore,

$\|N\hat A T(\hat x)\| \le \|N\|\,\|\hat A T(\hat x)\| \le \|N\|\,\|\hat a\|.$

If we define $m = \max_{\|\hat a\|}\bar m$, then Eq. (17) holds with the same $m$ for all $t$. Updating of $\hat a$, as will be described later, ensures that $\|\hat a\|$ remains bounded. Since all other quantities appearing in Eq. (18) are bounded, $m$ is a bounded positive number. Therefore,

$\dot{\hat V} < 0 \quad \forall \|\hat x\| > m.$

Furthermore, the relation

$-\|\hat x\|^2 + m\|\hat x\| \le -k_1\|\hat x\|^2,$

where $0 < k_1 < 1$, is satisfied for

$\|\hat x\| \ge \frac{m}{1 - k_1}.$

Therefore, outside any ball with radius greater than or equal to $m_1 = m/(1 - k_1)$, the right-hand side of Eq. (17) can be bounded by

$\dot{\hat V} \le -k_1\|\hat x\|^2 \quad \forall \|\hat x\| \ge m_1.$

From the above and the fact that the Lyapunov function $\hat V$ satisfies

$\lambda_{\min}(P)\|\hat x\|^2 \le \hat V(\hat x) \le \lambda_{\max}(P)\|\hat x\|^2,$

one can conclude, based on Theorem 4.10 in Khalil (1992), that there exist a finite time $t_1$ and positive constants $k_2, c_2 > 0$ such that

$\|\hat x(t)\| \le k_2\|\hat x(t_0)\|\exp(-c_2(t - t_0)) \quad \forall t_0 \le t < t_1,$  (19)

$\|\hat x(t)\| \le \rho_{\hat x} = \sqrt{\frac{\lambda_{\max}(P)}{\lambda_{\min}(P)}}\, m_1 \quad \forall t \ge t_1.$  (20)

In other words, the norm of $\hat x$ decreases exponentially and in finite time $t_1$ enters a ball of radius $\rho_{\hat x}$. Naturally, boundedness of $\hat x$ implies boundedness of $\hat y$.

Assumption 4, relations (19), (20) and the boundedness of $\|\hat a\|$ imply that the input to both plant and model, as given by Eq. (4), also converges in finite time to bounded values. There exists a positive constant $\rho_u > 0$ such that

$|u| \le \rho_u \quad \forall t \ge t_1.$

We proceed to show that $x$ is bounded as well. A Lyapunov function for the plant (15) is chosen as

$V = \tfrac{1}{2}x^{\rm T}x.$

Its time derivative is given by

$\dot V = x^{\rm T}Dx + x^{\rm T}AT(x) + x^{\rm T}Bu.$

Since $u$ is bounded for $t \ge t_1$, $T(x)$ contains bounded elements and the norms of $A$ and $B$ are bounded, it holds that

$x^{\rm T}(AT(x) + Bu) \le f\|x\| \quad \forall t \ge t_1,$

where $f$ is a bounded positive constant. Thus, with $D$ being a diagonal matrix, we have

$\dot V \le -\min_i|d_i|\,\|x\|^2 + f\|x\| \quad \forall t \ge t_1,$

and, following a similar line of thought, it can be concluded that the Lyapunov derivative satisfies

$\dot V \le -k_3\|x\|^2 \quad \forall \|x\| \ge \frac{f}{\min_i|d_i| - k_3}, \quad \forall t \ge t_1,$

where $k_3$ is a positive constant satisfying $0 < k_3 < \min_i|d_i|$. Thus, the assumptions of Theorem 4.10 in Khalil (1992) are satisfied for $t \ge t_1$. It can then be concluded, similarly as in the case of the convergence of the norm of $\hat x$, that the norm of $x$ decreases exponentially and in finite time enters a ball of radius $\rho_x$:

$\|x\| \le \rho_x \quad \forall t \ge t_2,$

where $\rho_x$ is some positive constant and $t_2 \ge t_1$. This in turn results in bounded $y$ for $t \ge t_2 \ge t_1$.

Thus, ultimate boundedness of all signals in the plant and neural model has been established; that is, $x$, $\hat x$, $y$, $\hat y$ and $u$ are bounded for $t \ge t_2$.
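The matrix $P$ used above solves the Lyapunov equation $M^{\rm T}P + PM = -I$. A small numerical sketch follows (NumPy, via the standard Kronecker-product vectorisation; the construction is generic, not specific to the paper):

```python
import numpy as np

def lyapunov_P(M):
    """Solve M^T P + P M = -I by vectorisation:
    (I kron M^T + M^T kron I) vec(P) = vec(-I)."""
    n = M.shape[0]
    K = np.kron(np.eye(n), M.T) + np.kron(M.T, np.eye(n))
    p = np.linalg.solve(K, -np.eye(n).reshape(-1))
    return p.reshape(n, n)

# For a Hurwitz M the solution P is symmetric positive definite,
# so V_hat = x_hat^T P x_hat is a valid Lyapunov function.
M = np.array([[-2.0, 0.0], [0.0, -3.0]])
P = lyapunov_P(M)
```

For this diagonal example $P = {\rm diag}(1/4,\, 1/6)$, which can be confirmed by substituting back into the equation.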
Part 2. Convergence of the tracking error: We proceed to show how the parameter updates result in a decrease of the tracking error.

Control input (4) makes the output of the neural model converge quickly to the reference trajectory (see relation (7)). Thus, after the fast transient when the algorithm is initialised, the tracking error is directly related to the state error:

$e_t = y_{\rm ref} - y = \hat y - y = C(\hat x - x) = C\tilde x.$

Therefore, firstly, the Lyapunov function

$\tilde V = \tfrac{1}{2}\tilde x^{\rm T}\tilde x$

is used to show that the state error norm $\|\tilde x\|$ converges to a ball whose size is determined by the parameter error norm $\|\hat a - a\| = \|\tilde a\|$.

Denoting $\tilde x_i = \hat x_i - x_i$ and $\tilde a_i = \hat a_i - a_i$, from Eqs. (2) and (15) we have

$\dot{\tilde x}_i = d_i\hat x_i - d_i x_i + \hat a_i\tanh(\hat x_i) - a_i\tanh(x_i) = d_i\tilde x_i + \hat a_i\tanh(\hat x_i) - a_i\tanh(x_i).$

Adding and subtracting $\hat a_i\tanh(x_i)$ on the right-hand side of the above equation, we obtain

$\dot{\tilde x}_i = d_i\tilde x_i + \hat a_i\tanh(\hat x_i) - \hat a_i\tanh(x_i) + \hat a_i\tanh(x_i) - a_i\tanh(x_i)$
$= d_i\tilde x_i + \hat a_i(\tanh(\hat x_i) - \tanh(x_i)) + \tilde a_i\tanh(x_i).$

The Lyapunov derivative is given by

$\dot{\tilde V} = \tilde x^{\rm T}\dot{\tilde x} = \sum_{i=1}^n \left(d_i\tilde x_i^2 + \hat a_i(\tanh(\hat x_i) - \tanh(x_i))\tilde x_i + \tilde a_i\tanh(x_i)\tilde x_i\right)$
$\le \sum_{i=1}^n \left(d_i|\tilde x_i||\tilde x_i| + \hat a_i|\tanh(\hat x_i) - \tanh(x_i)|\,|\tilde x_i| + |\tilde a_i||\tanh(x_i)|\,|\tilde x_i|\right)$
$= \sum_{i=1}^n |d_i||\tilde x_i|\left(-|\tilde x_i| + \frac{\hat a_i}{|d_i|}|\tanh(\hat x_i) - \tanh(x_i)| + \frac{|\tilde a_i|}{|d_i|}|\tanh(x_i)|\right).$

The following relations hold. Firstly, due to the projection algorithm (14), the following is satisfied uniformly in time:

$\hat a_i \le -d_i - \varepsilon, \quad \frac{\hat a_i}{-d_i} \le 1 - \frac{\varepsilon}{-d_i}, \quad \frac{\hat a_i}{|d_i|} \le 1 - \frac{\varepsilon}{|d_i|}.$

Defining $l_i = 1 - \varepsilon/|d_i|$ if $1 - \varepsilon/|d_i| > 0$ and $l_i = 0$ otherwise, we obtain

$\frac{\hat a_i}{|d_i|} \le l_i < 1, \quad i = 1, \ldots, n.$

Secondly, the function $\tanh(\cdot)$ satisfies

$|\tanh(x_i)| < 1, \quad |\tanh(\hat x_i) - \tanh(x_i)| \le |\hat x_i - x_i| = |\tilde x_i|, \quad i = 1, \ldots, n.$

Using the above, one obtains

$\dot{\tilde V} \le \sum_{i=1}^n |d_i||\tilde x_i|\left(-|\tilde x_i| + l_i|\tilde x_i| + \frac{|\tilde a_i|}{|d_i|}|\tanh(x_i)|\right)$
$\le \sum_{i=1}^n |d_i||\tilde x_i|\left(-(1 - l_i)|\tilde x_i| + \frac{|\tilde a_i|}{|d_i|}\right)$
$\le -\sum_{i=1}^n (1 - l_i)|d_i||\tilde x_i|^2 + \sum_{i=1}^n |\tilde a_i||\tilde x_i|.$

Finally, we obtain

$\dot{\tilde V} \le -\left(1 - \max_i l_i\right)\min_i|d_i|\,\|\tilde x\|^2 + \|\tilde a\|\,\|\tilde x\|.$  (21)

For a constant $\hat a(kT)$, i.e. between consecutive parameter updates, we can apply reasoning similar to that in the first part of the proof to show the following. Outside the ball

$\left\{\tilde x : \|\tilde x\| < m_2 = \frac{\|\tilde a(kT)\|}{(1 - \max_i l_i)\min_i|d_i| - k_4}\right\},$  (22)

the Lyapunov derivative (21) can be bounded by

$\dot{\tilde V} \le -k_4\|\tilde x\|^2 \quad \forall \|\tilde x\| \ge m_2(\hat a(kT)),$  (23)

where $k_4$ is a positive constant satisfying $0 < k_4 < (1 - \max_i l_i)\min_i|d_i|$. Using again Theorem 4.10 from Khalil (1992), it can be concluded that

$\|\tilde x(t)\| \le \|\tilde x(kT)\|\exp(-c_4 t) \quad \forall kT \le t < kT + \tau_k,$  (24)

$\|\tilde x(t)\| \le \rho_{\tilde x}(k) = m_2(k) \quad \forall kT + \tau_k \le t \le (k+1)T,$  (25)

where $c_4$ is a positive constant. When $\hat a(k)$ is constant, the state error norm enters in finite time $\tau_k$ a ball whose size $\rho_{\tilde x}(k)$ is determined by the magnitude of the parameter error $\|\tilde a\|$. The constant $c_4$ is uniform in $k$, due to the fact that the Lyapunov function $\tilde V$ does not depend on time and the bound on its derivative (23) is uniform. Therefore, if during updates the changes in $\|\tilde a(k)\|$ are uniformly bounded, and thus the changes in $\rho_{\tilde x}(k)$ are uniformly bounded, then $\tau_k$ is uniformly bounded. In other words, there exists a time $\bar\tau$ such that

$\tau_k \le \bar\tau \quad \forall k > 0.$

We need to choose the update period $T > \bar\tau$, that is, longer than the time needed by the state error to enter the ball $B_{\rho_{\tilde x}(k)}$. As a consequence, after each update, the state error enters the corresponding ball before the next update is done. Thus, from Eqs. (25) and (22) it can be concluded that, at the start of a new integration period, we have

$\|\tilde x(kT)\| \le \rho_1\|\tilde a((k-1)T)\|,$  (26)

where $\rho_1 = 1/((1 - \max_i l_i)\min_i|d_i| - k_4)$.

It remains to be shown that the parameter updates result in a decrease of the parameter error magnitude $\|\tilde a\|$. This is done by showing that, on the system trajectories at the discrete-time instants $\hat a(kT)$, $\hat x(kT)$, the tracking error criterion $E_k$ as a function of $\hat a(kT)$ and $\hat x(kT)$ satisfies conditions (A.1)-(A.5) necessary for Lemma 1 to hold (see Appendix A).

We can treat both the output of the plant and the output of the neural model as being generated by the same operator $Y(\cdot)$, whose arguments are the initial conditions of the state vectors, the parameter sets and the input over a time interval:

$y(t) = Y(x(kT), a, u_{[kT,\,t]}),$
$\hat y(t) = Y(\hat x(kT), \hat a(kT), u_{[kT,\,t]}) \quad \text{for } kT \le t \le (k+1)T.$

The tracking error criterion for each integration period,

$E_k = \frac{1}{2}\int_{kT}^{(k+1)T}(\hat y(\tau) - y(\tau))^2\,{\rm d}\tau,$

is a function of $\hat a(k)$ and $\hat x(k)$, and it has a global minimum $E_k \equiv 0$ at $\hat a(k) = a$, $\hat x(k) = x(k)$. Let us assume that for nonzero input $E_k(\hat a(k), \hat x(k)) > E_k(a, x(k))$ for $[\hat a(k)^{\rm T}, \hat x(k)^{\rm T}] \ne [a^{\rm T}, x(k)^{\rm T}]$. Since $\tanh(\cdot)$ is a smooth function, the function on the right-hand side of the state equation of the neural model (2) is smooth with respect to both $\hat x$ and $\hat a$. As a consequence, the solutions of the state equations are smooth with respect to the initial conditions $\hat x(k)$ and the parameters $\hat a(k)$. In other words, $Y(\cdot)$ is a smooth operator. Thus, $E_k$ is a smooth function of $\hat a(k)$ and $\hat x(k)$. As such, it satisfies Eq. (A.2) and, in a finite ball around the minimum, also Eq. (A.4).

Assumption 5. We have to assume that the Hessian of the error function is positive definite in a neighbourhood of the minimum (but not necessarily at the optimal point itself, where the updates are effectively not performed).

For this to hold it is sufficient, but not necessary, that the error function $E_k$ satisfies the sufficient conditions for the existence of a local minimum at $[a^{\rm T}, x(k)^{\rm T}]$. We also assume that on the system trajectories the following is satisfied:

$\left\|\frac{\partial E(\hat a(k), \hat x(k))}{\partial \hat a}\right\| \le c\|\tilde a(k)\| \quad \forall k,$  (27)
where c is a positive constant. It can then be shown (see Appendix B) that conditions (A.3) and (A.5) are satisfied if the update step λ satisfies

λ ≤ λ** = (1 − ρ₁ max[(1 + ηλ_min/λ_max)‖∂²E(a, x̄)/∂â∂x̂‖₂/(λ_min(1 − η))])/c  (28)

and if the following holds:

(1 + ηλ_min/λ_max)‖∂²E(a, x̄)/∂â∂x̂‖₂/(λ_min(1 − η)) < 1,  (29)

where λ_min and λ_max are the smallest and greatest eigenvalues of the matrix of second derivatives of E with respect to â. The maximum in Eq. (28) is taken over the ball in which the system trajectories are contained. The bounded initial parameter mismatch, the results of the first part of this proof and the following analysis show that the system trajectories are indeed restricted to a ball.

It is very difficult to specify precise conditions under which relation (29) is always satisfied. However, certain mechanisms of the state error and parameter error convergence appear quite clearly. The initial conditions error x̃(k) acts as a kind of disturbance in the updating process, since it perturbs the minimum of E_k from the point of zero parameter error â(k) = a. This disturbance x̃(k) has to be small enough relative to the magnitude of the parameter error ã(kT), so that updates in the direction of the negative gradient ∂E/∂â can improve the parameter error. The shape of the error criterion function can be influenced by varying the length of the integration period T. It appears that the fraction on the left-hand side of relation (29) can be reduced by increasing T. Due to the stability of both the plant and the neural model, the influence of the initial conditions on the outputs of plant and model, and thus on the tracking error, decreases with time. Thus, increasing the integration period T reduces the influence of the initial conditions mismatch x̃(k) on the error criterion relative to the influence of the parameter mismatch ã(kT). We can therefore expect that the ratio

ρ₁ max ‖∂²E(a, x̄)/∂â∂x̂‖₂ / λ_min(∂²E(ā, x̂)/∂â²)

is smaller for bigger T.

With all the conditions (A.2)-(A.5) satisfied, Lemma A.1 can be invoked to show that there exists a choice of learning rate 0 < λ < λ* such that the parameter error decreases in the consecutive steps. To preserve condition (28), the maximum value of the learning rate needs to be chosen as λ̄ = min{λ*, λ**}. Thus for 0 < λ < λ̄ it holds that

‖ã(k + 1)‖² < ‖ã(k)‖²  ∀ k > 0.

Due to Eq. (25), the above relation also implies a decrease of the tracking error bound. ∎

The analysis presented in the proof of the theorem highlights the mechanisms governing stability and tracking error convergence in this neural adaptive control system. Firstly, with a bounded reference input and with constant values of the parameters, the tracking error remains bounded. Then, a decrease of the tracking error is achieved in the updating process via a decrease of the distance between the network parameters and the 'true' parameters of the plant, that is, the parameter values which result in a perfect modelling of the plant by the recurrent network. Consequently, two types of dynamics can be distinguished in this system: the fast continuous-time dynamics of the plant and neural controller, and the slow discrete-time dynamics of learning. The learning dynamics bind together the evolution of the parameter error and the state error from one update step to another. The speed of the plant and neural model dynamics determines how fast the learning process can progress. This dependence is quantified by the length of the update period T, which must be long enough for relation (26) to hold, so that the gradient used gives a direction of improvement of the parameter error. The result of the theorem does not offer prescriptive methods for choosing T and λ; in practice, as in the reported simulations, they have to be chosen experimentally. It does, however, provide insight into the convergence mechanisms, which helps to guide this process.
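The updating mechanism analysed above (a gradient step on the output-error criterion integrated over one update period) can be sketched numerically. The scalar model x̂' = d·x̂ + â·tanh(x̂) + u, the parameter values, the period length and the finite-difference gradient used below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

# Illustrative scalar neural model x' = d*x + a*tanh(x) + u; the "plant" is the
# same structure with the true parameter a_true, so E_k has its minimum there.
d, a_true = -2.0, -1.5

def simulate(a, u, x0=0.0, dt=1e-3, T=0.5):
    # Euler integration over one update period T; output y = x sampled every dt
    n = int(T / dt)
    x, ys = x0, np.empty(n)
    for i in range(n):
        x += dt * (d * x + a * np.tanh(x) + u(i * dt))
        ys[i] = x
    return ys

def E_k(a_hat, u, dt=1e-3):
    # Tracking-error criterion: 0.5 * integral of (y_hat - y)^2 over the period
    return 0.5 * np.sum((simulate(a_hat, u) - simulate(a_true, u)) ** 2) * dt

u = lambda t: np.sin(2 * t) + 0.5          # persistent excitation
a_hat, lam, eps = -0.5, 300.0, 1e-5
errs = []
for k in range(30):
    # finite-difference estimate of dE_k/da_hat, then the gradient update step
    g = (E_k(a_hat + eps, u) - E_k(a_hat - eps, u)) / (2 * eps)
    a_hat -= lam * g
    errs.append(abs(a_hat - a_true))
print(errs[0], errs[-1])
```

With no initial-state mismatch between plant and model, the criterion's minimum sits exactly at the true parameter, so the parameter error shrinks across update steps, mirroring the convergence argument of the theorem.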
4. Simulation examples
Example 1. This is a rather academic example in which the controlled plant is chosen as a recurrent network of the structure (15). Thus, the assumption that the recurrent network is able to model the plant perfectly for a certain set of network parameters is satisfied, which allows the theoretical results presented earlier to be verified. Both the object-network and the controller-network have three states. There is initially a parameter mismatch between the âᵢ's in the controller-network and the actual values of the aᵢ's. Only the âᵢ's are updated, using the procedure described in Section 2, while the rest of the parameters, i.e. the elements of D, B, C, are assumed to be known. The update period is T = 0.5; the learning rate, initially λ = 1, is later increased to λ = 200. Convergence of the plant output to the reference output is shown in Fig. 4. Fig. 5 illustrates that learning results in both the model parameters and the model states converging to the actual plant parameters and states, respectively. Simulations with all the parameters being updated showed that tracking error convergence does not necessarily imply that the parameter and state errors converge. This appears to depend on the character of the reference output, and bears resemblance to the problem of persistency of excitation known in adaptive control.

Fig. 4. Control of a neural network. Reference output (dotted line) and plant output (solid line).

Fig. 5. Control of a neural network. (a) parameter error convergence: ã₁ (solid), ã₂ (dashed), ã₃ (dot-dash); (b) state error convergence: x̃₁ (solid), x̃₂ (dashed), x̃₃ (dot-dash).
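A recurrent network of the kind used as the plant in Example 1, assumed here to be of the form ẋ = Dx + A tanh(x) + Bu, y = Cx (model (30) without the current terms), can be simulated in a few lines. The weights below are arbitrary illustrative values; a negative diagonal D matches the stability assumption of the scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3                          # three states, as in Example 1

# Arbitrary illustrative weights; a negative diagonal D keeps the model stable
D = np.diag(rng.uniform(-2.0, -0.5, n))
A = rng.uniform(-1.0, 0.0, (n, n))
B = rng.uniform(0.0, 1.0, (n, 1))
C = rng.uniform(0.0, 1.0, (1, n))

def step(x, u, dt=1e-3):
    # One Euler step of the recurrent model x' = D x + A tanh(x) + B u
    return x + dt * (D @ x + A @ np.tanh(x) + B @ np.array([u]))

x = np.zeros(n)
for i in range(5000):          # 5 s of simulated time under a sinusoidal input
    x = step(x, np.sin(0.5 * i * 1e-3))
y = C @ x                      # measured output y = C x
print(y)
```

Because tanh is bounded and D is Hurwitz, the state stays bounded for bounded inputs, which is the property exploited throughout the stability analysis.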
Example 2. The control object is a single robot arm described by

ḟ₁ = f₂,
ḟ₂ = −a₁ sin(f₁) − a₂f₂ + bu,

where y = f₁ is the arm position, which is the measured output, f₂ is the (unmeasured) angular velocity and the input u is the torque. The state dimension of the neural model was chosen equal to 10, after a few trials with different network sizes. The elements of D and Â were initialised as random numbers uniformly distributed in the interval [−1, 0]; the elements of B and C were similarly initialised in the interval [0, 1]. Prior to its application in on-line control, the network was trained off-line to obtain a first approximation of the plant model. This relates to the fact that the theorem requires the initial mismatch between the parameters of the neural model and the plant to be appropriately small; the off-line training phase can thus be understood as providing initial model parameters which are close enough to the ideal ones. At this stage all parameters, i.e. D, Â, B, C, were updated using the same methodology as for the updates of Â described in Section 2.2. To ensure that the parameters dᵢ remain negative, the projection algorithm was extended to constrain each dᵢ to the set dᵢ ≤ −ε. An input signal consisting of two sine waves, u(t) = 4 sin(0.5t) + 0.5 sin(2t), was applied to both the network and the plant, and training was carried on until the errors between the outputs of the network and the plant were small. Prior to on-line control, satisfaction of Assumption 2 was checked. In both the off-line training and the control, ε = 0.0001 was used.

Fig. 6. Control of a single arm. (a) y_ref (dotted) and angular position of the arm (solid), (b) control input.
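Under the arm model stated above, the off-line training data (the plant response to the two-sine excitation) can be generated by direct integration. The coefficient values a₁, a₂, b below are illustrative assumptions, since the paper does not list them:

```python
import numpy as np

def arm(f, u, a1=9.8, a2=1.0, b=1.0):
    # Single-link arm: f[0] = position (measured output), f[1] = angular velocity
    return np.array([f[1], -a1 * np.sin(f[0]) - a2 * f[1] + b * u])

def response(u_fun, dt=1e-3, t_end=20.0):
    # Forward-Euler integration of the arm; returns the sampled position y = f1
    n = int(t_end / dt)
    f, ys = np.zeros(2), np.empty(n)
    for i in range(n):
        f = f + dt * arm(f, u_fun(i * dt))
        ys[i] = f[0]
    return ys

# Training excitation used in the example: u(t) = 4 sin(0.5 t) + 0.5 sin(2 t)
y = response(lambda t: 4 * np.sin(0.5 * t) + 0.5 * np.sin(2 * t))
print(y.shape)
```

The same routine, applied to both the plant and the neural model, yields the output pairs over which the off-line training error is evaluated.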
As opposed to Example 1, there is an inevitable structural error in the neural model. Validation of the controller's robustness with respect to this error is one of the major objectives of this simulation example.

Control is performed according to the algorithm described in Section 2. Results for two different reference trajectories are shown in Figs. 6 and 7. These trajectories are obtained by passing a piecewise-constant and a triangular signal, respectively, through a first-order stable linear filter. In both simulations T = 0.2 and λ = 100 were used. After initial fluctuations, the output of the plant converges to the reference trajectory. The increased output error seen in Fig. 6 at time t = 50 is due to the mass of the link being increased by 50%. The increased output error in Fig. 7 at time t = 55 is a result of the mass being reduced by 50%. In both cases, the controller is able to adapt to the change and regain good tracking performance.

Fig. 7. Control of a single arm. (a) y_ref (dotted) and angular position of the arm (solid), (b) control input.

5. Induction motor control

In this section, a multi-input multi-output control scheme for the induction motor is presented, based on the neural algorithm described in Section 2. Induction motor control is a difficult, and still not completely resolved, engineering problem. This is due to the highly nonlinear dynamics of the machine and, in a typical hardware configuration, the unavailability of measurements of some of the state variables and usually also of some of the controlled outputs. In addition, some of the machine parameters, mainly the rotor resistance, exhibit strong variations due to changing thermal conditions.

The state-space description of the motor (C.1)-(C.5) uses a state vector [ω_r, i_sd, i_sq, ψ_rd, ψ_rq] consisting of the rotor speed, the stator current components and the rotor flux components (see Appendix C). Only the stator currents and the rotor speed are normally available for measurement. In some control schemes for the induction motor, observers are used to obtain estimates of the unmeasurable variables. However, as the plant is nonlinear, the separation principle does not apply, and stable state-feedback control combined with a stable observer does not imply stability of the closed-loop system. This is further complicated by significant parameter variations during operation. Due to these difficulties, no stability proof for observer-based induction motor control schemes has been shown so far.

General arguments for the use of dynamic neural models for nonlinear adaptive control of systems with the state, or part of the state, unavailable for measurement have already been given in the Introduction. Clearly, the induction motor falls into this category and is in fact one of the case studies that motivated the development of the scheme proposed in Section 2.

5.1. MIMO control of the motor

A general diagram of the proposed control structure is shown in Fig. 8. The rotor speed ω_r and the amplitude of the stator flux |ψ_s| are the controlled variables. The two inputs generated by the controller are the amplitude |u_s| and the angular frequency of the stator supply voltage. The amplitude and frequency of the required sinewave are typically the inputs of a voltage-source inverter.

Fig. 8. MIMO control of the induction motor using dynamic networks.

The controller requires an estimate of the stator flux magnitude |ψ_s|, which can be obtained from a stator flux observer. It has to be pointed out that, firstly, it is easier to estimate the stator than the rotor flux and, secondly, needing the magnitude only is a weaker requirement than relying on estimates of the full stator phasor. Compared with methods which require good estimates of both coordinates of the stator phasor for conversion between different reference frames, the sensitivity of the method presented here with respect to observer errors will be much lower. Furthermore, as the updating scheme of the neural algorithm applied here is based on the integral of the output error, it will additionally average out the estimation errors.
5.2. Dynamic neural controller for the induction motor
The control algorithm utilises the technique presented in Section 2, with a modification in the structure of the neural model of the plant. A dynamic network serves as a model of the motor, with the stator voltage magnitude |u_s| and the angular supply frequency ω_s as the inputs, and the stator flux amplitude |ψ_s| and the rotor speed ω_r as the outputs. The neural model of the motor is constructed as

x̂̇ = Dx̂ + A tanh(x̂) + f₁ tanh(i_sx) + f₂ tanh(i_sy) + Bu,
ŷ = Cx̂,  (30)
where, as before, x̂ ∈ ℝⁿ is the state vector, u = [|u_s|, ω_s]ᵀ is the input vector and ŷ = [|ψ̂_s|, ω̂_r]ᵀ is the output vector. f₁, f₂ ∈ ℝⁿ are vectors of adjustable weights. i_sx and i_sy denote the components of the stator current phasor with respect to a rotating reference frame fixed to the stator input voltage phasor u_s. Measurements of the stator currents are easily obtainable and, since ω_s is known, the conversion from (sD, sQ) coordinates can be easily performed. The extra terms, containing measurements of the stator currents, were found to improve the modelling capabilities of the network and consequently also the quality of control. As already mentioned, the proposed structure of the dynamic network (2) is a compromise between full modelling capability and analytical tractability of the algorithm. It is shown here that this structure can be augmented by utilising the available information, thus improving the modelling capabilities at no extra cost.

Since in normal operation the value of the rotor speed expressed in rad/s is about two orders of magnitude bigger than the flux amplitude expressed in Wb, the speed output that the neural model is supposed to reproduce is the measured rotor speed in rad/s scaled down by a factor of 0.01. This is done to improve the conditioning of the error function used in parameter adaptation. Consequently, the speed setpoint for the motor (in rad/s) is also scaled by 0.01 for control generation purposes. Following the general procedure described in Section 2, the control input is calculated as

u = (CB)⁻¹(−CDx̂ − CA tanh(x̂) − Cf₁ tanh(i_sx) − Cf₂ tanh(i_sy) + v),  (31)

where, as previously,

v = ẏ_ref − α(ŷ − y_ref),

and y_ref contains the reference values for the flux magnitude and the speed, y_ref = [|ψ_s|*, ω_r*]ᵀ.
Although the presence of the stator current measurements in the model (30) resembles an observer structure, it has to be pointed out that, firstly, as opposed to typical observers based for example on the extended Luenberger scheme (Brdyś & Du, 1991; Du & Brdyś, 1993), here the measurements enter in a nonlinear fashion and there is no explicit corrector term. Secondly, the control input is generated based on the model (30) in an integrated fashion, without the use of an explicit state and parameter observer and application of the certainty equivalence principle.
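The feedback linearizing control law (31) amounts to a direct matrix computation. The sketch below assumes hypothetical, randomly initialised model weights purely to make it runnable, and the gain α = 5 is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 2                     # 40 dynamic neurons, 2 outputs (|psi_s|, w_r)

# Hypothetical model weights; in the scheme these come from off-line training
D  = np.diag(rng.uniform(-1.0, -0.1, n))
A  = rng.uniform(-1.0, 0.0, (n, n))
B  = rng.uniform(-0.01, 0.01, (n, m))
C  = rng.uniform(-0.01, 0.01, (m, n))
f1 = rng.uniform(-0.05, 0.05, n)
f2 = rng.uniform(-0.05, 0.05, n)

def control(x_hat, i_sx, i_sy, y_ref, dy_ref, alpha=5.0):
    # v = y_ref' - alpha*(y_hat - y_ref), then Eq. (31):
    # u = (CB)^-1 (-CD x - CA tanh(x) - C f1 tanh(i_sx) - C f2 tanh(i_sy) + v)
    v = dy_ref - alpha * (C @ x_hat - y_ref)
    rhs = (-C @ (D @ x_hat) - C @ (A @ np.tanh(x_hat))
           - (C @ f1) * np.tanh(i_sx) - (C @ f2) * np.tanh(i_sy) + v)
    return np.linalg.solve(C @ B, rhs)

u = control(rng.standard_normal(n), 0.3, -0.2,
            y_ref=np.array([1.1, 1.0]), dy_ref=np.zeros(2))
print(u.shape)
```

The 2x2 system C B must be invertible for the control to be well defined, which corresponds to the invertibility assumption on CB in the general scheme.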
5.3. Simulation results

For the simulation study reported here, a network of the structure (30) with 40 dynamic neurons was used. Its weights were initialised as random numbers with uniform distributions: for D and A in the interval [−1, 0], for B and C in [−0.01, 0.01] and for f₁, f₂ in [−0.05, 0.05]. The network was first trained in open loop, with externally generated control inputs applied to both the motor and the neural model; all the weights of the network were updated in this phase. The projection algorithm extended to D, as described in Example 2, was used both for the preliminary training and for the control, with ε = 0.0001. As before, the size of the network and the initialisation intervals were chosen after a few trials.

In the control experiment, the start-up of the motor was performed using an independent controller whose inputs were applied to both the motor and the neural model. During this start-up period learning was switched off. After the start-up, control was switched to the neural controller described above. During the closed-loop control, only the weight matrices D, A, f₁ and f₂ were adapted, with period T = 0.2 s and learning rate λ = 10. The stator flux magnitude was obtained directly from the simulation model, without implementing the observer.

In the experiment, the motor is subjected to a constant load torque t_L = 5 N m and is expected to follow a reference rotor speed trajectory. This is a standard test trajectory consisting of an increasing (acceleration) and then a decreasing (deceleration) ramp, passed through a stable linear filter. The reference value of the stator flux magnitude is kept constant at |ψ_s|* = 1.1 Wb. Results are shown in Fig. 9. The controller is able to follow the speed profile quite well while maintaining the flux amplitude close to the desired value. The behaviour of four randomly chosen weights of the neural model during the course of this experiment is shown in Fig. 10. It can be seen that speed variation over a wide range necessitates adaptation of the neural model, whose approximation capabilities are not global enough.

Fig. 9. Speed profile following. (a) reference speed (dotted) and rotor speed (solid) (rad/s); (b) stator flux magnitude (solid) and reference value (dotted) (Wb).

Fig. 10. Speed profile following, changing weights in the neural model: (a) weight a₂, (b) weight d₁, (c) weight f₁,₂, (d) weight f₂,₁₉.
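The projection used in the simulations to keep the diagonal entries dᵢ of D at or below −ε is a simple componentwise clipping operation; a minimal sketch, with ε = 10⁻⁴ as in the experiments:

```python
import numpy as np

def project_d(d, eps=1e-4):
    # Clip every diagonal entry of D to the constraint set d_i <= -eps,
    # keeping the linear part of the neural model strictly stable
    return np.minimum(d, -eps)

d = np.array([-0.7, 0.02, -1.3, 0.0])
print(project_d(d))
```

Entries already satisfying the constraint are left untouched; any entry that an update pushed to or above −ε is pulled back onto the constraint boundary.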
Remark. As already mentioned, the analysis of the structural error is beyond the scope of the theoretical work presented in this paper. Control of the induction motor presented above is clearly a case where structural error is present. The recurrent network is not able to provide a global model of the complex dynamics of the motor, and parameter adjustments are required if the operating point of the machine changes significantly. In this way, parameter adaptation appears to give robustness with respect to the structural error. The issue of the structural error is related to the locality of the neural model and to the question of whether the network will 'forget' a model obtained at a previous operating point. In the convergence analysis, which is presented for the case of no structural error, parameter updates result in a decrease of both the parameter error and the tracking error bound; thus the overall quality of the (global) model improves. The structural error will prevent achieving a good global model and, as the operating point changes, the network may, to some extent, 'forget' the previous model. Intuitively, however, a sufficiently accurate local model is enough to generate suitable control action in the present operating region and consequently to maintain a decrease of the tracking error. This explains the observed robustness with respect to the structural error.
6. Conclusions

The adaptive neural control algorithm for nonlinear exponentially stable plants presented here is based on the use of a recurrent neural network as a dynamic model of the system. The recently obtained stability results are extended to the case of general reference output signals. The analysis presented in the proof of the theorem highlights the mechanisms governing the stability of such neural control systems, and the interactions between the dynamics of the plant and the neural model on the one hand, and the dynamics of learning on the other. The speed of the plant-model dynamics limits the admissible speed of adaptation; this is quantified by the smallness of the adaptation rate and the length of the update period T.

The stability analysis shows that recurrent networks possess certain intrinsic features making them suitable for nonlinear adaptive control when the state of the plant is unmeasurable. The presented results indicate that the hyperbolic tangent is a very good choice of activation nonlinearity for constructing nonlinear dynamic models: the smoothness, boundedness and monotonicity of tanh(·) have been used in a crucial way many times in the stability proof. It is our belief that appropriately structured recurrent neural networks can provide conveniently parameterized dynamic models of many nonlinear systems for use in adaptive control. Further research effort needs to be directed, on the one hand, towards the analysis of the approximation capabilities of recurrent networks and improvements in the model structure to remove the restrictive assumptions and, on the other hand, towards possible improvements in the learning methods and their stability, as well as the incorporation of the modelling errors into the analysis.
Appendix A. Lemma A.1 with proof

Lemma A.1. Consider a smooth function E(ẑ) which has a local minimum at ẑ = z, i.e.

E(ẑ) > E(z)  ∀ ‖ẑ − z‖ < r, ẑ ≠ z,  (A.1)

and

∂E(z)/∂ẑ = 0.  (A.2)

The vector variable ẑ is composed of two vectors of the same dimension, ẑ = [âᵀ, x̂ᵀ]ᵀ. The variable â is updated according to

â(k + 1) = â(k) − λ ∂E(ẑ(k))/∂â.

The second argument x̂ can vary as well, and x̂(k) denotes the corresponding trajectory. If, for all k ≥ 0, it holds that

(â(k) − a)ᵀ ∂E(ẑ(k))/∂â ≥ β ‖∂E(ẑ(k))/∂â‖ ‖â(k) − a‖,  0 < β < 1,  (A.3)

‖∂²E(ẑ)/∂ẑ²‖₂ ≤ M < ∞,  (A.4)

‖x̂(k) − x‖ ≤ s ‖â(k) − a‖,  s < ∞,  (A.5)

in the ball ‖ẑ − z‖ < r, for ẑ ≠ z, and the initial condition â(0) satisfies

‖â(0) − a‖ < r/√(s² + 1),

then there exists a positive number λ* such that for 0 < λ ≤ λ* the following is true for all k ≥ 0:

‖ã(k + 1)‖² < ‖ã(k)‖²,

where ã = â − a.

Proof. The initial condition ‖â(0) − a‖ < r/√(s² + 1), together with relation (A.5), ensures that ẑ(0) is inside the ball where all the assumptions are satisfied:

‖ẑ(0) − z‖ = √(‖x̂(0) − x‖² + ‖â(0) − a‖²) < √(s²r²/(s² + 1) + r²/(s² + 1)) = r.

The update equation is equivalent to

ã(k + 1) = ã(k) − λ ∂E(ẑ(k))/∂â,

so that

‖ã(k + 1)‖² − ‖ã(k)‖² = ã(k + 1)ᵀã(k + 1) − ã(k)ᵀã(k) = −2λ ã(k)ᵀ ∂E(ẑ(k))/∂â + λ² ‖∂E(ẑ(k))/∂â‖².

From the above and using Eq. (A.3) we obtain

‖ã(k + 1)‖² − ‖ã(k)‖² ≤ −2λβ ‖ã(k)‖ ‖∂E(ẑ(k))/∂â‖ + λ² ‖∂E(ẑ(k))/∂â‖² = λ ‖∂E(ẑ(k))/∂â‖ (−2β ‖ã(k)‖ + λ ‖∂E(ẑ(k))/∂â‖).

Applying the mean value theorem twice, and utilising the fact that, due to Eq. (A.2), ∂E(a, x)/∂â = 0, the following can be obtained:

∂E(â, x̂)/∂â = ∂E(a, x̂)/∂â + (∂²E(ā, x̂)/∂â²)(â − a)
= ∂E(a, x)/∂â + (∂²E(a, x̄)/∂â∂x̂)(x̂ − x) + (∂²E(ā, x̂)/∂â²) ã
= (∂²E(a, x̄)/∂â∂x̂) x̃ + (∂²E(ā, x̂)/∂â²) ã,

where x̃ = x̂ − x,

ā = α₁â + (1 − α₁)a,  x̄ = α₂x̂ + (1 − α₂)x,

and the positive numbers α₁ and α₂ satisfy 0 < α₁, α₂ < 1. Thus, using Eqs. (A.4) and (A.5),

‖∂E(ẑ(k))/∂â‖ ≤ ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ + ‖∂²E(ā(k), x̂(k))/∂â²‖₂ ‖ã(k)‖ ≤ M(s + 1) ‖ã(k)‖.

Finally,

‖ã(k + 1)‖² − ‖ã(k)‖² ≤ λ ‖∂E(ẑ(k))/∂â‖ ‖ã(k)‖ (−2β + λM(s + 1)),

and for 0 < λ < λ* = 2β/(M(s + 1)) the above expression is negative. This, together with the initial condition on ‖ẑ(0) − z‖, ensures that for 0 < λ < λ* the point ẑ(k) remains, for all k ≥ 0, in the ball where the assumptions are satisfied. ∎
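Lemma A.1 can be illustrated numerically on a simple quadratic function whose second argument is slaved to the first so that condition (A.5) holds with equality. The function, the coupling constant c, the slaving factor s and β = 0.5 below are illustrative assumptions:

```python
import numpy as np

# Quadratic test function E(a, x) = 0.5||a||^2 + c a.x + 0.5||x||^2 with its
# minimum at (a, x) = (0, 0); the x-argument is slaved to a so ||x|| = s||a||.
c, s, beta = 0.3, 0.5, 0.5               # illustrative values; beta as in (A.3)
M = 1.0 + c                              # bound on the Hessian norm, cf. (A.4)
lam = 0.5 * 2 * beta / (M * (s + 1.0))   # half of lambda* = 2*beta/(M(s+1))

a = np.array([1.0, -2.0])
norms = [np.linalg.norm(a)]
for k in range(50):
    x = s * a                            # trajectory satisfying (A.5) exactly
    grad_a = a + c * x                   # dE/da for this quadratic E
    a = a - lam * grad_a                 # the update of Lemma A.1
    norms.append(np.linalg.norm(a))

# The parameter error norm decreases monotonically, as the lemma predicts
assert all(n2 < n1 for n1, n2 in zip(norms, norms[1:]))
print(norms[-1])
```

Here the gradient in a is parallel to ã, so (A.3) holds for any β < 1, and the Hessian norm is 1 + c, so the chosen λ sits safely below λ*.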
Appendix B. Learning rate condition (28)

In this section it is shown that a learning rate satisfying (28), provided that (29) holds, ensures that the error criterion function E satisfies conditions (A.3) and (A.5) on the system trajectories at the discrete-time instants â(k), x̂(k).

B.1. Satisfaction of condition (A.3)

We first show that conditions (28) and (29) imply that relation (A.3) holds for the error function E on the system trajectories. Using the mean value theorem twice, and the fact that the gradient of the error function is zero at the optimum point â(k) = a, x̂(k) = x(k), we have

∂E(â, x̂)/∂â = ∂E(a, x̂)/∂â + (∂²E(ā, x̂)/∂â²)(â − a)
= ∂E(a, x)/∂â + (∂²E(a, x̄)/∂â∂x̂)(x̂ − x) + (∂²E(ā, x̂)/∂â²) ã
= (∂²E(a, x̄)/∂â∂x̂) x̃ + (∂²E(ā, x̂)/∂â²) ã,  (B.1)

where

ā = α₁â + (1 − α₁)a,  x̄ = α₂x̂ + (1 − α₂)x,

and the positive numbers α₁ and α₂ satisfy 0 < α₁, α₂ < 1. We need

(â(k) − a)ᵀ ∂E(â(k), x̂(k))/∂â = ã(k)ᵀ(∂²E(a, x̄(k))/∂â∂x̂) x̃(k) + ã(k)ᵀ(∂²E(ā(k), x̂(k))/∂â²) ã(k) ≥ β ‖ã(k)‖ ‖∂E(â(k), x̂(k))/∂â‖.  (B.2)

Since

ã(k)ᵀ(∂²E(a, x̄(k))/∂â∂x̂) x̃(k) + ã(k)ᵀ(∂²E(ā(k), x̂(k))/∂â²) ã(k) ≥ λ_min(∂²E(ā(k), x̂(k))/∂â²) ‖ã(k)‖² − ‖ã(k)‖ ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖

and, due to Eq. (B.1),

‖∂E(â(k), x̂(k))/∂â‖ ≤ ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ + λ_max(∂²E(ā(k), x̂(k))/∂â²) ‖ã(k)‖,

relation (B.2) is satisfied if the following holds:

λ_min(∂²E(ā(k), x̂(k))/∂â²) ‖ã(k)‖² − ‖ã(k)‖ ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ ≥ β (‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ + λ_max(∂²E(ā(k), x̂(k))/∂â²) ‖ã(k)‖) ‖ã(k)‖.

This leads to the condition

‖ã(k)‖ ≥ (1 + β) ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ / (λ_min(∂²E(ā(k), x̂(k))/∂â²) − β λ_max(∂²E(ā(k), x̂(k))/∂â²)).

β can be chosen as

β(k) = η λ_min(∂²E(ā(k), x̂(k))/∂â²)/λ_max(∂²E(ā(k), x̂(k))/∂â²)

with 0 < η ≤ 1, giving

‖ã(k)‖ ≥ (1 + ηλ_min/λ_max) ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ‖x̃(k)‖ / ((1 − η) λ_min).  (B.3)

If the above holds, condition (A.3) is satisfied for all k ≥ 0 with a certain β(k). We assume that there exists β* such that

0 < β* < β(k)  ∀k.

Then (B.3) being satisfied for all k ≥ 0 implies that condition (A.3) holds uniformly for all k ≥ 0 with β = β*.

Because of relation (26), the bound on the state error magnitude ‖x̃(k)‖ is determined by the parameter error magnitude at the previous time instant, ‖ã(k − 1)‖. Also ã(k), through the update equation, is determined by ã(k − 1) and the update step λ. We show that λ satisfying (28) guarantees that (B.3) holds. The update equation is equivalent to

ã(k) = ã(k − 1) − λ ∂E(â(k − 1), x̂(k − 1))/∂â.

Thus

‖ã(k)‖ ≥ ‖ã(k − 1)‖ − λ ‖∂E(â(k − 1), x̂(k − 1))/∂â‖.  (B.4)

Due to Eqs. (29) and (27), λ** from Eq. (28) satisfies

λ** ≤ 1/c ≤ ‖ã(k)‖/‖∂E(â(k), x̂(k))/∂â‖  ∀ k ≥ 0.

Condition (28) thus implies that λ < 1/c, so the right-hand side of (B.4) is nonnegative; for λ satisfying Eq. (28) we have

‖ã(k − 1)‖ − λ ‖∂E(â(k − 1), x̂(k − 1))/∂â‖ ≥ 0.

The above and Eq. (26) imply that for Eq. (B.3) to hold it is sufficient that

‖ã(k − 1)‖ − λ ‖∂E(â(k − 1), x̂(k − 1))/∂â‖ ≥ (1 + ηλ_min/λ_max) ‖∂²E(a, x̄(k))/∂â∂x̂‖₂ ρ₁ ‖ã(k − 1)‖ / ((1 − η) λ_min).

This inequality is satisfied with

λ ≤ [1 − ((1 + ηλ_min/λ_max) ‖∂²E(a, x̄)/∂â∂x̂‖₂/(λ_min(1 − η))) ρ₁] ‖ã(k − 1)‖ / ‖∂E(â(k − 1), x̂(k − 1))/∂â‖.

Therefore, using Eq. (27), we conclude that the above is satisfied uniformly for all k ≥ 0 by λ given by Eq. (28), provided that Eq. (29) holds.

B.2. Satisfaction of condition (A.5)

We proceed to show that such a value of λ also guarantees that relation (A.5), i.e.

‖x̃(k)‖ ≤ s ‖ã(k)‖,

holds with a constant s uniformly for k ≥ 0. Eq. (B.4) implies that for the above to hold it is sufficient that

‖x̃(k)‖ ≤ s (‖ã(k − 1)‖ − λ ‖∂E(â(k − 1), x̂(k − 1))/∂â‖).

Since, due to Eq. (27),

‖ã(k − 1)‖ − λ ‖∂E(â(k − 1), x̂(k − 1))/∂â‖ ≥ ‖ã(k − 1)‖ − λc ‖ã(k − 1)‖,

relation (26) can be used to obtain a further sufficient condition:

ρ₁ ‖ã(k − 1)‖ ≤ s(1 − λc) ‖ã(k − 1)‖,

which is satisfied for all k ≥ 0 with

s ≥ ρ₁/(1 − λc).

Appendix C. Induction motor model
A three-phase squirrel-cage induction motor is considered. A few standard simplifying assumptions are made: the air gap between the rotor and the stator is uniform and constant, saturation and hysteresis are neglected, and the stator supply neutral point is isolated. Using the Park transformation (e.g. Vas, 1990), the three-phase stator windings (sA, sB, sC) can be transformed into equivalent quadrature-phase windings (sD, sQ). The dynamics of the motor are then given by a fifth-order nonlinear differential model (Marino, Peresada & Valigi, 1993):

dω_r/dt = (n_p M/(J L_r))(ψ_rd i_sq − ψ_rq i_sd) − t_L/J,  (C.1)

di_sd/dt = (M R_r/(σ L_s L_r²)) ψ_rd + (n_p M/(σ L_s L_r)) ω_r ψ_rq − ((M² R_r + L_r² R_s)/(σ L_s L_r²)) i_sd + (1/(σ L_s)) u_sd,  (C.2)

di_sq/dt = −(n_p M/(σ L_s L_r)) ω_r ψ_rd + (M R_r/(σ L_s L_r²)) ψ_rq − ((M² R_r + L_r² R_s)/(σ L_s L_r²)) i_sq + (1/(σ L_s)) u_sq,  (C.3)

dψ_rd/dt = −(R_r/L_r) ψ_rd − n_p ω_r ψ_rq + (R_r/L_r) M i_sd,  (C.4)

dψ_rq/dt = n_p ω_r ψ_rd − (R_r/L_r) ψ_rq + (R_r/L_r) M i_sq,  (C.5)

where i, u, ψ denote current, voltage and flux linkage, respectively. The subscripts r and s stand for rotor and stator. ω_r is the rotor speed. The subscripts d and q denote the ('direct' and 'quadrature') components of the vectors with respect to the fixed stator reference frame (sD, sQ). L and R are the autoinductances and resistances, M is the mutual inductance and σ = 1 − M²/(L_s L_r). t_L is the load torque.

C.1. Machine data used in simulations

ASEA (ABB) MBL 132 SB38-2, three-phase, 50 Hz, 7.5 kW.

n_p  number of pole pairs     1
R_s  stator resistance        2.19 Ω
R_r  rotor resistance         1.038 Ω
L_s  stator autoinductance    0.51159 H
L_r  rotor autoinductance     0.51159 H
M    mutual inductance        0.501 H
J    rotor inertia            0.35 kg m²
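The fifth-order model (C.1)-(C.5) with the machine data above can be integrated directly; the supply voltage, time step and simulation horizon in the sketch below are illustrative choices, not values from the paper:

```python
import numpy as np

# Machine data from Appendix C (SI units)
n_p, Rs, Rr = 1, 2.19, 1.038
Ls = Lr = 0.51159
Mm, J = 0.501, 0.35
sigma = 1.0 - Mm**2 / (Ls * Lr)

def motor(x, usd, usq, tL=0.0):
    # Fifth-order model (C.1)-(C.5); x = [w_r, i_sd, i_sq, psi_rd, psi_rq]
    w, isd, isq, prd, prq = x
    k1 = Mm * Rr / (sigma * Ls * Lr**2)
    k2 = n_p * Mm / (sigma * Ls * Lr)
    k3 = (Mm**2 * Rr + Lr**2 * Rs) / (sigma * Ls * Lr**2)
    al = Rr / Lr
    return np.array([
        n_p * Mm / (J * Lr) * (prd * isq - prq * isd) - tL / J,
        k1 * prd + k2 * w * prq - k3 * isd + usd / (sigma * Ls),
        -k2 * w * prd + k1 * prq - k3 * isq + usq / (sigma * Ls),
        -al * prd - n_p * w * prq + al * Mm * isd,
        n_p * w * prd - al * prq + al * Mm * isq,
    ])

# 0.1 s of an unloaded start with a 50 Hz, 310 V sinusoidal supply (Euler)
x, dt = np.zeros(5), 1e-5
for i in range(10000):
    us = 310.0 * np.exp(1j * 2 * np.pi * 50.0 * i * dt)
    x = x + dt * motor(x, us.real, us.imag)
print(x)
```

Such a routine plays the role of the simulation model from which, in the reported experiments, the stator flux magnitude was taken directly.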
References
BrdysH , M. A., & Du, T. (1991). Algorithms for joint state and parameter
estimation in induction motor drive systems. In Proceedings of
the international conference CONTROL'91, Edinburgh, Scotland
(pp. 915}920).
BrdysH , M. A., Kulawski, G. J., & Quevedo, J. (1996). Recurrent
networks for nonlinear adaptive control. In Proceedings
of the 13th IFAC world congress, vol. F. San Francisco
(pp. 151}156).
BrdysH , M. A., Kulawski, G. J., & Quevedo, J. (1998). Recurrent
networks for nonlinear adaptive control. IEE Proceedings on
Control Theory and Applications, 145(2), 177}188.
Chang, W. D., Fu, L. C., & Yang, J. H. (1996). Adaptive robust
neural-network based control for siso systems. In Proceedings
of the 13th IFAC world congress, vol. F. San Francisco
(pp. 163}168).
Chen, F. C., & Khalil, H. K. (1991). Adaptive control of nonlinear systems using neural networks * a dead-zone approach.
In Proceedings of the American control conference (pp. 667}
672).
Cybenko, G. (1989). Approximation by superpositions of a sigmoidal
function. Mathematics of Control, Signals and Systems, 2, 303}314.
Delgado, A., Kambhapati, C., & Warwick, K. (1995). Dynamic
recurrent neural network for system identi"cation and
control. IEE Proceedings on Control Theory and Applications,
142, 307}314.
Du, T., & BrdysH , M. A. (1993). Implementations of extended Luenberger
observers for joint state and parameter estimation of pwm induction
motor drive. In Proceedings of the 5th European conference on power
electronics (IPE), Brighton, England.
Funahashi, K. I. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183}191.
Funahashi, K. I., & Nakamura, Y. (1993). Approximation of dynamical
systems by continuous time recurrent networks. Neural Networks, 6,
801}806.
Gupta, M. M., Rao, D. H., & Nikiforuk, P. N. (1993). Dynamic neural
network based inverse kinematic transformation of two- and threelinked robots. In Proceedings of the 12th IFAC world
congress, vol. 3. Sydney (pp. 289}296).
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2,
359}366.
Jin, L., Nikiforuk, P. N., & Gupta, M. M. (1993). Model matching
control of unknown nonlinear systems using recurrent neural
networks. In Proceedings of the 12th IFAC world congress, vol. 1.
Sydney (pp. 337}344).
Jin, L., Nikiforuk, P. N., & Gupta, M. M. (1994). Adaptive control
of discrete-time nonlinear systems using recurrent neural networks.
IEE Proceedings on Control Theory and Applications, 141, 169}176.
Jin, L., Nikiforuk, P. N., & Gupta, M. M. (1995). Approximation of
discrete-time state-space trajectories using dynamic recurrent neural networks. IEEE Transactions on Automatic Control, 40,
1266}1270.
Khalil, H. K. (1992). Nonlinear systems. New York: Macmillan.
KrsticH , M., Kanellakopoulos, I., & KokotovicH , P. (1995). Nonlinear and
adaptive control design. New York: Wiley.
Ku, C. C., & Lee, K. Y. (1992). System identi"cation and control using
diagonal recurrent neural networks. In Proceedings of the American
control conference (pp. 545}549).
Kulawski, G. J., & Brdyś, M. A. (1994). Dynamic neural networks for
nonlinear adaptive control. Technical Report 11/04-94. School
of Electronic and Electrical Engineering, The University of
Birmingham, UK.
Levin, A. U., & Narendra, K. S. (1993). Control of nonlinear dynamical
systems using neural networks: Controllability and stabilisation.
IEEE Transactions on Neural Networks, 4, 192–206.
Levin, A. U., & Narendra, K. S. (1996). Control of nonlinear dynamical
systems using neural networks – Part II: Observability, identification
and control. IEEE Transactions on Neural Networks, 7, 30–42.
Lewis, F. L., Yesildirek, A., & Liu, K. (1993). Neural net robot
controller: Structure and stability proofs. In Proceedings of the 32nd
conference on decision and control, San Antonio (pp. 2785–2791).
Marino, R., & Tomei, P. (1995). Nonlinear control design: Geometric,
adaptive and robust. Englewood Cliffs, NJ: Prentice-Hall.
Marino, R., Peresada, S., & Valigi, P. (1993). Adaptive input-output
linearizing control of induction motors. IEEE Transactions on
Automatic Control, 38(2), 208–221.
Narendra, K. S., & Annaswamy, A. M. (1989). Stable adaptive systems.
Englewood Cliffs, NJ: Prentice-Hall.
Narendra, K. S., & Parthasarathy, K. (1990). Identification and control
of dynamical systems using neural networks. IEEE Transactions on
Neural Networks, 1, 4–27.
Narendra, K. S., & Parthasarathy, K. (1991). Gradient methods for the
optimization of dynamical systems containing neural networks.
IEEE Transactions on Neural Networks, 2, 252–262.
Parlos, A. G., Chong, K. T., & Atiya, A. F. (1994). Application of
the recurrent multilayer perceptron in modelling complex process
dynamics. IEEE Transactions on Neural Networks, 5(2), 255–266.
Sanner, R. M., & Slotine, J.-J. E. (1992). Gaussian networks for direct
adaptive control. IEEE Transactions on Neural Networks, 3(6),
837–863.
Sastry, P. S., Santharam, G., & Unnikrishnan, K. P. (1994). Memory
neuron networks for identification and control of dynamical
systems. IEEE Transactions on Neural Networks, 5(2), 306–319.
Sastry, S., & Bodson, M. (1989). Adaptive control: Stability, convergence
and robustness. Englewood Cliffs, NJ: Prentice-Hall.
Suykens, J. A. K., Vandewalle, J., & De Moor, B. L. R. (1997). Nonlinear
H∞ control for continuous-time recurrent neural networks. In
Proceedings of the fourth European control conference, Brussels.
Vas, P. (1990). Vector control of AC machines. Oxford: Oxford
University Press.
Verrelst, H., Van Acker, K., Suykens, J. A. K., Motmans, B., De Moor,
B. L. R., & Vandewalle, J. (1997). NLq neural control theory: Case
study for a ball and beam system. In Proceedings of the fourth
European control conference, Brussels.
Grzegorz Kulawski was born in Poland in 1970. He obtained a B.Eng. in Electronic and Electrical Engineering in 1993 and a PhD in 1998, both from The University of Birmingham, UK. At present he is with Shell International Exploration and Production B.V., Research and Technical Services, Rijswijk, The Netherlands. His research interests include neural networks, nonlinear adaptive control and modelling of industrial processes.
Mietek Brdyś received the M.Sc. degree in Electronic Engineering and the Ph.D. and D.Sc. degrees in Control Systems from the Institute of Automatic Control at the Warsaw University of Technology in 1970, 1974 and 1980, respectively. From 1974 to 1983, he held the posts of Assistant Professor and Associate Professor at the Warsaw University of Technology. In 1992 he became Full Professor of Control Systems in Poland. Between 1978 and 1995, he held
various visiting faculty positions at the University of Minnesota, City University, De Montfort University and the Polytechnic University of Catalonia. Since January 1989, he has held the post of Senior Lecturer in the School of Electronic and Electrical Engineering at The University of Birmingham, UK. He has served as a consultant for the Honeywell Systems and Research Center in Minneapolis, GEC Marconi, and water authorities in the UK, France, Germany, Spain and Poland. His research is supported by the UK Research Council, industry and the European Commission. He is the author or co-author of about 100 refereed papers and five books. His current research interests include intelligent control of nonlinear and uncertain systems, robust monitoring and operational control with application to environmental systems. He is a Chartered Engineer, a Member of the IEE and the IEEE, a Fellow of the IMA and a member of the IFAC Technical Committee on Large Scale Systems.