Stochastic Iterative Learning Control Design for Nonrepetitive Events

2010 American Control Conference
Marriott Waterfront, Baltimore, MD, USA
June 30-July 02, 2010
WeB14.1
Sandipan Mishra and Andrew Alleyne
Abstract— This paper proposes a lifted domain ILC design
technique for repetitive processes with significant non-repetitive
disturbances. The learning law is based on the minimization
of the expected value of a cost function (i.e., error norm) at
each iteration. The derived learning law is iteration-varying
and depends on the ratio of the covariance of non-repetitive
component of the error to the covariance of the residual
total error. This implies that learning is rapid in earlier
iterations (large learning gains) and becomes conservative
and slow as iterations go by. The proposed
algorithm is also extended to the case where the learning filter
is fixed and the optimal (iteration-varying) learning rate needs
to be determined. Finally, the performance of the proposed
method is evaluated vis-a-vis a geometrically decaying learning
algorithm and an optimal fixed-rate learning algorithm through
simulation of a Micro-robotic deposition system.
I. INTRODUCTION
Iterative learning control (ILC) is a feedforward control
design technique for repetitive processes. ILC algorithms
use information from earlier trials of the repetitive process
to improve performance in the current trial [1]. ILC has
been employed in a wide variety of applications, including industrial robotics, computer numerical control tools,
injection molding systems, rapid thermal processing, microscale robotic deposition, and rehabilitation robotics. Detailed
surveys of ILC algorithms and their applications can be
found in [2], [3]. The most attractive feature of ILC is its
simple implementation and robustness to model uncertainty.
ILC algorithms are designed with the assumption that
the disturbances are repetitive, i.e., they remain the same
from one run of the process to another. This assumption is
however not valid for most processes, where nonrepetitive
disturbances or events often occur. Therefore, it is important
to account for the effect of noise and nonrepetitive events
on ILC performance. Of late, significant interest has been
generated in the ILC community towards understanding and
analyzing the effects of nonrepetitive disturbances in ILC.
Saab [4] developed a stochastic ILC framework for designing
optimal P-type learning. Moore [5] proposed a higher order
ILC algorithm to suppress the effect of nonrepetitive but
structured disturbances. [6], [7] used a forgetting factor to
avoid the accumulation of nonrepetitive disturbances. In [8],
Helfrich et al designed optimal feedback and ILC algorithms
taking into account the frequency domain characteristics of
the nonrepetitive and repetitive disturbances. Bristow [9]
analyzed the effect of learning rate on the amplification of
nonrepeating components of the error and proposed optimal
learning algorithms for a fixed convergence rate. Longchamp
[10] statistically evaluated and compared three alternate ILC
algorithms for robustness to nonrepetitive disturbances and
showed that the heuristic geometrically decaying learning
algorithm proposed in [11] provided good robustness. However, there has so far been no systematic determination of
an optimal (decreasing) learning rate taking into account
the frequency spectra of the nonrepetitive and repetitive
components of the error in ILC.
In this paper, we present stochastic machinery in lifted
domain to understand the effect of nonrepetitive events and
design algorithms that take into account the nature of these
nonrepetitive events in order to improve performance for
linear ILC systems. We propose a stochastic norm optimal
learning algorithm that takes into account repetitive and
nonrepetitive components of the disturbance by minimizing
the expected value of the 2-norm of the error at each
iteration. This norm optimal learning algorithm is shown
to be iteration-varying, with faster learning rates in initial
iterations and slower learning rates as iterations go by.
The rate of learning is effectively governed by the ratio of
nonrepetitive error to total error at each iteration. This is
intuitive since we wish to have aggressive learning initially
when large repetitive disturbances exist and then slow down
learning in order to avoid accumulation of nonrepeating
disturbances. Section II describes the general problem framework for a repetitive process with repetitive and nonrepetitive
disturbances.
II. ILC PROBLEM SETUP FOR A FEEDBACK-CONTROLLED REPETITIVE PROCESS
Let us consider a discrete time linear time invariant (LTI)
plant, denoted by P. The plant is controlled via an LTI
feedback controller C. This closed loop system is stable and
executes a repetitive process with period of N samples. We
want the output of the system to track a trajectory yd ( j),
where j ranges from 0 to N − 1. This is repeated several
times, with the system coming back to rest condition at
the end of each iteration of the cycle, and starting at rest
condition at the beginning of each iteration. This system is
illustrated in Figure 1. From this, we can derive the following
input-output relationships.
S. Mishra is a postdoctoral research associate in the Department
of Mechanical Science and Engineering, University of Illinois, Urbana-Champaign, IL 61802. [email protected]
A. Alleyne is a faculty member with the Department of Mechanical
Science and Engineering, University of Illinois, Urbana-Champaign, IL
61802. [email protected]
978-1-4244-7425-7/10/$26.00 ©2010 AACC
y_k(j) = (I + P(z)C(z))^{-1} P(z)C(z) y_d(j) + (I + P(z)C(z))^{-1} P(z) ( u_{f,k}(j) + d(j) + n_k(j) )   (1)

e_k(j) = y_d(j) − y_{m,k}(j)   (2)
The assumption that G is full column rank is not always satisfied; however, by shifting the output signal in time until the
relative degree is 0, we can make G full column rank.
Consider the learning law u_{f,k+1} = u_{f,k} + L e_k; assuming
zero measurement noise (ξ(·) = 0), we get the following
equations for the error and control effort evolution.
Fig. 1. Block Diagram of the Closed-Loop System for the kth trial.
We make the assumption that the overall closed loop
system is of relative degree 1. This assumption may be
relaxed by shifting the input signal forward in time by r
steps for systems with relative degree r. Consider the vectors
obtained by stacking the signals yd (·), yk (·), u f ,k (·), etc. for
each cycle. These are denoted as, for example,

y_d = [ y_d(0)  y_d(1)  y_d(2)  . . .  y_d(N − 1) ]^T
y_k = [ y_k(0)  y_k(1)  y_k(2)  . . .  y_k(N − 1) ]^T
u_{f,k} = [ u_{f,k}(0)  u_{f,k}(1)  u_{f,k}(2)  . . .  u_{f,k}(N − 1) ]^T
Assuming that the system starts at initial rest condition
(y_k(j) ≡ 0 ∀ j ≤ 0, ∀k), Eq. 1 for each step j ∈ {0, 1, 2, . . . , N − 1} can be unified into

y_k = G_{yd} y_d + G_{uf} u_{f,k} + G_d (d + n_k) + G_ξ ξ_k   (3)
e_k = y_d − y_k
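For a causal LTI system started at rest, each lifted matrix in Eq. 3 is lower-triangular Toeplitz in the system's Markov parameters (impulse-response samples). A minimal sketch of this construction (the first-order plant and its parameters a, b are hypothetical, purely for illustration):

```python
import numpy as np

def lift(markov, N):
    """Build the N x N lifted matrix of a causal LTI system from its
    Markov parameters h(1), h(2), ... (markov[0] = h(1), relative degree 1).
    The result is lower-triangular Toeplitz: G[i, j] = h(i - j + 1)."""
    G = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            G[i, j] = markov[i - j]
    return G

# Hypothetical example: first-order plant y(j+1) = a*y(j) + b*u(j),
# whose Markov parameters are h(k) = b * a**(k-1) for k >= 1.
a, b, N = 0.9, 1.0, 5
h = b * a ** np.arange(N)   # h(1), h(2), ..., h(N)
G = lift(h, N)
```

The jth column of G is the response to an impulse applied at sample j, which is what makes the lifted representation convenient for trial-domain analysis.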
where Q : ℜN → ℜN and L : ℜN → ℜN .
III. ILC AND NONREPETITIVE SIGNALS
In this section, we derive optimal learning matrices for the
ILC algorithm based on the minimization of the cost function
given by the expected value of the error 2-norm. Our goal is
to determine the learning matrix Lk so that this is minimized
at each iteration. In other words, we aim to minimize

J = E[ e_{k+1}^T e_{k+1} ]   (5)
Under the learning law u_{k+1} = u_k + L_k e_k and zero measurement noise, the lifted error and control effort evolve from trial to trial as

e_k = w + v_k − G u_k   (9)
u_{k+1} = (I − L_k G) u_k + L_k w + L_k v_k   (10)
e_{k+1} = w + v_{k+1} − G u_{k+1}   (11)
We assume that the nonrepetitive component v_k is a zero-mean Gaussian random variable in ℜ^N with covariance V.
The frequency spectrum of the nonrepetitive component of the
error is captured in the structure of V. We also introduce the
following definitions of the input and error covariance matrices:

X_{uu,k} = E[ (u_k − E[u_k]) (u_k − E[u_k])^T ]   (12)
X_{ee,k} = E[ (e_k − E[e_k]) (e_k − E[e_k])^T ]   (13)
Note that

J = E[ e_{k+1}^T e_{k+1} ] = trace( E[ e_{k+1} e_{k+1}^T ] ) = trace( X_{ee,k+1} + E[e_{k+1}] E[e_{k+1}]^T )   (14)
where G_{yd}, G_{uf}, G_d, and G_ξ ∈ ℜ^{N×N}. We will refer to the
lifting operation on a transfer function H(jω) to get the
corresponding matrix H as L(H(jω)) = H.

A linear learning law u_{f,k+1}(·) = Q(z) u_{f,k}(·) + L(z) e_k(·)
can be written in the lifted form shown in Eq. 4:

u_{f,k+1} = Q u_{f,k} + L e_k   (4)

From the error (Eq. 9, e_k = w + v_k − G u_k) and control (Eq. 10) update equations, we get
X_{ee,k+1} = G X_{uu,k+1} G^T + V   (15)
X_{uu,k+1} = (I − L_k G) X_{uu,k} (I − L_k G)^T + L_k V L_k^T   (16)
X_{ee,k+1} = (I − G L_k) G X_{uu,k} G^T (I − G L_k)^T + G L_k V L_k^T G^T + V   (17)
Also, we get

E[e_{k+1}] E[e_{k+1}]^T = (I − G L_k) E[e_k] E[e_k]^T (I − G L_k)^T   (18)
Plugging Eqs. 15–18 into Eq. 14, we get

E[ e_{k+1}^T e_{k+1} ] = trace( (I − G L_k) ( E[e_k] E[e_k]^T + G X_{uu,k} G^T ) (I − G L_k)^T + G L_k V L_k^T G^T + V )
= trace( (I − G L_k) ( E[ e_k e_k^T ] − V ) (I − G L_k)^T + G L_k V L_k^T G^T + V )
= trace( (I − G L_k) E[ e_k e_k^T ] (I − G L_k)^T + G L_k V + V L_k^T G^T )

We denote E[ e_k e_k^T ] = Φ_k, and note that E[ e_k^T e_k ] = trace(Φ_k). Therefore, we get

Φ_{k+1} = (I − G L_k) Φ_k (I − G L_k)^T + G L_k V + V L_k^T G^T
        = Φ_k + G L_k (V − Φ_k) + (V − Φ_k) L_k^T G^T + G L_k Φ_k L_k^T G^T   (19)

For reference, the plant output equations in lifted form for the kth trial of the process are

y_k = G_{yd} y_d + G_{uf} u_{f,k} + G_d (d + n_k)   (6)
e_k = y_d − y_k   (7)
u_{f,k+1} = u_{f,k} + L_k e_k   (8)
We club all the repetitive (invariant with k) terms together
into w = (I − G_{yd}) y_d − G_d d, and all the nonrepetitive terms
together into v_k = G_d n_k. We also drop the subscript f for the
control effort and the matrices henceforth, and we assume
that the matrix G_u ≡ G is full column rank.

Φ_k is already determined from the kth run of the process,
so we must find the optimal learning matrix L_k based on
Φ_k and V. Further, Φ_{k+1} is an indicator of the expected size
of the repeating disturbance, in stochastic terms, for the (k + 1)th
cycle. The update equation for Φ is
Φ_{k+1} = (I − G L_k) Φ_k (I − G L_k)^T + G L_k V + V L_k^T G^T   (20)
In order to find the optimal matrix L_k, we set the gradient
of the cost function (trace(Φ_{k+1})) to zero, as shown below:

∇_{L_k} trace( (I − G L_k) Φ_k (I − G L_k)^T + G L_k V + V L_k^T G^T ) = 0   (21)

⇒ L_k^{opt} = (G^T G)^{-1} G^T ( I − V Φ_k^{-1} )   (22)

From Eq. 22 we can see that if V ≈ Φ_k, then we should
have very slow learning (L_k^{opt} ≈ 0), while if V << Φ_k,
L_k^{opt} ≈ (G^T G)^{-1} G^T (which means one-step inversion learning). This
makes intuitive sense: if we have very large nonrepetitive components (V ≈ Φ_k), then learning will only degrade
performance, and therefore we should NOT learn (L_k^{opt} = 0). On
the other hand, if there are no nonrepetitive disturbances,
the best learning algorithm is the one-step inversion learning algorithm
(L_k^{opt} = (G^T G)^{-1} G^T). For notational clarity, we will drop
the superscript opt on the optimal learning gain for the
remainder of the paper.
In order to get the optimal β_k, we set

d/dβ_k [ (1 − β_k)^2 trace(Φ_k) + 2 β_k trace(V) ] = 0   (24)

and solve for β_k:

⇒ β_k trace(Φ_k) = trace(Φ_k) − trace(V)   (25)

⇒ β_k = 1 − trace(V)/trace(Φ_k) = 1 − trace(V)/E[ e_k^T e_k ]   (26)
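As a quick numerical illustration of Eqs. 23 and 26 (with toy trace values of our choosing), the optimal rate starts near one and decays as trace(Φ_k) approaches trace(V):

```python
# Scalar bookkeeping suffices for Case 1 (L0 = G^{-1}): Eq. (23) propagates
# trace(Phi) and Eq. (26) gives the optimal rate. Toy numbers of our choosing.
trV = 0.3        # trace(V): size of the nonrepetitive error
trPhi = 10.0     # trace(Phi_0) = trace(W + V)
betas, traces = [], []
for k in range(20):
    beta = 1.0 - trV / trPhi                               # Eq. (26)
    trPhi = (1.0 - beta) ** 2 * trPhi + 2.0 * beta * trV   # Eq. (23)
    betas.append(beta)
    traces.append(trPhi)
```

Aggressive learning (β_0 = 0.97 here) in the first trials gives way to almost no learning once the residual error is dominated by its nonrepetitive part.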
Properties of Optimal Learning
1) Φ_0 may be initialized to the size of the error in the
first iteration (with no learning, L ≡ 0). Further, if
the repetitive and nonrepetitive parts of the error are
independent, Φ_0 = V + W, where W is the covariance
of the repetitive error.
2) trace (Φk+1 ) ≤ trace (Φk ), i.e., Jk+1 ≤ Jk . This property
is very desirable since it guarantees monotonicity of
convergence of the expected value of the Euclidean
norm of the error. Further, this condition is sufficient
to claim stability of the proposed learning law. See
Appendix for proof.
3) The contraction rate ρk = σ̄ (I − GLk ) is smaller for
earlier iterations. In other words, ρk ≤ ρk+1 < 1. Also,
ρk → 1 as k → ∞. In other words, we start with
aggressive learning and as iterations go by learning
decreases. In the limit, the convergence rate approaches
1 (i.e., no learning). This is a desirable property that
has been used heuristically in many ILC algorithms,
see for example [11].
IV. FIXED STRUCTURE LEARNING RATE

In many applications, the learning functions (lifted matrices) are already determined from the available mathematical
model of the system, but the rate of learning needs to be
decided. In other words, the structure of the matrix L is
fixed, but it has a variable learning rate, i.e., L_k = β_k L_0, with
β_k ∈ ℜ^+. We propose a method to determine the optimal
iteration-varying gain to minimize the expected value of the
2-norm of error.
Case 1: L_0 = G^{-1}. Then G L_k = β_k I, and

Φ_{k+1} = (I − G L_k) Φ_k (I − G L_k)^T + G L_k V + V L_k^T G^T
        = (I − β_k I) Φ_k (I − β_k I)^T + β_k V + β_k V

⇒ trace(Φ_{k+1}) = (1 − β_k)^2 trace(Φ_k) + 2 β_k trace(V)   (23)
Note that βk ∈ [0, 1] ∀k, therefore, the ILC scheme is monotonically convergent in error.
Case 2: L_0 is a general learning matrix satisfying
σ̄(I − G L_0) < 1.

As before, we obtain the optimal β_k at each step by
minimizing trace(Φ_{k+1}), i.e., by setting d/dβ_k trace(Φ_{k+1}) = 0
and solving for β_k:

d/dβ_k trace( Φ_k + β_k G L_0 (V − Φ_k) + β_k (V − Φ_k) L_0^T G^T + β_k^2 G L_0 Φ_k L_0^T G^T ) = 0

⇒ β_k = trace( G L_0 (Φ_k − V) ) / trace( G L_0 Φ_k L_0^T G^T )
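A sketch of Case 2 with a randomly generated plant and an L_0 constructed so that G L_0 is diagonal with entries in (0, 1), which guarantees σ̄(I − G L_0) < 1 (all constructions here are our toy choices); β_k is evaluated from the expression above and Φ_k is propagated through Eq. 20:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
G = np.tril(0.2 * rng.normal(size=(N, N)), -1) + np.eye(N)  # invertible lifted plant (toy)
V = 0.1 * np.eye(N)                                         # nonrepetitive covariance
Phi = V + np.diag(rng.uniform(1.0, 3.0, N))                 # Phi_0 = V + W
M0 = np.diag(rng.uniform(0.2, 0.8, N))                      # desired G @ L0, eigenvalues in (0, 1)
L0 = np.linalg.solve(G, M0)                                 # so sigma_max(I - G @ L0) < 1
GL0 = G @ L0
betas, traces = [], []
for k in range(40):
    beta = np.trace(GL0 @ (Phi - V)) / np.trace(GL0 @ Phi @ GL0.T)  # optimal rate
    GLk = beta * GL0
    A = np.eye(N) - GLk
    Phi = A @ Phi @ A.T + GLk @ V + V @ GLk.T               # Eq. (20), with G L_k grouped
    betas.append(beta)
    traces.append(np.trace(Phi))
```

Because β_k is the exact minimizer of a quadratic in β and β = 0 is always feasible, trace(Φ_k) cannot increase, and the rate decays as the residual error becomes mostly nonrepetitive.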
V. NORM OPTIMAL STOCHASTIC ITERATIVE LEARNING CONTROL

The earlier derivations assume that G is full column rank.
We now consider the norm optimal cost function

J = E[ e_{k+1}^T e_{k+1} + u_{k+1}^T S^T S u_{k+1} ],   (27)
which does not require G to be full rank if ST S is chosen to
be > 0. The penalty on the control effort also improves the
robustness of the learning algorithm [12].
Consider the augmented system

[ e_{k+1} ; S u_{k+1} ] = [ w ; 0 ] + [ v_{k+1} ; 0 ] − [ G ; −S ] u_{k+1}   (28)

ē_{k+1} = w̄ + v̄_{k+1} − Ḡ u_{k+1}   (29)

The cost function therefore becomes J = E[ ē_{k+1}^T ē_{k+1} ] = trace( E[ ē_{k+1} ē_{k+1}^T ] ) = trace(Φ̄_{k+1}). Further, we have

V̄ = [ V  0 ; 0  0 ],   Φ̄_k = [ Φ_{ee,k}  Φ_{eu,k} ; Φ_{ue,k}  Φ_{uu,k} ]

and the learning update u_{k+1} = u_k + L̄ [ e_k ; S u_k ].
Using the same development as in Section III, we get the
optimal learning gain to be

L̄_k = (Ḡ^T Ḡ)^{-1} Ḡ^T ( I − V̄ Φ̄_k^{-1} )   (30)
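A dimensional sketch of the augmented construction in Eqs. 28–30; the weight S = 0.2I and the block-diagonal initial Φ̄ are our toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
G = np.tril(0.3 * rng.normal(size=(N, N)), -1) + np.eye(N)  # lifted plant (toy)
S = 0.2 * np.eye(N)                                         # control-effort weight (assumed)
Gbar = np.vstack([G, -S])                                   # augmented plant, Eq. (28)
V = 0.05 * np.eye(N)
Vbar = np.zeros((2 * N, 2 * N)); Vbar[:N, :N] = V           # bar-V: V in the (1,1) block
Phibar = np.eye(2 * N)
Phibar[:N, :N] = V + np.diag(rng.uniform(1.0, 2.0, N))      # Phi_ee,0 = V + W; Phi_uu,0 = I (toy)

# Optimal augmented gain, Eq. (30)
Lbar = np.linalg.solve(Gbar.T @ Gbar, Gbar.T) @ (np.eye(2 * N) - Vbar @ np.linalg.inv(Phibar))
rho = np.linalg.norm(np.eye(N) - Lbar @ Gbar, 2)            # contraction of the input update
```

Because Ḡ stacks G on −S, Ḡ^T Ḡ = G^T G + S^T S is invertible whenever S^T S > 0, even if G itself is rank deficient, which is the point of the augmentation.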
Plugging Eq. 30 into the learning update law and consolidating the error and control effort terms, we obtain the
learning update law

u_{k+1} = (G^T G + S^T S)^{-1} [ ( G^T G + G^T V (Φ_{ee,k} − Φ_{eu,k} Φ_{uu,k}^{-1} Φ_{ue,k})^{-1} Φ_{eu,k} Φ_{uu,k}^{-1} S ) u_k
          + G^T ( I − V (Φ_{ee,k} − Φ_{eu,k} Φ_{uu,k}^{-1} Φ_{ue,k})^{-1} ) e_k ]   (31)
It is interesting to note that now we have a Q-filter
effect because of the penalty on control effort. Further, the
convergence rate is

ρ_k = σ̄( (G^T G + S^T S)^{-1} Ḡ^T V̄ Φ̄_k^{-1} Ḡ ) < 1   (32)
Fig. 2.
Plot of Reference Trajectory for µ-RD system.
VI. SIMULATION RESULTS
We now demonstrate the performance of the proposed
method on a simulated model of a Micro Robotic Deposition
(µ-RD) system. A detailed description of the modeling and
control of this system may be found in [13]. The closed-loop
system model for the y-axis of this system is given by

P(z) = 0.0459 · [ (z + 0.9963)(z^2 − 1.768z + 0.9567)(z^2 − 0.2238z + 0.7933) ] / [ (z − 1)(z − 0.9772)(z^2 − 1.764z + 0.9562)(z^2 − 0.1784z + 0.7898) ]
The µ-RD system is run (experimentally) for M = 50 iterations with u_{f,0} ≡ 0. The reference trajectory is a constant-velocity scan as shown in Figure 2. The covariances of the
nonrepetitive (V) and repetitive (W) components of the error
e_j^0 are determined by

P_vv(jω) = |V(jω)|^2 = (1/M) Σ_{j=1}^{M} | F( e_j^0 − E[e^0] ) |^2   (34)

W = L(P_ww(jω)),   V = L(P_vv(jω)),   Φ_0 = W + V,
where F is the Discrete-time Fourier transform operator and
L is the lifting operator introduced in Section II.
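Eqs. 33–34 can be mirrored with a simple Monte-Carlo estimate; the sketch below invents synthetic error data (a sinusoidal repetitive part plus white noise, both our choices) and uses the N-point FFT in place of the DTFT:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 50, 128
t = np.arange(N)
repetitive = np.sin(2 * np.pi * 3 * t / N)                 # identical in every trial
errors = repetitive + 0.1 * rng.normal(size=(M, N))        # e_j^0, j = 1..M

e_mean = errors.mean(axis=0)                               # E[e^0]
Pww = np.abs(np.fft.fft(e_mean)) ** 2                      # |F(E[e^0])|^2
Pvv = np.mean(np.abs(np.fft.fft(errors - e_mean, axis=1)) ** 2, axis=0)
```

Averaging over trials separates the line spectrum of the repetitive component (a sharp peak at the sinusoid's bin) from the broadband floor of the nonrepetitive component.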
Figure 3 shows the experimentally obtained V(jω) and
W(jω). We fit transfer functions through these, given by

V′(jω) = 7.926 × 10^{-5} / ( e^{jω} − 0.9921 )

W′(jω) = ( 2.864 × 10^{-4} e^{jω} + 2.832 × 10^{-4} ) / ( e^{2jω} − 1.973 e^{jω} + 0.9732 )
E[e^0] = (1/M) Σ_{j=1}^{M} e_j^0,   P_ww(jω) = |W(jω)|^2 = | F(E[e^0]) |^2   (33)

Fig. 3. Frequency Spectra of Repetitive (W(jω)) and Nonrepetitive (V(jω)) Components of the Tracking Error.

A. Fixed-Structure Learning Rate

In this section, we consider two algorithms with decaying
learning rates. The first algorithm is discussed in [10] and is
a geometrically decaying algorithm

u_{k+1} = u_k + ( 1/(k+1) ) L e_k,   (35)

where L = (G^T G)^{-1} G^T.
The second algorithm uses the same learning structure L,
but uses the method proposed in Section IV to determine the
learning rate.
u_{k+1} = u_k + ( 1 − trace(V) / (e_k^T e_k) ) L e_k   (36)
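A scalar toy comparison of the two schedules (G = 1, repetitive error w = 1, noise standard deviation 0.1, and a max(0, ·) clamp on the Eq. 36 rate are all our choices):

```python
import numpy as np

rng = np.random.default_rng(4)
w, sigma, K = 1.0, 0.1, 200
trV = sigma ** 2                       # trace(V) for the scalar case

def run(rate):
    u, sq_err = 0.0, []
    for k in range(K):
        e = w + sigma * rng.normal() - u   # Eq. (9), scalar version
        u = u + rate(k, e) * e
        sq_err.append(e * e)
    return sq_err

geo = run(lambda k, e: 1.0 / (k + 1))                           # Eq. (35)
opt = run(lambda k, e: max(0.0, 1.0 - trV / (e * e + 1e-12)))   # Eq. (36), clamped
```

Both rates decay to zero, so in steady state the squared error settles near the noise floor for both schedules; the difference between them shows up in the transient.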
Figure 4 shows the rate of learning against iteration number.
Note that both algorithms decay to zero learning as k → ∞.
Therefore both methods have the same performance against
nonrepetitive disturbances in steady state. However, the proposed method has a faster rate of decay. As a result, the
transient performance is better. Figure 5 shows the plot of
error 2-norm over iterations. We notice that as predicted the
transient response of the optimal learning algorithm is better
than that of the heuristic geometrically decaying learning
algorithm. However, the steady state performance of both
algorithms is the same.
Fig. 4. Convergence rate ( 1 - learning rate) for the trajectory following
error vs. number of trials k for the fixed structure learning algorithms
(Optimal Rate and Geometrically Decaying Rate).
We compare the performance of the above fixed-rate
optimal learning algorithm to the stochastic optimal ILC algorithm of Section III:

L_k = (G^T G)^{-1} G^T ( I − V Φ_k^{-1} )   (39)

Φ_{k+1} = (I − G L_k) Φ_k (I − G L_k)^T + G L_k V + V L_k^T G^T   (40)

u_{k+1} = u_k + L_k e_k,   (41)

where Φ_0 = W + V.
Figure 6 shows the plot of the 2-norm of error against
iteration number for the above two schemes. η was chosen as
0.3 to match the initial learning rate of both algorithms. We
see that the performance of the two algorithms is comparable
in the initial iterations. However, in the limit as iterations
go by, the stochastic optimal ILC algorithm has a smaller
error norm. This is because there is no residual learning for
nonrepetitive disturbances that dominate in later iterations.
At the same time, there is more computational burden at
each iteration in recomputing the learning matrix Lk .
Fig. 6. 2-norm of the trajectory following error ek vs. number of trials
k for the fixed rate optimal learning algorithm and the proposed stochastic
optimal ILC algorithm.
Fig. 5. 2-norm of the trajectory following error ek vs. number of trials k
for the fixed structure learning algorithms (Optimal Rate and Geometrically
Decaying Rate).
B. Stochastic Optimal Learning
The fixed structure learning rate algorithms do not take
into account the frequency distribution of the nonrepetitive
and repetitive components of the error signal. In this section,
we compare two algorithms: (a) fixed learning rate algorithm
proposed in [9], and (b) optimal iteration-varying learning
rate algorithm proposed in Section III.
The lifted version of the fixed-rate learning algorithm in
[9], for a maximum rate of convergence η, is given by

L = Π_G ( (1 − η) W ) ( (1 + η) W + η (1 − η) V )^{-1}   (37)

u_{k+1} = u_k + L e_k   (38)
The attractive feature of this algorithm is that it explicitly
accounts for the relative magnitudes of nonrepetitive and
repetitive disturbances in different frequency bands. However, since it does not have a decaying learning rate, there
is always some accumulation of nonrepetitive disturbances.
C. Discussion
Based on the simulation results presented above, we make
the following observations.
Remark 1 The geometrically decaying algorithm is attractive from an implementation point of view because of
its simplicity. Further, since the rate of learning decays to
zero, there is no residual impact of nonrepetitive disturbances
from earlier iterations. At the same time, in comparison to
the optimal learning rate algorithm the transient performance
of the geometrically decaying algorithm is a little worse.
Further, the fixed decay rate means that there is no resetting
of the learning in case of an unanticipated change. By using
the error-norm in the learning rate decay term in Equation
36, a sudden increase in error norm re-triggers the learning
algorithm. Finally, the frequency spectra of the repetitive and
nonrepetitive disturbances are not taken into account by either
of these algorithms, leading to suboptimal performance.
Remark 2 The optimal fixed-rate learning algorithm proposed in [9] uses the frequency spectra of the nonrepetitive
and repetitive components of the error and performs very
well in the initial iterations of the process, when the repetitive
and nonrepetitive disturbance spectra match well with that
of the first iteration. This can be seen clearly in Figure 6.
However, the fixed learning rate forces us to have some
accumulation of error caused by nonrepetitive components,
as can be seen in Figure 6. In contrast, the varying learning
rate algorithm with a decaying rate of learning (proposed in
Section III) does not cause accumulation of the nonrepeating
error. This performance benefit, however, comes at the cost
of substantially more computation (forward propagation of
the covariance matrices).
VII. C ONCLUSIONS
Performance of ILC algorithms is degraded in the presence
of nonrepetitive components in the error signal. With a faster
rate of learning, the effect of nonrepetitive disturbances is
amplified. At the same time, in order to remove repetitive
disturbances, the rate of learning should be high. Therefore,
there exists a clear tradeoff between amplifying nonrepetitive
disturbances and attenuating repetitive disturbances. This paper posed this trade-off from a lifted-domain ILC framework
perspective and provided expressions for optimal learning
rates. The optimal learning rate was found to be dependent on
the relative magnitudes of the nonrepetitive error to the total
error at each iteration. As the nonrepetitive and total error
sizes become comparable, the rate of learning decreases. As a
result, nonrepetitive disturbances are not amplified in steady
state. A comparison with a geometrically decaying algorithm
showed that the steady state error performance is the same,
however the transient performance is poorer as compared
to the optimized learning rate algorithm. The tradeoff here
is that the geometrically decaying algorithm does not require heavy computation in between iterations. Therefore,
in applications where between-iteration times are small,
the geometrically decaying algorithm provides an attractive
solution at the cost of marginal loss in transient performance.
The proposed optimal method, on the other hand, provides
better performance in cases where the frequency spectra of
the nonrepetitive and repetitive disturbances overlap.
A central drawback of the method proposed in this paper is
the need for forward propagation of the covariance matrices.
In order to overcome this, an online estimation method for
these matrices can be investigated. This online estimation
may be done through processing of the error information
along the iteration. At the same time, frequency domain
equivalents for the learning algorithms derived in this paper will provide elegant and computationally advantageous
implementation. We aim to address these issues in the future.
R EFERENCES
[3] H.-S. Ahn, Y. Chen, and K. Moore, “Iterative learning control: Brief
survey and categorization,” Systems, Man, and Cybernetics, Part C:
Applications and Reviews, IEEE Transactions on, vol. 37, no. 6, pp.
1099–1121, Nov. 2007.
[4] S. Saab, “Stochastic p-type, d-type iterative learning control algorithms,” International Journal of Control, vol. 76, pp. 139–148,
2003.
[5] Y. Chen and K. Moore, “Harnessing the nonrepetitiveness in iterative
learning control,” Decision and Control, 2002, Proceedings of the 41st
IEEE Conference on, vol. 3, pp. 3350–3355 vol.3, Dec. 2002.
[6] S. Arimoto, “Robustness of learning control for robot manipulators,”
in Proc. of the 1990 IEEE Int. Conf. on Robotics and Automation,
Cincinnati, Ohio, USA, 1990, pp. 1528–1533.
[7] G. Heinzinger, D. Fenwick, B. Paden, and F. Miyazaki, “Robust
learning control,” in Proc. of the 28-th IEEE Conf. on Decision and
Control, Tampa, FL, USA, Dec. 1989, pp. 436–440.
[8] B. Helfrich, C. Lee, D. Bristow, X. Xiao, J. Dong, A. Alleyne,
S. Salapaka, and P. Ferreira, “Combined h-infinity feedback and
iterative learning control design with application to nanopositioning
systems,” in American Control Conference, 2008, June 2008, pp.
3893–3900.
[9] D. Bristow, “Frequency domain analysis and design of iterative
learning control for systems with stochastic disturbances,” in American
Control Conference, 2008, June 2008, pp. 3901–3907.
[10] M. Butcher, A. Karimi, and R. Longchamp, “A statistical analysis of
certain iterative learning control algorithms,” International Journal of
Control.
[11] K. Tao, R. Kosut, and G. Aral, “Learning feedforward control,” in
American Control Conference, 1994, vol. 3, June-1 July 1994, pp.
2575–2579.
[12] K. Barton, J. van de Wijdeven, A. Alleyne, O. Bosgra, and M. Steinbuch, “Norm optimal cross-coupled iterative learning control,” in
Decision and Control, 2008. CDC 2008. 47th IEEE Conference on,
Dec. 2008, pp. 3020–3025.
[13] D. Bristow and A. Alleyne, “A high precision motion control system
with application to microscale robotic deposition,” Control Systems
Technology, IEEE Transactions on, vol. 14, no. 6, pp. 1008–1020,
Nov. 2006.
APPENDIX

Claim: trace(Φ_{k+1}) ≤ trace(Φ_k), i.e., J_{k+1} ≤ J_k.

Proof: Plugging in the optimal L_k from Eq. 22, for which G L_k = Π_G ( I − V Φ_k^{-1} ) with the projection Π_G = G (G^T G)^{-1} G^T, into Eq. 20, we get

Φ_{k+1} = Φ_k + Π_G ( I − V Φ_k^{-1} ) (V − Φ_k) + (V − Φ_k) ( I − V Φ_k^{-1} )^T Π_G^T + Π_G ( I − V Φ_k^{-1} ) Φ_k ( I − V Φ_k^{-1} )^T Π_G^T   (42)

Further,

0 ≤ trace( Π_G ( I − V Φ_k^{-1} ) Φ_k ( I − V Φ_k^{-1} )^T Π_G^T )
  = trace( Π_G^T Π_G ( I − V Φ_k^{-1} ) Φ_k ( I − V Φ_k^{-1} )^T )
  = −trace( Π_G ( I − V Φ_k^{-1} ) (V − Φ_k) ).

Using the above, we have

trace(Φ_{k+1}) = trace(Φ_k) − trace( Π_G ( I − V Φ_k^{-1} ) Φ_k ( I − V Φ_k^{-1} )^T Π_G^T ) ≤ trace(Φ_k).

Hence, J_{k+1} ≤ J_k.
[1] S. Arimoto, S. Kawamura, and F. Miyazaki, “Bettering operation of
robots by learning,” J. of Robotic Systems, vol. 1, no. 2, pp. 123–140,
1984.
[2] D. Bristow, M. Tharayil, and A. Alleyne, “A survey of iterative
learning control,” Control Systems Magazine, IEEE, vol. 26, no. 3,
pp. 96–114, June 2006.