
Int. Journal of Math. Analysis, Vol. 3, 2009, no. 28, 1369 - 1387
Equality-Inequality Mixed Constraints
in Optimal Control
Javier F. Rosenblueth
Universidad Nacional Autónoma de México
IIMAS–UNAM, Apartado Postal 20-726, México DF 01000
[email protected]
Abstract
In this paper we consider an optimal control problem involving equality
and/or inequality state-control (mixed) constraints, with fixed initial and
final endpoint state constraints. The control and state variables belong to
the classes of piecewise continuous and piecewise smooth functions, respectively. The main objective of the paper is to provide a direct derivation,
based on a variational approach, of second order necessary conditions which
enlarges a well-known set of “differentially admissible variations” on which
a certain quadratic form is nonnegative. For that set, the active inequality
constraints are treated as equalities, thus producing a restrictive set of
optimality conditions. For the set proposed in this paper, under piecewise
constancy of the set of active indexes and a certain normality condition, the
inequality constraints play a fundamental role in the definition, thus
successfully enlarging the former set of admissible variations.
Mathematics Subject Classification: 49K15
Keywords: Optimal control, second order conditions, equality and/or
inequality constraints, normality
1 Introduction
In this paper we shall consider the following optimal control problem. Suppose
we are given an interval T := [t0 , t1 ] in R, two points ξ0 , ξ1 in Rn , and functions
L, f and ϕ = (ϕ1 , . . . , ϕq ) mapping T × Rn × Rm to R, Rn and Rq (q ≤ m)
respectively. Let
A := {(t, x, u) ∈ T × Rn × Rm | ϕα (t, x, u) ≤ 0 (α ∈ R),
ϕβ (t, x, u) = 0 (β ∈ Q)}
where R = {1, . . . , r}, Q = {r + 1, . . . , q} (0 ≤ r ≤ q). If r = 0, then R = ∅
and we disregard statements involving ϕα . Similarly, if r = q, then Q = ∅ and
we disregard statements regarding ϕβ .
Denote by X the space of piecewise C 1 functions mapping T to Rn , by U
the space of piecewise continuous functions mapping T to Rm , set Z := X ×U,
D := {(x, u) ∈ Z | ẋ(t) = f (t, x(t), u(t)) (t ∈ T )},
Ze (A) := {(x, u) ∈ D | (t, x(t), u(t)) ∈ A (t ∈ T ), x(t0 ) = ξ0 , x(t1 ) = ξ1 },
and consider the functional I: Z → R given by
I(x, u) := ∫_{t0}^{t1} L(t, x(t), u(t)) dt ((x, u) ∈ Z).
The problem we shall be concerned with, which we label (P), is that of minimizing I over Ze (A).
A common and concise way of formulating this problem is as follows:
Minimize I(x, u) = ∫_{t0}^{t1} L(t, x(t), u(t)) dt subject to
a. x: T → Rn piecewise C¹; u: T → Rm piecewise continuous;
b. ẋ(t) = f(t, x(t), u(t)) (t ∈ T);
c. x(t0) = ξ0, x(t1) = ξ1;
d. ϕα(t, x(t), u(t)) ≤ 0 and ϕβ(t, x(t), u(t)) = 0 (α ∈ R, β ∈ Q, t ∈ T).
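Readers who wish to experiment numerically can check the ingredients of (P) on a time grid. The following sketch is purely illustrative: the forward-Euler discretization and all problem data (n = m = 1, T = [0, 1], f(t, x, u) = u, L = u², ξ0 = ξ1 = 0, a single mixed constraint u − 1 ≤ 0) are chosen here for concreteness and do not come from the paper.

```python
import math

# Illustrative, discretized instance of (P) with data of our own choosing
# (not from the paper): n = m = 1, T = [0, 1], f(t, x, u) = u,
# L(t, x, u) = u^2, xi0 = xi1 = 0, and one mixed inequality constraint
# phi1(t, x, u) = u - 1 <= 0 (so R = {1} and Q is empty).
N = 100_000
dt = 1.0 / N
u = [math.sin(2 * math.pi * (k + 0.5) * dt) for k in range(N)]  # candidate control

x, cost, feasible = 0.0, 0.0, True
for k in range(N):
    feasible = feasible and (u[k] - 1.0 <= 0.0)  # constraint (d): phi1 <= 0
    cost += u[k] ** 2 * dt                       # accumulate I(x, u)
    x += u[k] * dt                               # Euler step for (b): xdot = u

print(feasible)         # True: the mixed constraint holds on the grid
print(abs(x) < 1e-6)    # True: endpoint condition (c), x(1) = xi1 = 0
print(round(cost, 4))   # 0.5  (= int_0^1 sin^2(2 pi t) dt)
```

Any candidate process can be tested this way; grid processes for which `feasible` stays `True` and both endpoint conditions hold correspond to the admissible set Ze(A).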
Elements of Z will be called processes, of Ze (A) admissible processes, and
a process (x, u) solves (P) if (x, u) is admissible and I(x, u) ≤ I(y, v) for all
admissible processes (y, v). For any (x, u) ∈ Z we use the notation (x̃(t))
to represent (t, x(t), u(t)) (similarly (x̃0 (t)) represents (t, x0 (t), u0 (t))), and ‘∗ ’
denotes transpose. We assume that L, f and ϕ are C² and that the q × (m + r)-dimensional matrix
( ∂ϕi/∂uk   δiα ϕα )  (i = 1, . . . , q; α = 1, . . . , r; k = 1, . . . , m)
has rank q on A (here δαα = 1, δαβ = 0 (α ≠ β)). This condition is equivalent
to the condition that, at each point (t, x, u) in A, the matrix
( ∂ϕi/∂uk )  (i = i1, . . . , ip; k = 1, . . . , m)
has rank p, where i1, . . . , ip are the indexes i ∈ {1, . . . , q} such that
ϕi(t, x, u) = 0 (see [1] for details).
The theory of second order necessary conditions in optimal control has
received considerable attention since the pioneering work of Hestenes [3, 4]
and Warga [14]. A wide range of problems, under different assumptions, have
been successfully studied (see, in particular, [2, 7, 13] and references therein)
but the problem posed above with equality-inequality constraints in both the
state (piecewise smooth) and control (piecewise continuous) functions presents
serious difficulties which render some standard techniques unusable.
Some widely quoted references treat only the case of equality state-control
(mixed) constraints (see, for example, [7, 13]) and, for the case involving inequality mixed constraints, only first order conditions are derived (see [3, 4,
6–8]). Second order conditions for the problem at hand can be found, for example, in [1, 12] but, as we shall explain below, the conditions obtained are
expressed in terms of a set of “admissible variations” which may give little or
no additional information even under strong normality assumptions.
To understand the type of necessary conditions given in those references,
let us briefly recall a similar situation that occurs in the finite dimensional
case (see [5] for details). Suppose we are interested in minimizing a function
f : Rn → R on the set
S = {x ∈ Rn | gα (x) ≤ 0 (α ∈ A), gβ (x) = 0 (β ∈ B)}
where A = {1, . . . , p}, B = {p + 1, . . . , m}. The cases p = 0 and p = m are to
be given the obvious interpretations. Let
F(x, λ) = f(x) + Σ_{α=1}^{m} λα gα(x) ((x, λ) ∈ Rn × Rm)
and denote by
I(x) = {α ∈ A | gα (x) = 0}
the set of active indexes at x. It is well-known that (under certain normality
and smooth assumptions) if x0 affords a local minimum to f on S then there
exists a unique λ ∈ Rm with
λα ≥ 0 (α ∈ I(x0 )), λα = 0 (α ∈ A \ I(x0 )),
such that Fx(x0, λ) = 0. Moreover, ⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all h in the set of
tangential constraints of S at x0 given by
RS(x0) = {h ∈ Rn | g′i(x0; h) = 0 (i ∈ I(x0) ∪ B)}.
A stronger set of necessary conditions states that ⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all
h in the set of modified tangential constraints of S at x0 given by
R̃S(x0; λ) := {h ∈ Rn | g′α(x0; h) ≤ 0 (α ∈ I(x0), λα = 0),
g′β(x0; h) = 0 (β ∈ Γ ∪ B)}
where Γ = {α ∈ A | λα > 0}. One can easily find examples for which a
point x0 satisfies the first order condition for some λ ∈ Rm and, moreover,
⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all h ∈ RS(x0), but ⟨h, Fxx(x0, λ)h⟩ < 0 for some
h ∈ R̃S (x0 ; λ). In this event, the former result gives no additional information,
but one concludes from the latter one that the point x0 does not afford a local
minimum to f on S.
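A minimal finite-dimensional instance of this phenomenon (a hypothetical example of our own, not taken from [5]): minimize f(x1, x2) = −x2² subject to the single constraint g1(x) = x2 ≤ 0. At x0 = 0 the constraint is active, the first order condition forces λ = 0, and the quadratic form of F vanishes on RS(x0) = {h | h2 = 0} yet is negative on h = (0, −1), which lies in the modified set since λ1 = 0. A quick check:

```python
# Hypothetical finite-dimensional illustration (our own, not from [5]):
# minimize f(x1, x2) = -x2**2 subject to g1(x) = x2 <= 0.
# At x0 = (0, 0): g1 is active, the first order condition gives lam = 0,
# and F(x, lam) = f(x) + lam * g1(x) has Hessian Fxx = diag(0, -2) at x0.

def f(x):
    return -x[1] ** 2

def quad_form(h):
    """<h, Fxx h> with Fxx = diag(0, -2)."""
    return 0.0 * h[0] ** 2 - 2.0 * h[1] ** 2

h_weak = (1.0, 0.0)     # in R_S(x0): tangential constraint h2 = 0
h_strong = (0.0, -1.0)  # in the modified set: h2 <= 0 is allowed since lam = 0

print(quad_form(h_weak))    # 0.0  -> the weak test is inconclusive
print(quad_form(h_strong))  # -2.0 -> the strong test rules out a minimum
print(f((0.0, -0.1)) < f((0.0, 0.0)))  # True: a feasible descent direction
```

Indeed x0 = 0 is not a local minimizer, and only the modified tangential set detects this.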
Now, for the optimal control problem we are dealing with, second order
conditions of the first (weak) type, which can be found in the references mentioned above, are expressed in terms of a set of “admissible variations” (y, v)
which satisfy the relations
i. ẏ(t) = fx (x̃0 (t))y(t) + fu (x̃0 (t))v(t) (t ∈ T ), and y(t0) = y(t1) = 0;
ii. ϕix (x̃0 (t))y(t) + ϕiu (x̃0 (t))v(t) = 0 for all i ∈ Ia (x̃0 (t)) ∪ Q, t ∈ T ,
where Ia (x̃0 (t)) denotes the set of active indexes at (x̃0 (t)) = (t, x0 (t), u0(t))
and the notations ϕix(x̃0(t)), ϕiu(x̃0(t)) represent
(∂ϕi/∂x)(t, x0(t), u0(t)) and (∂ϕi/∂u)(t, x0(t), u0(t)),
respectively.
A set of “modified admissible variations” where one would expect to have
the second (strong) type of necessary conditions corresponds to pairs (y, v)
satisfying, instead of (ii) above, the relations
ϕix (x̃0 (t))y(t) + ϕiu (x̃0 (t))v(t) ≤ 0 (i ∈ Ia (x̃0 (t)) with μi (t) = 0),
ϕjx (x̃0 (t))y(t) + ϕju (x̃0 (t))v(t) = 0 (j ∈ R with μj (t) > 0, or j ∈ Q)
where the multiplier μ, as in the finite dimensional case, appears in the first order conditions and is such that μα (t) ≥ 0 with μα (t) = 0 whenever ϕα (x̃0 (t)) <
0 (α ∈ R, t ∈ T ). Our main objective in this paper is precisely to derive second
order conditions of this “strong” type for the optimal control problem posed
above.
2 First order conditions and strong normality
First order conditions for problem (P) are well established (see, for example,
[3, 4, 6–8]) and one version, expressed in terms of a maximum principle, can
be written as follows.
For all (t, x, u, p, μ, λ) in T × Rn × Rm × Rn × Rq × R let
H(t, x, u, p, μ, λ) := ⟨p, f(t, x, u)⟩ − λL(t, x, u) − ⟨μ, ϕ(t, x, u)⟩,
and denote by Uq the space of all piecewise continuous functions mapping T
to Rq .
Theorem 2.1 Suppose (x0 , u0) solves (P). Then there exist λ0 ≥ 0, p ∈ X,
and μ ∈ Uq continuous on each interval of continuity of u0 , not vanishing
simultaneously on T , such that
a. μα (t) ≥ 0 with μα (t) = 0 whenever ϕα (x̃0 (t)) < 0 (α ∈ R, t ∈ T );
b. ṗ(t) = −Hx∗ (x̃0 (t), p(t), μ(t), λ0 ) and Hu (x̃0 (t), p(t), μ(t), λ0) = 0 on every interval of continuity of u0 ;
c. H(t, x0 (t), u, p(t), 0, λ0) ≤ H(x̃0 (t), p(t), 0, λ0) for all (t, u) ∈ T × Rm
with (t, x0 (t), u) ∈ A.
Note that (a) and (c) are equivalent, respectively, to the following conditions:
a. μα (t) ≥ 0 and μα (t)ϕα (x̃0 (t)) = 0 (α ∈ R, t ∈ T );
c. H(t, x0(t), u, p(t), μ(t), λ0) + ⟨μ(t), ϕ(t, x0(t), u)⟩ ≤ H(x̃0(t), p(t), μ(t), λ0)
for all (t, u) ∈ T × Rm with (t, x0 (t), u) ∈ A.
Based on this theorem, let us introduce a set M(x, u) of multipliers together with a set E whose elements have an associated nonzero cost multiplier
normalized to one.
Definition 2.2 For all (x, u) ∈ Z let M(x, u) be the set of all (p, μ, λ0) ∈
X × Uq × R with λ0 + |p| ≠ 0 satisfying
a. μα (t) ≥ 0 and μα (t)ϕα (x̃(t)) = 0 (α ∈ R, t ∈ T );
b. ṗ(t) = −Hx∗ (x̃(t), p(t), μ(t), λ0) (t ∈ T );
c. Hu (x̃(t), p(t), μ(t), λ0) = 0 (t ∈ T ).
Denote by E the set of all (x, u, p, μ) ∈ Z × X × Uq such that (p, μ, 1) ∈
M(x, u), that is,
a. μα (t) ≥ 0 and μα (t)ϕα (x̃(t)) = 0 (α ∈ R, t ∈ T );
b. ṗ(t) = −fx∗ (x̃(t))p(t) + L∗x (x̃(t)) + ϕ∗x (x̃(t))μ(t) (t ∈ T );
c. fu∗ (x̃(t))p(t) = L∗u (x̃(t)) + ϕ∗u (x̃(t))μ(t) (t ∈ T ).
The notion of “strong normality,” as defined below, is introduced to assure
that, if (p, μ, λ0) is a triple of multipliers corresponding to a strongly normal
solution to the problem, then λ0 > 0 and, when λ0 = 1, the pair (p, μ) is
unique.
Definition 2.3 An admissible process (x, u) will be said to be “strongly
normal” if, given p ∈ X and μ ∈ Uq satisfying
i. μα (t)ϕα (x̃(t)) = 0 (α ∈ R, t ∈ T );
ii. ṗ(t) = −fx∗ (x̃(t))p(t) + ϕ∗x (x̃(t))μ(t) [ = −Hx∗ (x̃(t), p(t), μ(t), 0) ] (t ∈ T );
iii. 0 = fu∗ (x̃(t))p(t) − ϕ∗u (x̃(t))μ(t) [ = Hu∗ (x̃(t), p(t), μ(t), 0) ] (t ∈ T ),
then p ≡ 0. In this event, clearly, also μ ≡ 0.
Theorem 2.4 If (x0, u0) solves (P) then M(x0, u0) ≠ ∅. If also (x0, u0) is
strongly normal then there exists a unique (p, μ) ∈ X × Uq such that
(x0 , u0 , p, μ) ∈ E.
Proof: Let (x0, u0) solve (P). By Theorem 2.1 there exists (p, μ, λ0) ∈
M(x0, u0). Suppose (x0, u0) is strongly normal. Clearly we have λ0 ≠ 0 and,
if (q, ν, λ0) ∈ M(x0, u0), then
i. [μα(t) − να(t)]ϕα(x̃0(t)) = 0 (α ∈ R, t ∈ T);
ii. [ṗ(t) − q̇(t)] = −fx∗(x̃0(t))[p(t) − q(t)] + ϕ∗x(x̃0(t))[μ(t) − ν(t)] (t ∈ T);
iii. 0 = fu∗(x̃0(t))[p(t) − q(t)] − ϕ∗u(x̃0(t))[μ(t) − ν(t)] (t ∈ T),
implying that p ≡ q and μ ≡ ν. The result follows since
(p/λ0, μ/λ0, 1) ∈ M(x0, u0).
3 Second order necessary conditions
For any (x, u, p, μ) ∈ Z × X × Uq let
J((x, u, p, μ); (y, v)) := ∫_{t0}^{t1} 2Ω(t, y(t), v(t)) dt ((y, v) ∈ Z)
where, for all (t, y, v) ∈ T × Rn × Rm ,
2Ω(t, y, v) := −[⟨y, Hxx(t)y⟩ + 2⟨y, Hxu(t)v⟩ + ⟨v, Huu(t)v⟩]
and H(t) denotes H(x̃(t), p(t), μ(t), 1). For all (t, x, u) ∈ T × Rn × Rm define
the set of active indexes at (t, x, u) as
Ia (t, x, u) := {α ∈ R | ϕα (t, x, u) = 0}.
As mentioned in the introduction, a set of weak second order conditions for
problem (P) can be found in the literature. In particular, the following result
was derived in [1] by reducing the original problem to a problem involving
only mixed equality constraints.
Theorem 3.1 If (x0 , u0 ) is a strongly normal solution to (P) then there
exists a unique (p, μ) ∈ X × Uq such that (x0 , u0, p, μ) ∈ E. Moreover,
J((x0 , u0 , p, μ); (y, v)) ≥ 0
for all (y, v) ∈ Z satisfying
i. ẏ(t) = fx (x̃0 (t))y(t) + fu (x̃0 (t))v(t) (t ∈ T ), and y(t0 ) = y(t1 ) = 0;
ii. ϕix (x̃0 (t))y(t) + ϕiu (x̃0 (t))v(t) = 0 for all i ∈ Ia (x̃0 (t)) ∪ Q, t ∈ T .
The same set of “admissible variations” defined by relations (i) and (ii)
yields second order necessary conditions in other references mentioned in the
introduction. Those conditions are obtained in different ways and, in some
cases, under different assumptions, but they are all expressed in terms of that
set of variations. Let us briefly mention that the same device used in [1], which
consists in defining the functions
ψα(t, x, u, w) := ϕα(t, x, u) + (wα)² (α ∈ R),
ψβ(t, x, u, w) := ϕβ(t, x, u) (β ∈ Q),
is also used in [12]. The purpose of this section is to enlarge that set by
considering “modified admissible variations” as defined below, thus obtaining
an improved set of necessary conditions. Let us point out that the underlying
ideas which yield a direct approach in the derivation of second order necessary
conditions have been recently used in [9, 11] for simpler problems, though the
difficulties appearing in the problem we are dealing with make it a much more
complicated setting.
Let us first introduce a set whose elements are embedded into a one-parameter family of admissible processes and for which the derivation of second
order conditions is straightforward.
Definition 3.2 For all (x0, u0) ∈ Ze(A) and μ ∈ Uq denote by W(x0, u0, μ)
the set of all (y, v) ∈ Z for which there exist δ > 0 and a one-parameter family
(x(·, ε), u(·, ε)) (|ε| < δ) of processes such that
i. (x(t, 0), u(t, 0)) = (x0(t), u0(t)) (t ∈ T);
ii. (xε(t, 0), uε(t, 0)) = (y(t), v(t)) (t ∈ T);
iii. (x(·, ε), u(·, ε)) ∈ Ze(A) (0 ≤ ε < δ);
iv. μα(t)ϕα(t, x(t, ε), u(t, ε)) = 0 (α ∈ R, t ∈ T, 0 ≤ ε < δ).
Lemma 3.3 If (x0 , u0 ) solves (P) and there exists (p, μ) ∈ X ×Uq such that
(x0 , u0 , p, μ) ∈ E then
J((x0 , u0, p, μ); (y, v)) ≥ 0
for all (y, v) ∈ W(x0 , u0, μ).
Proof: Define
K(x, u) := ⟨p(t1), ξ1⟩ − ⟨p(t0), ξ0⟩ + ∫_{t0}^{t1} F(t, x(t), u(t)) dt ((x, u) ∈ Z)
where, for all (t, x, u) ∈ T × Rn × Rm ,
F(t, x, u) := L(t, x, u) − ⟨p(t), f(t, x, u)⟩ + ⟨μ(t), ϕ(t, x, u)⟩ − ⟨ṗ(t), x⟩.
Observe that
F(t, x, u) = −H(t, x, u, p(t), μ(t), 1) − ⟨ṗ(t), x⟩
and, if (x, u) is an admissible process, then
K(x, u) = I(x, u) + ∫_{t0}^{t1} ⟨μ(t), ϕ(t, x(t), u(t))⟩ dt.
Let (y, v) ∈ W(x0, u0, μ) and let δ > 0 and (x(·, ε), u(·, ε)) (|ε| < δ) be as in
Definition 3.2. Then
g(ε) := K(x(·, ε), u(·, ε)) (|ε| < δ)
satisfies
g(ε) = I(x(·, ε), u(·, ε)) ≥ I(x0, u0) = K(x0, u0) = g(0) (0 ≤ ε < δ).
Note that
Fx(x̃0(t)) = −Hx(x̃0(t), p(t), μ(t), 1) − ṗ∗(t) = 0,
Fu(x̃0(t)) = −Hu(x̃0(t), p(t), μ(t), 1) = 0
and therefore g′(0) = 0. Consequently
0 ≤ g″(0) = K″((x0, u0); (y, v)) = J((x0, u0, p, μ); (y, v)).
Let us now introduce a set of “modified admissible variations” which, under
certain assumptions, coincides with W(x, u, μ). Given a process (x, u) let
A(t) := fx (x̃(t)), B(t) := fu (x̃(t)) (t ∈ T ), and consider for (y, v) ∈ Z the
system
ẏ(t) = A(t)y(t) + B(t)v(t) (t ∈ T )
which we label L(x, u).
Definition 3.4 Given (x, u) ∈ Ze (A) and μ ∈ Uq , a solution (y, v) of
L(x, u) will be called a “modified differentially admissible variation” along
(x, u, μ) if it satisfies
i. ϕix (x̃(t))y(t) + ϕiu (x̃(t))v(t) ≤ 0 for all i ∈ Ia (x̃(t)) with μi (t) = 0
(t ∈ T );
ii. ϕjx (x̃(t))y(t) + ϕju (x̃(t))v(t) = 0 for all j ∈ R with μj (t) > 0, or j ∈ Q
(t ∈ T ).
Denote by Y (x, u, μ) the set of all modified differentially admissible variations
(y, v) along (x, u, μ) satisfying y(t0) = y(t1) = 0.
There is a strong relation between W(x0 , u0, μ) and Y (x0 , u0 , μ). To begin
with, as we show next, W(x0 , u0, μ) is a subset of Y (x0 , u0, μ).
Proposition 3.5 For all (x0 , u0) ∈ Ze (A) and μ ∈ Uq , W(x0 , u0, μ) ⊂
Y (x0 , u0 , μ).
Proof: Let (y, v) ∈ W(x0, u0, μ) and let δ > 0 and (x(·, ε), u(·, ε)) (|ε| < δ)
be as in Definition 3.2. By 3.2(iii) we have, for all 0 ≤ ε < δ,
ẋ(t, ε) = f(t, x(t, ε), u(t, ε)) (t ∈ T),  x(t0, ε) = ξ0,  x(t1, ε) = ξ1
and so (y, v) solves L(x0, u0) with y(t0) = y(t1) = 0. Also by 3.2(iii) we have,
for all (t, ε) ∈ T × [0, δ),
ϕα(t, x(t, ε), u(t, ε)) ≤ 0 (α ∈ R),  ϕβ(t, x(t, ε), u(t, ε)) = 0 (β ∈ Q).
Fix i ∈ R ∪ Q and t ∈ T, and set γ(ε) := ϕi(t, x(t, ε), u(t, ε)) so that
γ′(0) = ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t).
If i ∈ Ia(x̃0(t)) then γ′(0) ≤ 0 and, if μi(t) > 0 or i ∈ Q, then γ ≡ 0. Thus
3.4(i) and (ii) hold and so (y, v) ∈ Y(x0, u0, μ).
Let us now show that, under certain conditions, the two sets Y (x0 , u0 , μ)
and W(x0 , u0 , μ) coincide.
Lemma 3.6 Let (x0 , u0 ) ∈ Ze (A) and suppose Ia (x̃0 (·)) is piecewise constant and there exist (yi , vi ) (i = 1, . . . , n) solutions of L(x0 , u0) satisfying
a. yi (t0 ) = 0 (i = 1, . . . , n);
b. |y1(t1) · · · yn(t1)| ≠ 0;
c. ϕjx (x̃0 (t))yi (t) + ϕju (x̃0 (t))vi (t) = 0 (j ∈ Ia (x̃0 (t)) ∪ Q, i = 1, . . . , n,
t ∈ T ).
Then Y (x0 , u0 , μ) = W(x0 , u0 , μ) for any μ ∈ Uq with
μα (t) ≥ 0 and μα (t)ϕα (x̃0 (t)) = 0 (α ∈ R, t ∈ T ).
Proof: Let μ ∈ Uq with μα (t) ≥ 0 and μα (t)ϕα (x̃0 (t)) = 0 (α ∈ R, t ∈ T ).
In view of Proposition 3.5 it suffices to show that Y (x0 , u0 , μ) ⊂ W(x0 , u0 , μ).
Let (y, v) ∈ Y(x0, u0, μ) and let Tj (j = 1, . . . , s) with cl Tj = [τj, τj+1],
τ1 = t0, and τs+1 = t1, be the subintervals of T where Ia(x̃0(·)) is constant
and u0 , v, v1 , . . . , vn are continuous. For all j = 1, . . . , s and t ∈ Tj let pj
be the cardinality of Ia (x̃0 (t)) ∪ Q and denote by ϕj the function mapping
Tj × Rn × Rm to Rpj given by
ϕj (t, x, u) = (ϕi1 (t, x, u), . . . , ϕipj (t, x, u)) where Ia (x̃0 (t)) ∪ Q = {i1 , . . . , ipj }.
Set Λj(t) := ϕju(x̃0(t))ϕj∗u(x̃0(t)), and consider the n × n matrix
Cj(t) := A(t) − B(t)ϕj∗u(x̃0(t))Λj⁻¹(t)ϕjx(x̃0(t)) (t ∈ Tj).
For each i ∈ {i1, . . . , ipj} let ηij be the unique solution of the system
η̇(t) = Cj(t)η(t) + B(t)ϕj∗iu(x̃0(t)) (t ∈ Tj),  η(τj) = 0
and let ηj be the n × pj matrix (ηji1, . . . , ηjipj). Let
γj(t) := ϕj∗u(x̃0(t))[Ipj×pj − Λj⁻¹(t)ϕjx(x̃0(t))ηj(t)] (t ∈ Tj)
and observe that ηj(τj) = 0 and
η̇j(t) = Cj(t)ηj(t) + B(t)ϕj∗u(x̃0(t)) = A(t)ηj(t) + B(t)γj(t) (t ∈ Tj).
For all (t, ε, α, λ) ∈ Tj × R × Rn × Rpj define
ū(t, ε, α, λ) := u0(t) + εv(t) + Σ_{i=1}^{n} αi vi(t) + γj(t)λ.
By the embedding theorem of differential equations, the equations
ẋ(t) = f(t, x(t), ū(t, ε, α, λ)) (t ∈ Tj),  x(τj) = x0(τj)
have, for some τ > 0, unique solutions
x̄(t, ε, α, λ) (t ∈ Tj, |ε| < τ, |αi| < τ, |λk| < τ, i = 1, . . . , n, k = 1, . . . , pj)
such that x̄(t, 0, 0, 0) = x0(t) (t ∈ T). By differentiation with respect to λ it is
found that
x̄˙λ(t, 0, 0, 0) = A(t)x̄λ(t, 0, 0, 0) + B(t)γj(t) (t ∈ Tj),  x̄λ(τj, 0, 0, 0) = 0
and therefore x̄λ(t, 0, 0, 0) = ηj(t) (t ∈ Tj).
For all (t, ε, α, λ) ∈ Tj × (−τ, τ) × (−τ, τ)n × (−τ, τ)pj, j = 1, . . . , s, let
hj(t, ε, α, λ) := ϕj(t, x̄(t, ε, α, λ), ū(t, ε, α, λ)) − ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)].
Note that hj(t, 0, 0, 0) = 0 (t ∈ Tj) and
|hjλ(t, 0, 0, 0)| = |ϕjx(x̃0(t))ηj(t) + ϕju(x̃0(t))γj(t)| = |Λj(t)| ≠ 0 (t ∈ Tj).
By the implicit function theorem there exist 0 < νj < τ and functions
σj: Tj × (−νj, νj) × (−νj, νj)n → Rpj
such that, for all t ∈ Tj, σj(t, 0, 0) = 0, σj(t, ·, ·) is C² and
hj(t, ε, α, σj(t, ε, α)) = 0.
Let ν := min{νj}j and let
σ(t, ε, α) := σj(t, ε, α) (t ∈ Tj, j = 1, . . . , s, |ε| < ν, |αi| < ν).
Thus
ϕj(t, x̄(t, ε, α, σ(t, ε, α)), ū(t, ε, α, σ(t, ε, α))) − ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)] = 0
for all t ∈ T, |ε| < ν, |αi| < ν. Taking the derivative with respect to ε and αi
at (ε, α) = (0, 0), and using the fact that
x̄ε(t, 0, 0, 0) = y(t) and x̄αi(t, 0, 0, 0) = yi(t) (t ∈ T),
we have
0 = ϕjx(x̃0(t))[y(t) + ηj(t)σε(t, 0, 0)] + ϕju(x̃0(t))[v(t) + γj(t)σε(t, 0, 0)]
− ϕjx(x̃0(t))y(t) − ϕju(x̃0(t))v(t)
= Λj(t)σε(t, 0, 0)
and, by assumption (c),
0 = ϕjx(x̃0(t))[yi(t) + ηj(t)σαi(t, 0, 0)] + ϕju(x̃0(t))[vi(t) + γj(t)σαi(t, 0, 0)]
= Λj(t)σαi(t, 0, 0),
implying that σε(t, 0, 0) = σαi(t, 0, 0) = 0 (t ∈ T). For all t ∈ T, |ε| < ν,
|αi| < ν, let
w(t, ε, α) := ū(t, ε, α, σ(t, ε, α)),  z(t, ε, α) := x̄(t, ε, α, σ(t, ε, α))
and observe that, in view of the above relations,
wε(t, 0, 0) = v(t),  wαi(t, 0, 0) = vi(t) (t ∈ T).
Moreover, z(t, ε, α) is the unique solution of
ż(t) = f(t, z(t), w(t, ε, α)) (t ∈ T),  z(t0) = ξ0
satisfying z(t, 0, 0) = x0(t).
Now, let S := (−ν, ν) and define g: S × Sn → Rn by
g(ε, α) := z(t1, ε, α) − ξ1.
Note that
g(0, 0) = 0 and |gα(0, 0)| = |M| ≠ 0
where M = (y1(t1) · · · yn(t1)). By the implicit function theorem there exist
0 < δ < ν and β: (−δ, δ) → Rn of class C² such that β(0) = 0 and g(ε, β(ε)) = 0
(|ε| < δ). We have, taking the derivative with respect to ε at ε = 0, that
0 = gε(0, 0) + gα(0, 0)β′(0) = y(t1) + Mβ′(0) = Mβ′(0)
implying that β′(0) = 0. By continuity we may choose δ > 0 so that |βi(ε)| < ν
for all |ε| < δ, i = 1, . . . , n. Let us now prove that the one-parameter family
x(t, ε) := z(t, ε, β(ε)),  u(t, ε) := w(t, ε, β(ε)) (t ∈ T, |ε| < δ)
has the properties required in Definition 3.2. Observe first that
xε(t, 0) = y(t) + zα(t, 0, 0)β′(0) = y(t),
uε(t, 0) = v(t) + wα(t, 0, 0)β′(0) = v(t).
Moreover,
x(t1, ε) − ξ1 = z(t1, ε, β(ε)) − ξ1 = g(ε, β(ε)) = 0
so that x(·, ε) (|ε| < δ) joins the endpoints of x0. Now, for all |ε| < δ and
t ∈ Tj, we have
ϕj(t, x(t, ε), u(t, ε)) = ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)].
By Definition 3.4, this implies that, for t ∈ T and 0 ≤ ε < δ,
ϕi(t, x(t, ε), u(t, ε)) ≤ 0 for all i ∈ Ia(x̃0(t)) with μi(t) = 0
and, for t ∈ T and |ε| < δ,
ϕj(t, x(t, ε), u(t, ε)) = 0 for all j ∈ Ia(x̃0(t)) with μj(t) > 0, or j ∈ Q.
For the case i ∉ Ia(x̃0(t)), that is, ϕi(x̃0(t)) < 0, we have μi(t) = 0 and we
can diminish δ > 0, if necessary, so that ϕi(t, x(t, ε), u(t, ε)) < 0 (|ε| < δ).
Consequently (x(·, ε), u(·, ε)) ∈ Ze(A) (0 ≤ ε < δ) and
μi(t)ϕi(t, x(t, ε), u(t, ε)) = 0 (i ∈ R, t ∈ T, 0 ≤ ε < δ).
This shows that (y, v) ∈ W(x0, u0, μ).
As we show next, the existence of n differentially admissible variations
satisfying the assumptions of Lemma 3.6 is assured if the process under consideration is strongly normal.
Lemma 3.7 Let (x0 , u0 ) ∈ Ze (A) and suppose Ia (x̃0 (·)) is piecewise constant. If (x0 , u0 ) is strongly normal then there exist (yi, vi ) (i = 1, . . . , n)
solutions of L(x0 , u0 ) satisfying (a)-(c) of Lemma 3.6.
Proof: Let T1 , . . . , Ts be the subintervals of T where Ia (x̃0 (·)) is constant
and, for all j = 1, . . . , s and t ∈ Tj , let pj be the cardinality of Ia (x̃0 (t)) ∪ Q.
For all t ∈ Tj define
ϕ̂(t, x, u) = (ϕi1 (t, x, u), . . . , ϕipj (t, x, u)) where Ia (x̃0 (t)) ∪ Q = {i1 , . . . , ipj }.
To simplify the notation, let ϕ̂(t) = ϕ̂(x̃0 (t)) and similarly for ϕ. Let Z(t) ∈
Rn×n satisfy
Ż(t) = −Z(t)C(t) (t ∈ T ), Z(t1 ) = I
where the n × n matrix C(t) is given by
C(t) := A(t) − B(t)ϕ̂∗u (t)Λ−1 (t)ϕ̂x (t) (t ∈ T )
and Λ(t) = ϕ̂u (t)ϕ̂∗u (t). Note that, since
ϕ̂u (t)ϕ̂∗u (t)Λ−1 (t) = Λ−1 (t)∗ ϕ̂u (t)ϕ̂∗u (t) = Ipj ×pj
(t ∈ Tj )
we have Λ−1 (t) = Λ−1 (t)∗ . Denote by z1 , . . . , zn the row vectors of Z so that,
for all t ∈ T and i = 1, . . . , n,
żi (t) = −C ∗ (t)zi (t) = [−A∗ (t) + ϕ̂∗x (t)Λ−1 (t)ϕ̂u (t)B ∗ (t)]zi (t).
Define μ̂i(t) = (μ̂i1(t), . . . , μ̂ipj(t)) by
μ̂i(t) := Λ−1(t)ϕ̂u(t)B∗(t)zi(t) (t ∈ Tj, i = 1, . . . , n)
so that
żi(t) = −A∗(t)zi(t) + ϕ̂∗x(t)μ̂i(t),
and extend the function μ̂i to include all other indexes in R by setting μi(t) =
(μi1(t), . . . , μiq(t)) where
μiα(t) := μ̂ir(t) if α = ir (r = 1, . . . , pj), and μiα(t) := 0 otherwise.
Clearly we have μiα (t)ϕα (t) = 0 for all i = 1, . . . , n, α ∈ R, and t ∈ T .
Moreover,
ϕ̂∗u(t)μ̂i(t) = ( Σ_{k=1}^{pj} (∂ϕik/∂u1)(t) μ̂ik(t), . . . , Σ_{k=1}^{pj} (∂ϕik/∂um)(t) μ̂ik(t) )∗ = ϕ∗u(t)μi(t)
and, similarly, ϕ̂∗x (t)μ̂i (t) = ϕ∗x (t)μi (t) (t ∈ T ).
Now, let yi be the solution of
ẏ(t) = C(t)y(t) + [B(t)B ∗ (t) − B(t)ϕ̂∗u (t)Λ−1 (t)ϕ̂u (t)B ∗ (t)]zi (t),
y(t0 ) = 0
and set, for t ∈ T and i = 1, . . . , n,
vi (t) := B ∗ (t)zi (t) − ϕ̂∗u (t)μ̂i (t) − ϕ̂∗u (t)Λ−1 (t)ϕ̂x (t)yi (t).
As one readily verifies, we have
ẏi(t) = A(t)yi (t) + B(t)vi (t) (t ∈ T ),
ϕ̂x (t)yi (t) + ϕ̂u (t)vi (t) = 0 (t ∈ T )
and so (yi , vi ) are solutions of L(x0 , u0 ) satisfying (a) and (c). It remains to
show that |y1(t1) · · · yn(t1)| ≠ 0.
Let wi (t) := B ∗ (t)zi (t) − ϕ̂∗u (t)μ̂i (t) so that
wi (t) = vi (t) + ϕ̂∗u (t)Λ−1 (t)ϕ̂x (t)yi (t)
and define
αij := ∫_{t0}^{t1} ⟨wi(t), wj(t)⟩ dt (i, j = 1, . . . , n).
Note first that the functions w1, . . . , wn are linearly independent on T since,
otherwise, there would exist constants a1, . . . , an not all zero such that
0 = Σ_{i=1}^{n} ai wi(t) = Σ_{i=1}^{n} ai [B∗(t)zi(t) − ϕ̂∗u(t)μ̂i(t)] (t ∈ T).
In this event, if μ(t) := Σ_{i=1}^{n} ai μi(t) and z(t) := Σ_{i=1}^{n} ai zi(t), then
μα(t)ϕα(t) = 0 (α ∈ R, t ∈ T),
ż(t) = Σ_{i=1}^{n} ai żi(t) = [−A∗(t) + ϕ̂∗x(t)Λ−1(t)ϕ̂u(t)B∗(t)]z(t) = −A∗(t)z(t) + ϕ∗x(t)μ(t),
and B∗(t)z(t) = ϕ∗u(t)μ(t) (t ∈ T), and the function z would be a nonnull solution to the system given in Definition 2.3, contradicting the strong normality
of (x0, u0). Hence the rank of the matrix (αij) is n. Now, observe that
(d/dt)⟨zi(t), yj(t)⟩ = zi∗(t)[A(t)yj(t) + B(t)vj(t)] − zi∗(t)A(t)yj(t) + μi∗(t)ϕx(t)yj(t)
= zi∗(t)B(t)vj(t) + zi∗(t)B(t)ϕ̂∗u(t)Λ−1(t)ϕ̂x(t)yj(t)
= zi∗(t)B(t)wj(t)
= [wi∗(t) + μ̂i∗(t)ϕ̂u(t)]wj(t)
= ⟨wi(t), wj(t)⟩,
the last equality holding since
ϕ̂u (t)wj (t) = ϕ̂u (t)B ∗ (t)zj (t) − Λ(t)μ̂j (t) = 0.
Therefore ⟨zi(t1), yj(t1)⟩ = αij (i, j = 1, . . . , n). Since the right member has
rank n and Z(t1) = I, the matrix (y1(t1) · · · yn(t1)) has rank n.
In view of Theorem 2.4 and Lemmas 3.3, 3.6 and 3.7, we obtain the following
necessary conditions for optimality of the strong type.
Theorem 3.8 If (x0 , u0 ) is a strongly normal solution to (P) then there
exists a unique (p, μ) ∈ X × Uq such that (x0 , u0 , p, μ) ∈ E. If also Ia (x̃0 (·)) is
piecewise constant then
J((x0 , u0 , p, μ); (y, v)) ≥ 0
for all (y, v) ∈ Z satisfying
i. ẏ(t) = fx (x̃0 (t))y(t) + fu (x̃0 (t))v(t) (t ∈ T ), and y(t0 ) = y(t1) = 0;
ii. ϕix (x̃0 (t))y(t) + ϕiu (x̃0 (t))v(t) ≤ 0 for all i ∈ Ia (x̃0 (t)) with μi (t) = 0
(t ∈ T );
iii. ϕjx (x̃0 (t))y(t) + ϕju (x̃0 (t))v(t) = 0 for all j ∈ R with μj (t) > 0, or
j ∈ Q (t ∈ T ).
4 Examples
In this section we provide two simple examples which illustrate some important
features of the theory developed in Section 3. The first one shows that the
modified (strong) conditions of Theorem 3.8 may give more information than
the classical (weak) conditions of Theorem 3.1. The second one shows that
if, in Theorems 3.1 and 3.8, the strong normality assumption imposed on a
solution to the problem is replaced by a weaker assumption, the conclusion
of the theorems may not hold. In other words, a “weakly normal solution”
to the problem (as defined below) may yield a negative second variation on
the set of admissible variations (and so also on the set of modified admissible
variations).
Example 4.1 Consider the problem (P) of minimizing
I(x, u) = ∫_0^π {u2²(t) − x²(t)} dt
subject to
ẋ(t) = u1(t) + u2(t),  x²(t) + u1(t) ≥ 0 (t ∈ [0, π]),  x(0) = x(π) = 0.
In this case we have T = [0, π], n = r = q = 1, m = 2, ξ0 = ξ1 = 0 and, for
any t ∈ T , x ∈ R, and u ∈ R2 with u = (u1 , u2),
L(t, x, u) = u2² − x²,  f(t, x, u) = u1 + u2,  ϕ1(t, x, u) = −x² − u1.
Observe first that
H(t, x, u, p, μ, 1) = p(u1 + u2) − u2² + x² + μ(x² + u1)
so that
Hu = (p + μ, p − 2u2),  Hx = 2x(1 + μ),
Hxx = 2(1 + μ),  Hxu = 0,  Huu = diag(0, −2)
with the partial derivatives of H evaluated at (t, x, u, p, μ, 1). Therefore, for
any (x, u, p, μ) ∈ Z × X × U1 and (y, v) ∈ Z,
J((x, u, p, μ); (y, v)) = 2 ∫_0^π {v2²(t) − (1 + μ(t))y²(t)} dt.
Consider the admissible process (x0 , u0 ) ≡ (0, 0). Note that (x0 , u0 ) is strongly
normal since condition (iii) in Definition 2.3 corresponds to
(0, 0) = p(t)(1, 1) − μ(t)(−1, 0) = (p(t) + μ(t), p(t)).
Let (p, μ) ≡ (0, 0). Clearly (x0 , u0 , p, μ) belongs to E and we have
Ia (t, x0 (t), u0 (t)) = {1}.
By Theorem 3.1, if (y, v) ∈ Z is an admissible variation (in the sense of that
theorem), that is, ẏ(t) = v1 (t) + v2 (t), y(0) = y(π) = 0 and
0 = ϕ1x (x̃0 (t))y(t) + ϕ1u (x̃0 (t))v(t) = −v1 (t),
then J((x0 , u0, p, μ); (y, v)) ≥ 0. But the above relations imply that ẏ(t) =
v2 (t) and so, by Theorem 3.1,
∫_0^π {ẏ²(t) − y²(t)} dt ≥ 0
for all y ∈ X satisfying y(0) = y(π) = 0, which is a well-known fact. Thus the
conclusion of the theorem gives no information with respect to (x0 , u0).
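The well-known fact invoked here, that ∫_0^π (ẏ² − y²) dt ≥ 0 whenever y(0) = y(π) = 0 (a Wirtinger-type inequality), can be illustrated numerically on the trial functions y(t) = sin kt, for which the integral equals (k² − 1)π/2. The midpoint quadrature below is an implementation detail of our own, not part of the paper:

```python
import math

def quad(fn, a, b, n=20000):
    """Composite midpoint rule on [a, b] (our own helper, for illustration)."""
    h = (b - a) / n
    return sum(fn(a + (i + 0.5) * h) for i in range(n)) * h

# For y(t) = sin(k t) we have y(0) = y(pi) = 0 and
#   int_0^pi (y'(t)^2 - y(t)^2) dt = (k^2 - 1) * pi / 2,
# which is nonnegative for every integer k and zero exactly at k = 1,
# the borderline case y = sin t used just below in Example 4.1.
vals = {
    k: quad(lambda t: (k * math.cos(k * t)) ** 2 - math.sin(k * t) ** 2,
            0.0, math.pi)
    for k in (1, 2, 3)
}
for k, val in vals.items():
    print(k, val >= -1e-9)  # each integral is (numerically) nonnegative
```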
On the other hand, if we let
v(t) = (v1 (t), v2 (t)) :=
(cos t, 0) if t ∈ [0, π/2]
(0, cos t) if t ∈ [π/2, π]
and y(t) := sin t (t ∈ [0, π]), then (y, v) is a modified admissible variation, that
is, ẏ(t) = v1 (t) + v2 (t), y(0) = y(π) = 0 and
0 ≥ ϕ1x (x̃0 (t))y(t) + ϕ1u (x̃0 (t))v(t) = −v1 (t),
but
J((x0, u0, p, μ); (y, v)) = −2 ∫_0^{π/2} sin² t dt + 2 ∫_{π/2}^{π} {cos² t − sin² t} dt = −π/2 < 0.
By Theorem 3.8 we conclude that (x0 , u0) ≡ (0, 0) is not a solution to the
problem.
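The computation ending Example 4.1 is easy to reproduce numerically. The sketch below (again with a midpoint quadrature of our own choosing) evaluates the second variation along the modified admissible variation (y, v) and confirms that it is close to −π/2:

```python
import math

def quad(fn, a, b, n=20000):
    """Composite midpoint rule (our own helper, for illustration)."""
    h = (b - a) / n
    return sum(fn(a + (i + 0.5) * h) for i in range(n)) * h

# Modified admissible variation of Example 4.1 (with p = mu = 0):
#   y(t) = sin t,
#   v1(t) = cos t on [0, pi/2] and 0 afterwards,
#   v2(t) = 0 on [0, pi/2] and cos t afterwards,
# so that y'(t) = v1(t) + v2(t) and -v1(t) <= 0 on [0, pi].
def v2(t):
    return 0.0 if t <= math.pi / 2 else math.cos(t)

# Second variation J = 2 * int_0^pi (v2(t)^2 - y(t)^2) dt.
J = 2.0 * quad(lambda t: v2(t) ** 2 - math.sin(t) ** 2, 0.0, math.pi)
print(J < 0, abs(J + math.pi / 2) < 1e-3)  # True True: J is close to -pi/2
```

A negative value of J along a modified admissible variation is exactly what Theorem 3.8 uses to rule out (x0, u0) ≡ (0, 0) as a solution.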
Note that the sign of μα(t) is not considered in the definition of strong
normality. By adding such a condition, we obtain the following weaker notion
of normality.
Definition 4.2 An admissible process (x, u) will be said to be “weakly normal” if, given p ∈ X and μ ∈ Uq satisfying
i. μα (t) ≥ 0 and μα (t)ϕα (x̃(t)) = 0 (α ∈ R, t ∈ T );
ii. ṗ(t) = −fx∗ (x̃(t))p(t) + ϕ∗x (x̃(t))μ(t) [ = −Hx∗ (x̃(t), p(t), μ(t), 0) ] (t ∈ T );
iii. 0 = fu∗ (x̃(t))p(t) − ϕ∗u (x̃(t))μ(t) [ = Hu∗ (x̃(t), p(t), μ(t), 0) ] (t ∈ T ),
then p ≡ 0. In this event, clearly, also μ ≡ 0.
Observe that in Theorem 2.4, if we replace the assumption of strong normality with that
of weak normality, the result remains valid except for the uniqueness of (p, μ).
The next example provides a negative second variation along a solution to the
problem which is weakly but not strongly normal.
Example 4.3 Consider the problem (P) of minimizing
I(x, u) = ∫_0^1 u2(t) dt
subject to
ẋ(t) = u1²(t) + u2(t) − u3(t) (t ∈ [0, 1]),  x(0) = x(1) = 0,
u2(t) ≥ 0,  x²(t) + u3(t) ≥ 0 (t ∈ [0, 1]).
In this case we have T = [0, 1], n = 1, m = 3, r = q = 2, ξ0 = ξ1 = 0 and, for
all t ∈ T , x ∈ R, and u ∈ R3 with u = (u1 , u2, u3 ),
L(t, x, u) = u2,  f(t, x, u) = u1² + u2 − u3,
ϕ1(t, x, u) = −u2,  ϕ2(t, x, u) = −x² − u3.
Observe first that
H(t, x, u, p, μ, 1) = p(u1² + u2 − u3) − u2 + μ1u2 + μ2(x² + u3)
so that
Hu(t, x, u, p, μ, 1) = (2pu1, p − 1 + μ1, −p + μ2),
Hx(t, x, u, p, μ, 1) = 2xμ2,  Hxx(t, x, u, p, μ, 1) = 2μ2,
Hxu(t, x, u, p, μ, 1) = 0,  Huu(t, x, u, p, μ, 1) = diag(2p, 0, 0).
Therefore, for any (x, u, p, μ) ∈ Z × X × U2 and (y, v) ∈ Z,
J((x, u, p, μ); (y, v)) = −2 ∫_0^1 {μ2(t)y²(t) + p(t)v1²(t)} dt.
Clearly (x0 , u0 ) ≡ (0, 0) is a solution to the problem. It is weakly normal since,
given p ∈ X and (μ1 , μ2 ) ∈ U2 satisfying
μα (t) ≥ 0 (α ∈ {1, 2}), ṗ(t) = 0, and p(t)(0, 1, −1) − (0, −μ1 (t), −μ2 (t)) = 0
then, necessarily, p ≡ 0.
Now, let μ = (μ1 , μ2 ) ≡ (0, 1) and p ≡ 1 and note that (x0 , u0 , p, μ) ∈ E.
Let v = (v1 , v2 , v3 ) ≡ (1, 0, 0) and y ≡ 0. Then (y, v) ∈ Z is an admissible
variation in the sense that it satisfies (i) and (ii) of Theorem 3.1, given in this
case by
i. ẏ(t) = v2 (t) − v3 (t) (t ∈ T ) and y(0) = y(1) = 0;
ii. −v2 (t) = 0, −v3 (t) = 0 (t ∈ T ).
Since (y, v) is an admissible variation, it is also a modified admissible variation, as it satisfies (i)–(iii) of Theorem 3.8. However, the conclusion of the
theorems does not hold since
J((x0, u0, p, μ); (y, v)) = −2 ∫_0^1 p(t)v1²(t) dt = −2 < 0.
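The arithmetic of Example 4.3 is elementary enough to verify directly. The following sketch checks that the chosen multipliers make the stationarity conditions of Definition 2.2 hold along (x0, u0) ≡ (0, 0), and that J = −2 along the given variation (the integrand is constant, so no quadrature is needed):

```python
# Data of Example 4.3: x0 = 0, u0 = (0, 0, 0), p = 1, mu = (0, 1).
p, mu1, mu2 = 1.0, 0.0, 1.0
x, u1 = 0.0, 0.0

# Stationarity (Definition 2.2): Hu must vanish and p' = -Hx must hold.
Hu = (2 * p * u1, p - 1 + mu1, -p + mu2)
Hx = 2 * x * mu2
print(Hu, Hx)  # (0.0, 0.0, 0.0) 0.0 -> (x0, u0, p, mu) lies in E

# Second variation along y = 0, v = (1, 0, 0): the integrand of
# J = -2 * int_0^1 (mu2 * y^2 + p * v1^2) dt is constant, so the
# integral is exact.
y, v1 = 0.0, 1.0
J = -2.0 * (mu2 * y ** 2 + p * v1 ** 2)
print(J)  # -2.0
```

The negative value of J along an admissible variation shows why weak normality alone cannot support the conclusions of Theorems 3.1 and 3.8.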
References
[1] de Pinho MR, Rosenblueth JF (2007) Mixed constraints in optimal control: an implicit function theorem approach, IMA Journal of Mathematical Control and Information, 24: 197-218
[2] Gilbert EG, Bernstein DS (1983) Second order necessary conditions in
optimal control: accessory-problem results without normality conditions,
Journal of Optimization Theory & Applications, 41: 75-106
[3] Hestenes MR (1965) On variational theory and optimal control theory,
SIAM Journal on Control, 3: 23-48
[4] Hestenes MR (1966) Calculus of Variations and Optimal Control Theory,
John Wiley & Sons, New York
[5] Hestenes MR (1975) Optimization Theory, The Finite Dimensional Case,
John Wiley & Sons, New York
[6] Makowski K, Neustadt LW (1974) Optimal control problems with mixed
control-phase variable equality and inequality constraints, SIAM Journal
on Control, 12: 184-228
[7] Milyutin AA, Osmolovskii NP (1998) Calculus of Variations and Optimal
Control, Translations of Mathematical Monographs 180, American Mathematical Society, Providence, Rhode Island
[8] Neustadt LW (1976) Optimization. A Theory of Necessary Conditions,
Princeton University Press, Princeton
[9] Rosenblueth JF (2007) A direct approach to second order conditions for
mixed equality constraints, Journal of Mathematical Analysis & Applications, 333: 770-779
[10] Rosenblueth JF (2007) Convex cones and conjugacy for inequality control
constraints, Journal of Convex Analysis, 14: 361-393
[11] Rosenblueth JF (2007) A new derivation of second order conditions for
equality control constraints, Applied Mathematics Letters, 21: 910-915
[12] Russak IB (1975) Second order necessary conditions for general problems
with state inequality constraints, Journal of Optimization Theory and
Applications, 17: 43-92
[13] Stefani G, Zezza PL (1996) Optimality conditions for a constrained control
problem, SIAM Journal on Control & Optimization, 34: 635-659
[14] Warga J (1978) A second order Lagrangian condition for restricted control
problems, Journal of Optimization Theory & Applications, 24: 475-483
Received: November, 2008