Int. Journal of Math. Analysis, Vol. 3, 2009, no. 28, 1369-1387

Equality-Inequality Mixed Constraints in Optimal Control

Javier F. Rosenblueth
Universidad Nacional Autónoma de México
IIMAS-UNAM, Apartado Postal 20-726, México DF 01000
[email protected]

Abstract

In this paper we consider an optimal control problem involving equality and/or inequality state-control (mixed) constraints, with fixed initial and final state endpoints. The control and state variables belong to the classes of piecewise continuous and piecewise smooth functions, respectively. The main objective of the paper is to provide a direct, variational derivation of second order necessary conditions which enlarges a well-known set of "differentially admissible variations" on which a certain quadratic form is nonnegative. For that set, the active inequality constraints are treated as equalities, producing a restrictive set of optimality conditions. For the set proposed in this paper, defined under piecewise constancy of the set of active indexes and under a certain normality condition, the inequality constraints play a fundamental role, and the former set of admissible variations is successfully enlarged.

Mathematics Subject Classification: 49K15

Keywords: Optimal control, second order conditions, equality and/or inequality constraints, normality

1 Introduction

In this paper we shall consider the following optimal control problem. Suppose we are given an interval T := [t0, t1] in R, two points ξ0, ξ1 in R^n, and functions L, f and ϕ = (ϕ1, ..., ϕq) mapping T × R^n × R^m to R, R^n and R^q (q ≤ m) respectively. Let

A := {(t, x, u) ∈ T × R^n × R^m | ϕα(t, x, u) ≤ 0 (α ∈ R), ϕβ(t, x, u) = 0 (β ∈ Q)}

where R = {1, ..., r}, Q = {r + 1, ..., q} (0 ≤ r ≤ q). If r = 0 then R = ∅ and we disregard statements involving ϕα; similarly, if r = q then Q = ∅ and we disregard statements involving ϕβ.
Denote by X the space of piecewise C^1 functions mapping T to R^n and by U the space of piecewise continuous functions mapping T to R^m, set

Z := X × U,
D := {(x, u) ∈ Z | ẋ(t) = f(t, x(t), u(t)) (t ∈ T)},
Ze(A) := {(x, u) ∈ D | (t, x(t), u(t)) ∈ A (t ∈ T), x(t0) = ξ0, x(t1) = ξ1},

and consider the functional I: Z → R given by

I(x, u) := ∫_{t0}^{t1} L(t, x(t), u(t)) dt   ((x, u) ∈ Z).

The problem we shall be concerned with, which we label (P), is that of minimizing I over Ze(A). A common and concise way of formulating this problem is as follows:

Minimize I(x, u) = ∫_{t0}^{t1} L(t, x(t), u(t)) dt subject to
a. x: T → R^n piecewise C^1 and u: T → R^m piecewise continuous;
b. ẋ(t) = f(t, x(t), u(t)) (t ∈ T);
c. x(t0) = ξ0, x(t1) = ξ1;
d. ϕα(t, x(t), u(t)) ≤ 0 and ϕβ(t, x(t), u(t)) = 0 (α ∈ R, β ∈ Q, t ∈ T).

Elements of Z will be called processes, elements of Ze(A) admissible processes, and a process (x, u) solves (P) if (x, u) is admissible and I(x, u) ≤ I(y, v) for all admissible processes (y, v). For any (x, u) ∈ Z we use the notation (x̃(t)) to represent (t, x(t), u(t)) (similarly (x̃0(t)) represents (t, x0(t), u0(t))), and '∗' denotes transpose. We assume that L, f and ϕ are C^2 and that the q × (m + r) matrix

( ∂ϕi/∂uk   δiα ϕα )   (i = 1, ..., q; α = 1, ..., r; k = 1, ..., m)

has rank q on A (here δαα = 1 and δαβ = 0 (α ≠ β)). This condition is equivalent to the condition that, at each point (t, x, u) in A, the matrix

( ∂ϕi/∂uk )   (i = i1, ..., ip; k = 1, ..., m)

has rank p, where i1, ..., ip are the indexes i ∈ {1, ..., q} such that ϕi(t, x, u) = 0 (see [1] for details).

The theory of second order necessary conditions in optimal control has received considerable attention since the pioneering work of Hestenes [3, 4] and Warga [14].
A wide range of problems, under different assumptions, have been successfully studied (see, in particular, [2, 7, 13] and the references therein), but the problem posed above, with equality-inequality constraints on both the state (piecewise smooth) and control (piecewise continuous) functions, presents serious difficulties which render some standard techniques unusable. Some widely quoted references treat only the case of equality state-control (mixed) constraints (see, for example, [7, 13]) and, for the case involving inequality mixed constraints, only first order conditions are derived (see [3, 4, 6-8]). Second order conditions for the problem at hand can be found, for example, in [1, 12] but, as we shall explain below, the conditions obtained are expressed in terms of a set of "admissible variations" which may give little or no additional information even under strong normality assumptions.

To understand the type of necessary conditions given in those references, let us briefly recall a similar situation that occurs in the finite dimensional case (see [5] for details). Suppose we are interested in minimizing a function f: R^n → R on the set

S = {x ∈ R^n | gα(x) ≤ 0 (α ∈ A), gβ(x) = 0 (β ∈ B)}

where A = {1, ..., p}, B = {p + 1, ..., m}. The cases p = 0 and p = m are to be given the obvious interpretations. Let

F(x, λ) = f(x) + Σ_{α=1}^{m} λα gα(x)   ((x, λ) ∈ R^n × R^m)

and denote by I(x) = {α ∈ A | gα(x) = 0} the set of active indexes at x. It is well known that (under certain normality and smoothness assumptions) if x0 affords a local minimum to f on S then there exists a unique λ ∈ R^m with λα ≥ 0 (α ∈ I(x0)), λα = 0 (α ∈ A \ I(x0)), such that Fx(x0, λ) = 0. Moreover, ⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all h in the set of tangential constraints of S at x0 given by

RS(x0) = {h ∈ R^n | g′i(x0; h) = 0 (i ∈ I(x0) ∪ B)}.
A stronger set of necessary conditions states that ⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all h in the set of modified tangential constraints of S at x0 given by

R̃S(x0; λ) := {h ∈ R^n | g′α(x0; h) ≤ 0 (α ∈ I(x0), λα = 0), g′β(x0; h) = 0 (β ∈ Γ ∪ B)}

where Γ = {α ∈ A | λα > 0}. One can easily find examples for which a point x0 satisfies the first order condition for some λ ∈ R^m and, moreover, ⟨h, Fxx(x0, λ)h⟩ ≥ 0 for all h ∈ RS(x0), but ⟨h, Fxx(x0, λ)h⟩ < 0 for some h ∈ R̃S(x0; λ). In this event the former result gives no additional information, but one concludes from the latter that the point x0 does not afford a local minimum to f on S.

Now, for the optimal control problem we are dealing with, second order conditions of the first (weak) type, which can be found in the references mentioned above, are expressed in terms of a set of "admissible variations" (y, v) which satisfy the relations

i. ẏ(t) = fx(x̃0(t))y(t) + fu(x̃0(t))v(t) (t ∈ T), and y(t0) = y(t1) = 0;
ii. ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t) = 0 for all i ∈ Ia(x̃0(t)) ∪ Q, t ∈ T,

where Ia(x̃0(t)) denotes the set of active indexes at (x̃0(t)) = (t, x0(t), u0(t)) and the notations ϕix(x̃0(t)), ϕiu(x̃0(t)) represent

(∂ϕi/∂x)(t, x0(t), u0(t)),   (∂ϕi/∂u)(t, x0(t), u0(t))

respectively. A set of "modified admissible variations," on which one would expect the second (strong) type of necessary conditions to hold, corresponds to pairs (y, v) satisfying, instead of (ii) above, the relations

ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t) ≤ 0   (i ∈ Ia(x̃0(t)) with μi(t) = 0),
ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t) = 0   (j ∈ R with μj(t) > 0, or j ∈ Q),

where the multiplier μ, as in the finite dimensional case, appears in the first order conditions and is such that μα(t) ≥ 0 with μα(t) = 0 whenever ϕα(x̃0(t)) < 0 (α ∈ R, t ∈ T). Our main objective in this paper is precisely to derive second order conditions of this "strong" type for the optimal control problem posed above.
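The gap between the two finite-dimensional second order tests can be seen in a two-variable instance; the data below are illustrative and not taken from the paper.

```python
import numpy as np

# Illustrative instance (not from the paper): minimize f(x) = x2^2 - x1^2
# on S = {x in R^2 | g1(x) = x1 <= 0}.  At x0 = (0, 0) the constraint is
# active, grad f(x0) = 0, and lambda = (0,) satisfies F_x(x0, lambda) = 0.

def F_xx(lam):
    # Hessian of F(x, lam) = f(x) + lam[0] * g1(x); g1 is linear, so the
    # Hessian does not depend on lam here.
    return np.array([[-2.0, 0.0], [0.0, 2.0]])

Q = F_xx(np.array([0.0]))

# Tangential constraints R_S(x0): the active inequality is treated as an
# equality, so h1 = 0, and the quadratic form is nonnegative there:
h_weak = np.array([0.0, 1.0])
print(h_weak @ Q @ h_weak)       # 2.0: the weak test is silent

# Modified tangential constraints: since lambda_1 = 0, only h1 <= 0 is
# required, admitting directions the weak set excludes:
h_strong = np.array([-1.0, 0.0])
print(h_strong @ Q @ h_strong)   # -2.0 < 0: x0 is not a local minimum
```

Indeed, the feasible points (−t, 0), t > 0, give f = −t² < 0 = f(x0), so only the modified set detects that x0 is not a minimizer.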
2 First order conditions and strong normality

First order conditions for problem (P) are well established (see, for example, [3, 4, 6-8]) and one version, expressed in terms of a maximum principle, can be stated as follows. For all (t, x, u, p, μ, λ) in T × R^n × R^m × R^n × R^q × R let

H(t, x, u, p, μ, λ) := ⟨p, f(t, x, u)⟩ − λL(t, x, u) − ⟨μ, ϕ(t, x, u)⟩,

and denote by Uq the space of all piecewise continuous functions mapping T to R^q.

Theorem 2.1 Suppose (x0, u0) solves (P). Then there exist λ0 ≥ 0, p ∈ X, and μ ∈ Uq continuous on each interval of continuity of u0, not vanishing simultaneously on T, such that
a. μα(t) ≥ 0 with μα(t) = 0 whenever ϕα(x̃0(t)) < 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −Hx∗(x̃0(t), p(t), μ(t), λ0) and Hu(x̃0(t), p(t), μ(t), λ0) = 0 on every interval of continuity of u0;
c. H(t, x0(t), u, p(t), 0, λ0) ≤ H(x̃0(t), p(t), 0, λ0) for all (t, u) ∈ T × R^m with (t, x0(t), u) ∈ A.

Note that (a) and (c) are equivalent, respectively, to the following conditions:
a′. μα(t) ≥ 0 and μα(t)ϕα(x̃0(t)) = 0 (α ∈ R, t ∈ T);
c′. H(t, x0(t), u, p(t), μ(t), λ0) + ⟨μ(t), ϕ(t, x0(t), u)⟩ ≤ H(x̃0(t), p(t), μ(t), λ0) for all (t, u) ∈ T × R^m with (t, x0(t), u) ∈ A.

Based on this theorem, let us introduce a set M(x, u) of multipliers together with a set E whose elements have an associated nonzero cost multiplier normalized to one.

Definition 2.2 For all (x, u) ∈ Z let M(x, u) be the set of all (p, μ, λ0) ∈ X × Uq × R with λ0 + ‖p‖ ≠ 0 satisfying
a. μα(t) ≥ 0 and μα(t)ϕα(x̃(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −Hx∗(x̃(t), p(t), μ(t), λ0) (t ∈ T);
c. Hu(x̃(t), p(t), μ(t), λ0) = 0 (t ∈ T).
Denote by E the set of all (x, u, p, μ) ∈ Z × X × Uq such that (p, μ, 1) ∈ M(x, u), that is,
a. μα(t) ≥ 0 and μα(t)ϕα(x̃(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −fx∗(x̃(t))p(t) + Lx∗(x̃(t)) + ϕx∗(x̃(t))μ(t) (t ∈ T);
c. fu∗(x̃(t))p(t) = Lu∗(x̃(t)) + ϕu∗(x̃(t))μ(t) (t ∈ T).
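Conditions (a) and (c) of Definition 2.2 are pointwise in t, so for a concrete candidate process they can be probed numerically on a time grid. The following sketch is illustrative only: the problem data (scalar dynamics, one mixed constraint) and all function names are assumptions, and the adjoint equation (b) would be checked analogously from a finite difference of p.

```python
import numpy as np

def H(t, x, u, p, mu):
    # H(t, x, u, p, mu, 1) = <p, f> - L - <mu, phi>, with hypothetical data:
    L   = u[1]**2 - x**2          # running cost
    f   = u[0] + u[1]             # scalar dynamics
    phi = -x**2 - u[0]            # single mixed constraint, phi <= 0
    return p * f - L - mu * phi

def grad(fun, z, h=1e-6):
    # central finite-difference gradient of a scalar function
    z = np.asarray(z, dtype=float)
    return np.array([(fun(z + h * e) - fun(z - h * e)) / (2.0 * h)
                     for e in np.eye(len(z))])

def in_E(x, u, p, mu, ts, tol=1e-4):
    """Check (a) and (c) of Definition 2.2 along candidate arcs on ts."""
    for t in ts:
        phi = -x(t)**2 - u(t)[0]
        if mu(t) < 0.0 or abs(mu(t) * phi) > tol:   # condition (a)
            return False
        Hu = grad(lambda uu: H(t, x(t), uu, p(t), mu(t)), u(t))
        if np.linalg.norm(Hu) > tol:                # condition (c)
            return False
    return True

ts = np.linspace(0.0, np.pi, 50)
print(in_E(lambda t: 0.0, lambda t: np.zeros(2),
           lambda t: 0.0, lambda t: 0.0, ts))       # True
```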
The notion of "strong normality," as defined below, is introduced to ensure that, if (p, μ, λ0) is a triple of multipliers corresponding to a strongly normal solution to the problem, then λ0 > 0 and, when λ0 is normalized to 1, the pair (p, μ) is unique.

Definition 2.3 An admissible process (x, u) will be said to be "strongly normal" if, given p ∈ X and μ ∈ Uq satisfying
i. μα(t)ϕα(x̃(t)) = 0 (α ∈ R, t ∈ T);
ii. ṗ(t) = −fx∗(x̃(t))p(t) + ϕx∗(x̃(t))μ(t) [= −Hx∗(x̃(t), p(t), μ(t), 0)] (t ∈ T);
iii. 0 = fu∗(x̃(t))p(t) − ϕu∗(x̃(t))μ(t) [= Hu∗(x̃(t), p(t), μ(t), 0)] (t ∈ T),
then p ≡ 0. In this event, clearly, also μ ≡ 0.

Theorem 2.4 If (x0, u0) solves (P) then M(x0, u0) ≠ ∅. If also (x0, u0) is strongly normal then there exists a unique (p, μ) ∈ X × Uq such that (x0, u0, p, μ) ∈ E.

Proof: Let (x0, u0) solve (P). By Theorem 2.1 there exists (p, μ, λ0) ∈ M(x0, u0). Suppose (x0, u0) is strongly normal. Clearly we have λ0 ≠ 0 and, since (p/λ0, μ/λ0, 1) ∈ M(x0, u0), we may choose λ0 = 1. If also (q, ν, 1) ∈ M(x0, u0), then
i. [μα(t) − να(t)]ϕα(x̃0(t)) = 0 (α ∈ R, t ∈ T);
ii. ṗ(t) − q̇(t) = −fx∗(x̃0(t))[p(t) − q(t)] + ϕx∗(x̃0(t))[μ(t) − ν(t)] (t ∈ T);
iii. 0 = fu∗(x̃0(t))[p(t) − q(t)] − ϕu∗(x̃0(t))[μ(t) − ν(t)] (t ∈ T),
implying that p ≡ q and μ ≡ ν.

3 Second order necessary conditions

For any (x, u, p, μ) ∈ Z × X × Uq let

J((x, u, p, μ); (y, v)) := ∫_{t0}^{t1} 2Ω(t, y(t), v(t)) dt   ((y, v) ∈ Z)

where, for all (t, y, v) ∈ T × R^n × R^m,

2Ω(t, y, v) := −[⟨y, Hxx(t)y⟩ + 2⟨y, Hxu(t)v⟩ + ⟨v, Huu(t)v⟩]

and H(t) denotes H(x̃(t), p(t), μ(t), 1). For all (t, x, u) ∈ T × R^n × R^m define the set of active indexes at (t, x, u) as

Ia(t, x, u) := {α ∈ R | ϕα(t, x, u) = 0}.

As mentioned in the introduction, a set of weak second order conditions for problem (P) can be found in the literature.
In particular, the following result was derived in [1] by reducing the original problem to a problem involving only mixed equality constraints.

Theorem 3.1 If (x0, u0) is a strongly normal solution to (P) then there exists a unique (p, μ) ∈ X × Uq such that (x0, u0, p, μ) ∈ E. Moreover, J((x0, u0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ Z satisfying
i. ẏ(t) = fx(x̃0(t))y(t) + fu(x̃0(t))v(t) (t ∈ T), and y(t0) = y(t1) = 0;
ii. ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t) = 0 for all i ∈ Ia(x̃0(t)) ∪ Q, t ∈ T.

The same set of "admissible variations" defined by relations (i) and (ii) yields second order necessary conditions in the other references mentioned in the introduction. Those conditions are obtained in different ways and, in some cases, under different assumptions, but they are all expressed in terms of that set of variations. Let us briefly mention that the device used in [1], which consists in defining the functions

ψα(t, x, u, w) := ϕα(t, x, u) + (w^α)²   (α ∈ R),
ψβ(t, x, u, w) := ϕβ(t, x, u)   (β ∈ Q),

is also used in [12]. The purpose of this section is to enlarge that set by considering "modified admissible variations" as defined below, thus obtaining an improved set of necessary conditions. Let us point out that the underlying ideas which yield a direct approach to the derivation of second order necessary conditions have recently been used in [9, 11] for simpler problems, though the difficulties appearing in the problem we are dealing with make it a much more complicated setting.

Let us first introduce a set whose elements are embedded into a one-parameter family of admissible processes and for which the derivation of second order conditions is straightforward.

Definition 3.2 For all (x0, u0) ∈ Ze(A) and μ ∈ Uq denote by W(x0, u0, μ) the set of all (y, v) ∈ Z for which there exist δ > 0 and a one-parameter family (x(·, ε), u(·, ε)) (|ε| < δ) of processes such that
i.
(x(t, 0), u(t, 0)) = (x0(t), u0(t)) (t ∈ T);
ii. (xε(t, 0), uε(t, 0)) = (y(t), v(t)) (t ∈ T);
iii. (x(·, ε), u(·, ε)) ∈ Ze(A) (0 ≤ ε < δ);
iv. μα(t)ϕα(t, x(t, ε), u(t, ε)) = 0 (α ∈ R, t ∈ T, 0 ≤ ε < δ).

Lemma 3.3 If (x0, u0) solves (P) and there exists (p, μ) ∈ X × Uq such that (x0, u0, p, μ) ∈ E, then J((x0, u0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ W(x0, u0, μ).

Proof: Define

K(x, u) := ⟨p(t1), ξ1⟩ − ⟨p(t0), ξ0⟩ + ∫_{t0}^{t1} F(t, x(t), u(t)) dt   ((x, u) ∈ Z)

where, for all (t, x, u) ∈ T × R^n × R^m,

F(t, x, u) := L(t, x, u) − ⟨p(t), f(t, x, u)⟩ + ⟨μ(t), ϕ(t, x, u)⟩ − ⟨ṗ(t), x⟩.

Observe that F(t, x, u) = −H(t, x, u, p(t), μ(t), 1) − ⟨ṗ(t), x⟩ and, if (x, u) is an admissible process, then

K(x, u) = I(x, u) + ∫_{t0}^{t1} ⟨μ(t), ϕ(t, x(t), u(t))⟩ dt.

Let (y, v) ∈ W(x0, u0, μ) and let δ > 0 and (x(·, ε), u(·, ε)) (|ε| < δ) be as in Definition 3.2. Then g(ε) := K(x(·, ε), u(·, ε)) (|ε| < δ) satisfies

g(ε) = I(x(·, ε), u(·, ε)) ≥ I(x0, u0) = K(x0, u0) = g(0)   (0 ≤ ε < δ).

Note that

Fx(x̃0(t)) = −Hx(x̃0(t), p(t), μ(t), 1) − ṗ∗(t) = 0,
Fu(x̃0(t)) = −Hu(x̃0(t), p(t), μ(t), 1) = 0,

and therefore g′(0) = 0. Consequently

0 ≤ g″(0) = K″((x0, u0); (y, v)) = J((x0, u0, p, μ); (y, v)).

Let us now introduce a set of "modified admissible variations" which, under certain assumptions, coincides with W(x, u, μ). Given a process (x, u) let A(t) := fx(x̃(t)), B(t) := fu(x̃(t)) (t ∈ T), and consider for (y, v) ∈ Z the system

ẏ(t) = A(t)y(t) + B(t)v(t)   (t ∈ T)

which we label L(x, u).

Definition 3.4 Given (x, u) ∈ Ze(A) and μ ∈ Uq, a solution (y, v) of L(x, u) will be called a "modified differentially admissible variation" along (x, u, μ) if it satisfies
i. ϕix(x̃(t))y(t) + ϕiu(x̃(t))v(t) ≤ 0 for all i ∈ Ia(x̃(t)) with μi(t) = 0 (t ∈ T);
ii. ϕjx(x̃(t))y(t) + ϕju(x̃(t))v(t) = 0 for all j ∈ R with μj(t) > 0, or j ∈ Q (t ∈ T).
Denote by Y(x, u, μ) the set of all modified differentially admissible variations (y, v) along (x, u, μ) satisfying y(t0) = y(t1) = 0.
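Conditions (i)-(ii) of Definition 3.4 are again pointwise, so membership of a given pair (y, v) can be probed on a time grid. The sketch below checks only these pointwise conditions (that (y, v) solves the linear system L(x, u) is assumed and not re-checked); the one-constraint data and all names are illustrative assumptions.

```python
import numpy as np

def is_modified_admissible(y, v, phi_x, phi_u, mu, active, R, Q, ts,
                           tol=1e-8):
    # Pointwise test of (i)-(ii) of Definition 3.4 on the grid ts.
    for t in ts:
        for i in active(t) | Q:
            val = phi_x(i, t) @ y(t) + phi_u(i, t) @ v(t)
            if i in Q or (i in R and mu(i, t) > tol):
                if abs(val) > tol:        # condition (ii): equality
                    return False
            elif val > tol:               # condition (i): inequality
                return False
    return True

# Illustrative data: one inequality constraint (R = {1}, Q empty) along
# (x0, u0) = (0, 0), with gradients of phi1(t, x, u) = -x^2 - u1.
R, Q = {1}, set()
phi_x = lambda i, t: np.array([0.0])            # phi1_x at x0 = 0
phi_u = lambda i, t: np.array([-1.0, 0.0])      # phi1_u
mu    = lambda i, t: 0.0
act   = lambda t: {1}
y  = lambda t: np.array([np.sin(t)])
v  = lambda t: (np.array([np.cos(t), 0.0]) if t <= np.pi / 2
                else np.array([0.0, np.cos(t)]))
ts = np.linspace(0.0, np.pi, 100)
print(is_modified_admissible(y, v, phi_x, phi_u, mu, act, R, Q, ts))  # True
```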
There is a strong relation between W(x0, u0, μ) and Y(x0, u0, μ). To begin with, as we show next, W(x0, u0, μ) is a subset of Y(x0, u0, μ).

Proposition 3.5 For all (x0, u0) ∈ Ze(A) and μ ∈ Uq, W(x0, u0, μ) ⊂ Y(x0, u0, μ).

Proof: Let (y, v) ∈ W(x0, u0, μ) and let δ > 0 and (x(·, ε), u(·, ε)) (|ε| < δ) be as in Definition 3.2. By 3.2(iii) we have, for all 0 ≤ ε < δ,

ẋ(t, ε) = f(t, x(t, ε), u(t, ε)) (t ∈ T),   x(t0, ε) = ξ0,   x(t1, ε) = ξ1,

and so (y, v) solves L(x0, u0) with y(t0) = y(t1) = 0. Also by 3.2(iii) we have, for all (t, ε) ∈ T × [0, δ),

ϕα(t, x(t, ε), u(t, ε)) ≤ 0 (α ∈ R),   ϕβ(t, x(t, ε), u(t, ε)) = 0 (β ∈ Q).

Fix i ∈ R ∪ Q and t ∈ T, and set γ(ε) := ϕi(t, x(t, ε), u(t, ε)) so that

γ′(0) = ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t).

If i ∈ Ia(x̃0(t)) then γ′(0) ≤ 0 and, if μi(t) > 0 or i ∈ Q, then γ ≡ 0. Thus 3.4(i) and (ii) hold and so (y, v) ∈ Y(x0, u0, μ).

Let us now show that, under certain conditions, the two sets Y(x0, u0, μ) and W(x0, u0, μ) coincide.

Lemma 3.6 Let (x0, u0) ∈ Ze(A) and suppose Ia(x̃0(·)) is piecewise constant and there exist solutions (yi, vi) (i = 1, ..., n) of L(x0, u0) satisfying
a. yi(t0) = 0 (i = 1, ..., n);
b. |y1(t1) ··· yn(t1)| ≠ 0;
c. ϕjx(x̃0(t))yi(t) + ϕju(x̃0(t))vi(t) = 0 (j ∈ Ia(x̃0(t)) ∪ Q, i = 1, ..., n, t ∈ T).
Then Y(x0, u0, μ) = W(x0, u0, μ) for any μ ∈ Uq with μα(t) ≥ 0 and μα(t)ϕα(x̃0(t)) = 0 (α ∈ R, t ∈ T).

Proof: Let μ ∈ Uq with μα(t) ≥ 0 and μα(t)ϕα(x̃0(t)) = 0 (α ∈ R, t ∈ T). In view of Proposition 3.5 it suffices to show that Y(x0, u0, μ) ⊂ W(x0, u0, μ). Let (y, v) ∈ Y(x0, u0, μ) and let Tj (j = 1, ..., s), with cl Tj = [τj, τj+1], τ1 = t0, and τs+1 = t1, be the subintervals of T where Ia(x̃0(·)) is constant and u0, v, v1, ..., vn are continuous. For all j = 1, ...
, s and t ∈ Tj let pj be the cardinality of Ia(x̃0(t)) ∪ Q and denote by ϕ^j the function mapping Tj × R^n × R^m to R^{pj} given by ϕ^j(t, x, u) = (ϕi1(t, x, u), ..., ϕipj(t, x, u)) where Ia(x̃0(t)) ∪ Q = {i1, ..., ipj}. Set Λj(t) := ϕju(x̃0(t))ϕj∗u(x̃0(t)), and consider the n × n matrix

Cj(t) := A(t) − B(t)ϕj∗u(x̃0(t))Λj^{−1}(t)ϕjx(x̃0(t))   (t ∈ Tj).

For each i ∈ {i1, ..., ipj} let ηji be the unique solution of the system

η̇(t) = Cj(t)η(t) + B(t)ϕ∗iu(x̃0(t)) (t ∈ Tj),   η(τj) = 0,

and let ηj be the n × pj matrix (ηji1, ..., ηjipj). Let

γj(t) := ϕj∗u(x̃0(t))[I_{pj×pj} − Λj^{−1}(t)ϕjx(x̃0(t))ηj(t)]   (t ∈ Tj)

and observe that ηj(τj) = 0 and

η̇j(t) = Cj(t)ηj(t) + B(t)ϕj∗u(x̃0(t)) = A(t)ηj(t) + B(t)γj(t)   (t ∈ Tj).

For all (t, ε, α, λ) ∈ Tj × R × R^n × R^{pj} define

ū(t, ε, α, λ) := u0(t) + εv(t) + Σ_{i=1}^{n} αi vi(t) + γj(t)λ.

By the embedding theorem of differential equations, the equations

ẋ(t) = f(t, x(t), ū(t, ε, α, λ)) (t ∈ Tj),   x(τj) = x0(τj)

have, for some τ > 0, unique solutions x̄(t, ε, α, λ) (t ∈ Tj, |ε| < τ, |αi| < τ, |λk| < τ, i = 1, ..., n, k = 1, ..., pj) such that x̄(t, 0, 0, 0) = x0(t) (t ∈ T). By differentiation with respect to λ it is found that

x̄̇λ(t, 0, 0, 0) = A(t)x̄λ(t, 0, 0, 0) + B(t)γj(t) (t ∈ Tj),   x̄λ(τj, 0, 0, 0) = 0,

and therefore x̄λ(t, 0, 0, 0) = ηj(t) (t ∈ Tj). For all (t, ε, α, λ) ∈ Tj × (−τ, τ) × (−τ, τ)^n × (−τ, τ)^{pj}, j = 1, ..., s, let

hj(t, ε, α, λ) := ϕ^j(t, x̄(t, ε, α, λ), ū(t, ε, α, λ)) − ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)].

Note that hj(t, 0, 0, 0) = 0 (t ∈ Tj) and

|hjλ(t, 0, 0, 0)| = |ϕjx(x̃0(t))ηj(t) + ϕju(x̃0(t))γj(t)| = |Λj(t)| ≠ 0   (t ∈ Tj).

By the implicit function theorem there exist 0 < νj < τ and functions σj: Tj × (−νj, νj) × (−νj, νj)^n → R^{pj} such that, for all t ∈ Tj, σj(t, 0, 0) = 0, σj(t, ·, ·) is C^2 and hj(t, ε, α, σj(t, ε, α)) = 0.
Let ν := min{ν1, ..., νs} and let σ(t, ε, α) := σj(t, ε, α) (t ∈ Tj, j = 1, ..., s, |ε| < ν, |αi| < ν). Thus

ϕ^j(t, x̄(t, ε, α, σ(t, ε, α)), ū(t, ε, α, σ(t, ε, α))) − ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)] = 0

for all t ∈ T, |ε| < ν, |αi| < ν. Taking the derivative with respect to ε and αi at (ε, α) = (0, 0), and using the fact that x̄ε(t, 0, 0, 0) = y(t) and x̄αi(t, 0, 0, 0) = yi(t) (t ∈ T), we have

0 = ϕjx(x̃0(t))[y(t) + ηj(t)σε(t, 0, 0)] + ϕju(x̃0(t))[v(t) + γj(t)σε(t, 0, 0)] − ϕjx(x̃0(t))y(t) − ϕju(x̃0(t))v(t) = Λj(t)σε(t, 0, 0)

and, by assumption (c),

0 = ϕjx(x̃0(t))[yi(t) + ηj(t)σαi(t, 0, 0)] + ϕju(x̃0(t))[vi(t) + γj(t)σαi(t, 0, 0)] = Λj(t)σαi(t, 0, 0),

implying that σε(t, 0, 0) = σαi(t, 0, 0) = 0 (t ∈ T). For all t ∈ T, |ε| < ν, |αi| < ν, let

w(t, ε, α) := ū(t, ε, α, σ(t, ε, α)),   z(t, ε, α) := x̄(t, ε, α, σ(t, ε, α)),

and observe that, in view of the above relations, wε(t, 0, 0) = v(t) and wαi(t, 0, 0) = vi(t) (t ∈ T). Moreover, z(t, ε, α) is the unique solution of

ż(t) = f(t, z(t), w(t, ε, α)) (t ∈ T),   z(t0) = ξ0

satisfying z(t, 0, 0) = x0(t). Now, let S := (−ν, ν) and define g: S × S^n → R^n by g(ε, α) := z(t1, ε, α) − ξ1. Note that g(0, 0) = 0 and

|gα(0, 0)| = |M| ≠ 0

where M = (y1(t1) ··· yn(t1)). By the implicit function theorem there exist 0 < δ < ν and β: (−δ, δ) → R^n of class C^2 such that β(0) = 0 and g(ε, β(ε)) = 0 (|ε| < δ). Taking the derivative with respect to ε at ε = 0, and using y(t1) = 0, we have

0 = gε(0, 0) + gα(0, 0)β′(0) = y(t1) + Mβ′(0) = Mβ′(0),

implying that β′(0) = 0. By continuity we may choose δ > 0 so that |βi(ε)| < ν for all |ε| < δ, i = 1, ..., n. Let us now prove that the one-parameter family

x(t, ε) := z(t, ε, β(ε)),   u(t, ε) := w(t, ε, β(ε))   (t ∈ T, |ε| < δ)

has the properties required in Definition 3.2. Observe first that

xε(t, 0) = y(t) + zα(t, 0, 0)β′(0) = y(t),   uε(t, 0) = v(t) + wα(t, 0, 0)β′(0) = v(t).
Moreover, x(t1, ε) − ξ1 = z(t1, ε, β(ε)) − ξ1 = g(ε, β(ε)) = 0, so that x(·, ε) (|ε| < δ) joins the endpoints of x0. Now, for all |ε| < δ and t ∈ Tj, we have

ϕ^j(t, x(t, ε), u(t, ε)) = ε[ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t)].

By Definition 3.4, this implies that, for t ∈ T and 0 ≤ ε < δ,

ϕi(t, x(t, ε), u(t, ε)) ≤ 0 for all i ∈ Ia(x̃0(t)) with μi(t) = 0

and, for t ∈ T and |ε| < δ,

ϕj(t, x(t, ε), u(t, ε)) = 0 for all j ∈ Ia(x̃0(t)) with μj(t) > 0, or j ∈ Q.

For the case i ∉ Ia(x̃0(t)), that is, ϕi(x̃0(t)) < 0, we have μi(t) = 0 and we can diminish δ > 0, if necessary, so that ϕi(t, x(t, ε), u(t, ε)) < 0 (|ε| < δ). Consequently

(x(·, ε), u(·, ε)) ∈ Ze(A) (0 ≤ ε < δ)   and   μi(t)ϕi(t, x(t, ε), u(t, ε)) = 0 (i ∈ R, t ∈ T, 0 ≤ ε < δ).

This shows that (y, v) ∈ W(x0, u0, μ).

As we show next, the existence of n differentially admissible variations satisfying the assumptions of Lemma 3.6 is assured if the process under consideration is strongly normal.

Lemma 3.7 Let (x0, u0) ∈ Ze(A) and suppose Ia(x̃0(·)) is piecewise constant. If (x0, u0) is strongly normal then there exist solutions (yi, vi) (i = 1, ..., n) of L(x0, u0) satisfying (a)-(c) of Lemma 3.6.

Proof: Let T1, ..., Ts be the subintervals of T where Ia(x̃0(·)) is constant and, for all j = 1, ..., s and t ∈ Tj, let pj be the cardinality of Ia(x̃0(t)) ∪ Q. For all t ∈ Tj define ϕ̂(t, x, u) = (ϕi1(t, x, u), ..., ϕipj(t, x, u)) where Ia(x̃0(t)) ∪ Q = {i1, ..., ipj}. To simplify the notation, let ϕ̂(t) = ϕ̂(x̃0(t)) and similarly for ϕ. Let Z(t) ∈ R^{n×n} satisfy

Ż(t) = −Z(t)C(t) (t ∈ T),   Z(t1) = I,

where the n × n matrix C(t) is given by

C(t) := A(t) − B(t)ϕ̂∗u(t)Λ^{−1}(t)ϕ̂x(t)   (t ∈ T)

and Λ(t) = ϕ̂u(t)ϕ̂∗u(t). Note that, since Λ(t) is symmetric and Λ(t)Λ^{−1}(t) = Λ^{−1}(t)Λ(t) = I_{pj×pj} (t ∈ Tj), we have Λ^{−1}(t) = Λ^{−1}(t)∗. Denote by z1, ..., zn the row vectors of Z so that, for all t ∈ T and i = 1, ...
, n,

żi(t) = −C∗(t)zi(t) = [−A∗(t) + ϕ̂∗x(t)Λ^{−1}(t)ϕ̂u(t)B∗(t)]zi(t).

Define μ̂i(t) = (μ̂i1(t), ..., μ̂ipj(t)) by

μ̂i(t) := Λ^{−1}(t)ϕ̂u(t)B∗(t)zi(t)   (t ∈ Tj, i = 1, ..., n)

so that żi(t) = −A∗(t)zi(t) + ϕ̂∗x(t)μ̂i(t), and extend the function μ̂i to include all other indexes in R by setting μi(t) = (μi1(t), ..., μiq(t)) where

μiα(t) := μ̂ir(t) if α = ir for some r ∈ {1, ..., pj}, and μiα(t) := 0 otherwise.

Clearly we have μiα(t)ϕα(t) = 0 for all i = 1, ..., n, α ∈ R, and t ∈ T. Moreover,

ϕ̂∗u(t)μ̂i(t) = ( Σ_{k=1}^{pj} (∂ϕik/∂u1)(t)μ̂ik(t), ..., Σ_{k=1}^{pj} (∂ϕik/∂um)(t)μ̂ik(t) )∗ = ϕ∗u(t)μi(t)

and, similarly, ϕ̂∗x(t)μ̂i(t) = ϕ∗x(t)μi(t) (t ∈ T). Now, let yi be the solution of

ẏ(t) = C(t)y(t) + [B(t)B∗(t) − B(t)ϕ̂∗u(t)Λ^{−1}(t)ϕ̂u(t)B∗(t)]zi(t),   y(t0) = 0,

and set, for t ∈ T and i = 1, ..., n,

vi(t) := B∗(t)zi(t) − ϕ̂∗u(t)μ̂i(t) − ϕ̂∗u(t)Λ^{−1}(t)ϕ̂x(t)yi(t).

As one readily verifies, we have

ẏi(t) = A(t)yi(t) + B(t)vi(t) (t ∈ T),   ϕ̂x(t)yi(t) + ϕ̂u(t)vi(t) = 0 (t ∈ T),

and so the (yi, vi) are solutions of L(x0, u0) satisfying (a) and (c). It remains to show that |y1(t1) ··· yn(t1)| ≠ 0. Let wi(t) := B∗(t)zi(t) − ϕ̂∗u(t)μ̂i(t), so that wi(t) = vi(t) + ϕ̂∗u(t)Λ^{−1}(t)ϕ̂x(t)yi(t), and define

αij := ∫_{t0}^{t1} ⟨wi(t), wj(t)⟩ dt   (i, j = 1, ..., n).

Note first that the functions w1, ..., wn are linearly independent on T since, otherwise, there would exist constants a1, ..., an, not all zero, such that

0 = Σ_{i=1}^{n} ai wi(t) = Σ_{i=1}^{n} ai [B∗(t)zi(t) − ϕ̂∗u(t)μ̂i(t)]   (t ∈ T).

In this event, if μ(t) := Σ_{i=1}^{n} ai μi(t) and z(t) := Σ_{i=1}^{n} ai zi(t), then μα(t)ϕα(t) = 0 (α ∈ R, t ∈ T),

ż(t) = Σ_{i=1}^{n} ai żi(t) = [−A∗(t) + ϕ̂∗x(t)Λ^{−1}(t)ϕ̂u(t)B∗(t)]z(t) = −A∗(t)z(t) + ϕ∗x(t)μ(t),

and B∗(t)z(t) = ϕ∗u(t)μ(t) (t ∈ T), and the function z would be a nonnull solution to the system given in Definition 2.3, contradicting the strong normality of (x0, u0).
Hence the rank of the matrix (αij) is n. Now, observe that

(d/dt)⟨zi(t), yj(t)⟩ = zi∗(t)[A(t)yj(t) + B(t)vj(t)] − zi∗(t)A(t)yj(t) + μ̂i∗(t)ϕ̂x(t)yj(t)
= zi∗(t)B(t)vj(t) + zi∗(t)B(t)ϕ̂∗u(t)Λ^{−1}(t)ϕ̂x(t)yj(t)
= zi∗(t)B(t)wj(t) = [wi∗(t) + μ̂i∗(t)ϕ̂u(t)]wj(t) = ⟨wi(t), wj(t)⟩,

the last equality holding since ϕ̂u(t)wj(t) = ϕ̂u(t)B∗(t)zj(t) − Λ(t)μ̂j(t) = 0. Therefore ⟨zi(t1), yj(t1)⟩ = αij (i, j = 1, ..., n). Since the right member has rank n and Z(t1) = I, the matrix (y1(t1) ··· yn(t1)) has rank n.

In view of Theorem 2.4 and Lemmas 3.3, 3.6 and 3.7, we obtain the following necessary conditions for optimality of the strong type.

Theorem 3.8 If (x0, u0) is a strongly normal solution to (P) then there exists a unique (p, μ) ∈ X × Uq such that (x0, u0, p, μ) ∈ E. If also Ia(x̃0(·)) is piecewise constant then J((x0, u0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ Z satisfying
i. ẏ(t) = fx(x̃0(t))y(t) + fu(x̃0(t))v(t) (t ∈ T), and y(t0) = y(t1) = 0;
ii. ϕix(x̃0(t))y(t) + ϕiu(x̃0(t))v(t) ≤ 0 for all i ∈ Ia(x̃0(t)) with μi(t) = 0 (t ∈ T);
iii. ϕjx(x̃0(t))y(t) + ϕju(x̃0(t))v(t) = 0 for all j ∈ R with μj(t) > 0, or j ∈ Q (t ∈ T).

4 Examples

In this section we provide two simple examples which illustrate some important features of the theory developed in Section 3. The first shows that the modified (strong) conditions of Theorem 3.8 may give more information than the classical (weak) conditions of Theorem 3.1. The second shows that if, in Theorems 3.1 and 3.8, the strong normality assumption imposed on a solution to the problem is replaced by a weaker assumption, the conclusion of the theorems may not hold. In other words, a "weakly normal solution" to the problem (as defined below) may yield a negative second variation on the set of admissible variations (and so also on the set of modified admissible variations).
Example 4.1 Consider the problem (P) of minimizing

I(x, u) = ∫_0^π {u2²(t) − x²(t)} dt

subject to

ẋ(t) = u1(t) + u2(t),   x²(t) + u1(t) ≥ 0 (t ∈ [0, π]),   x(0) = x(π) = 0.

In this case we have T = [0, π], n = r = q = 1, m = 2, ξ0 = ξ1 = 0 and, for any t ∈ T, x ∈ R, and u ∈ R^2 with u = (u1, u2),

L(t, x, u) = u2² − x²,   f(t, x, u) = u1 + u2,   ϕ1(t, x, u) = −x² − u1.

Observe first that H(t, x, u, p, μ, 1) = p(u1 + u2) − u2² + x² + μ(x² + u1), so that

Hx = 2x(1 + μ),   Hu = (p + μ, p − 2u2),   Hxx = 2(1 + μ),   Hxu = 0,   Huu = [0, 0; 0, −2],

with the partial derivatives of H evaluated at (t, x, u, p, μ, 1). Therefore, for any (x, u, p, μ) ∈ Z × X × U1 and (y, v) ∈ Z,

J((x, u, p, μ); (y, v)) = 2 ∫_0^π {v2²(t) − (1 + μ(t))y²(t)} dt.

Consider the admissible process (x0, u0) ≡ (0, 0). Note that (x0, u0) is strongly normal since condition (iii) in Definition 2.3 corresponds to

(0, 0) = p(t)(1, 1) − μ(t)(−1, 0) = (p(t) + μ(t), p(t)).

Let (p, μ) ≡ (0, 0). Clearly (x0, u0, p, μ) belongs to E and we have Ia(t, x0(t), u0(t)) = {1}. By Theorem 3.1, if (y, v) ∈ Z is an admissible variation (in the sense of that theorem), that is,

ẏ(t) = v1(t) + v2(t),   y(0) = y(π) = 0,   0 = ϕ1x(x̃0(t))y(t) + ϕ1u(x̃0(t))v(t) = −v1(t),

then J((x0, u0, p, μ); (y, v)) ≥ 0. But the above relations imply that ẏ(t) = v2(t) and so, by Theorem 3.1,

∫_0^π {ẏ²(t) − y²(t)} dt ≥ 0

for all y ∈ X satisfying y(0) = y(π) = 0, which is a well-known fact. Thus the conclusion of the theorem gives no information with respect to (x0, u0). On the other hand, if we let

v(t) = (v1(t), v2(t)) := (cos t, 0) if t ∈ [0, π/2],   (0, cos t) if t ∈ [π/2, π],

and y(t) := sin t (t ∈ [0, π]), then (y, v) is a modified admissible variation, that is,

ẏ(t) = v1(t) + v2(t),   y(0) = y(π) = 0,   0 ≥ ϕ1x(x̃0(t))y(t) + ϕ1u(x̃0(t))v(t) = −v1(t),

but

J((x0, u0, p, μ); (y, v)) = −2 ∫_0^{π/2} sin² t dt + 2 ∫_{π/2}^{π} {cos² t − sin² t} dt = −π/2 < 0.
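The value J = −π/2 obtained above can be confirmed numerically; the following short script is only a sanity check of the computation, not part of the argument.

```python
import numpy as np

# Numerical check of J for the modified admissible variation above:
# y(t) = sin t, v(t) = (cos t, 0) on [0, pi/2] and (0, cos t) on [pi/2, pi].
t  = np.linspace(0.0, np.pi, 200001)
y  = np.sin(t)
v2 = np.where(t <= np.pi / 2, 0.0, np.cos(t))

# J((x0,u0,p,mu); (y,v)) = 2 * int_0^pi { v2^2 - (1 + mu) y^2 } dt, mu = 0.
g = v2**2 - y**2
J = 2.0 * np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t))   # trapezoidal rule
print(J)   # approximately -1.5708 = -pi/2
```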
By Theorem 3.8 we conclude that (x0, u0) ≡ (0, 0) is not a solution to the problem.

Note that the sign of μα(t) is not taken into account in the definition of strong normality. By adding that sign condition we obtain the following weaker notion of normality.

Definition 4.2 An admissible process (x, u) will be said to be "weakly normal" if, given p ∈ X and μ ∈ Uq satisfying
i. μα(t) ≥ 0 and μα(t)ϕα(x̃(t)) = 0 (α ∈ R, t ∈ T);
ii. ṗ(t) = −fx∗(x̃(t))p(t) + ϕx∗(x̃(t))μ(t) [= −Hx∗(x̃(t), p(t), μ(t), 0)] (t ∈ T);
iii. 0 = fu∗(x̃(t))p(t) − ϕu∗(x̃(t))μ(t) [= Hu∗(x̃(t), p(t), μ(t), 0)] (t ∈ T),
then p ≡ 0. In this event, clearly, also μ ≡ 0.

Observe that, if in Theorem 2.4 we replace the assumption of strong normality with that of weak normality, the result remains valid except for the uniqueness of (p, μ). The next example provides a negative second variation along a solution to the problem which is weakly but not strongly normal.

Example 4.3 Consider the problem (P) of minimizing

I(x, u) = ∫_0^1 u2(t) dt

subject to

ẋ(t) = u1²(t) + u2(t) − u3(t) (t ∈ [0, 1]),   x(0) = x(1) = 0,
u2(t) ≥ 0,   x²(t) + u3(t) ≥ 0 (t ∈ [0, 1]).

In this case we have T = [0, 1], n = 1, m = 3, r = q = 2, ξ0 = ξ1 = 0 and, for all t ∈ T, x ∈ R, and u ∈ R^3 with u = (u1, u2, u3),

L(t, x, u) = u2,   f(t, x, u) = u1² + u2 − u3,   ϕ1(t, x, u) = −u2,   ϕ2(t, x, u) = −x² − u3.

Observe first that H(t, x, u, p, μ, 1) = p(u1² + u2 − u3) − u2 + μ1u2 + μ2(x² + u3), so that

Hx(t, x, u, p, μ, 1) = 2xμ2,   Hu(t, x, u, p, μ, 1) = (2pu1, p − 1 + μ1, −p + μ2),
Hxx(t, x, u, p, μ, 1) = 2μ2,   Huu(t, x, u, p, μ, 1) = [2p, 0, 0; 0, 0, 0; 0, 0, 0].

Therefore, for any (x, u, p, μ) ∈ Z × X × U2 and (y, v) ∈ Z,

J((x, u, p, μ); (y, v)) = −2 ∫_0^1 {μ2(t)y²(t) + p(t)v1²(t)} dt.

Clearly (x0, u0) ≡ (0, 0) is a solution to the problem. It is weakly normal since, given p ∈ X and (μ1, μ2) ∈ U2 satisfying μα(t) ≥ 0 (α ∈ {1, 2}), ṗ(t) = 0, and

p(t)(0, 1, −1) − (0, −μ1(t), −μ2(t)) = 0,

then, necessarily, p ≡ 0.
Now, let μ = (μ1, μ2) ≡ (0, 1) and p ≡ 1, and note that (x0, u0, p, μ) ∈ E. Let v = (v1, v2, v3) ≡ (1, 0, 0) and y ≡ 0. Then (y, v) ∈ Z is an admissible variation in the sense that it satisfies (i) and (ii) of Theorem 3.1, given in this case by
i. ẏ(t) = v2(t) − v3(t) (t ∈ T) and y(0) = y(1) = 0;
ii. −v2(t) = 0, −v3(t) = 0 (t ∈ T).
Since (y, v) is an admissible variation, it is also a modified admissible variation, as it satisfies (i)-(iii) of Theorem 3.8. However, the conclusion of the theorems does not hold since

J((x0, u0, p, μ); (y, v)) = −2 ∫_0^1 p(t)v1²(t) dt = −2 < 0.

References

[1] de Pinho MR, Rosenblueth JF (2007) Mixed constraints in optimal control: an implicit function theorem approach, IMA Journal of Mathematical Control and Information, 24: 197-218
[2] Gilbert EG, Bernstein DS (1983) Second order necessary conditions in optimal control: accessory-problem results without normality conditions, Journal of Optimization Theory & Applications, 41: 75-106
[3] Hestenes MR (1965) On variational theory and optimal control theory, SIAM Journal on Control, 3: 23-48
[4] Hestenes MR (1966) Calculus of Variations and Optimal Control Theory, John Wiley & Sons, New York
[5] Hestenes MR (1975) Optimization Theory, The Finite Dimensional Case, John Wiley & Sons, New York
[6] Makowski K, Neustadt LW (1974) Optimal control problems with mixed control-phase variable equality and inequality constraints, SIAM Journal on Control, 12: 184-228
[7] Milyutin AA, Osmolovskiı̆ NP (1998) Calculus of Variations and Optimal Control, Translations of Mathematical Monographs 180, American Mathematical Society, Providence, Rhode Island
[8] Neustadt LW (1976) Optimization.
A Theory of Necessary Conditions, Princeton University Press, Princeton
[9] Rosenblueth JF (2007) A direct approach to second order conditions for mixed equality constraints, Journal of Mathematical Analysis & Applications, 333: 770-779
[10] Rosenblueth JF (2007) Convex cones and conjugacy for inequality control constraints, Journal of Convex Analysis, 14: 361-393
[11] Rosenblueth JF (2007) A new derivation of second order conditions for equality control constraints, Applied Mathematics Letters, 21: 910-915
[12] Russak IB (1975) Second order necessary conditions for general problems with state inequality constraints, Journal of Optimization Theory and Applications, 17: 43-92
[13] Stefani G, Zezza PL (1996) Optimality conditions for a constrained control problem, SIAM Journal on Control & Optimization, 34: 635-659
[14] Warga J (1978) A second order Lagrangian condition for restricted control problems, Journal of Optimization Theory & Applications, 24: 475-483

Received: November, 2008