Second-order Subdifferentials of $C^{1,1}$ Functions
and Optimality Conditions$^*$
Pando Gr. Georgiev, Nadia P. Zlateva
University of Sofia, Department of Mathematics and Informatics
5 James Bourchier Blvd., 1126 Sofia, Bulgaria.
Abstract. We present second-order subdifferentials of Clarke's type of $C^{1,1}$ functions, defined on separable Banach spaces with separable duals. One of them extends the generalized Hessian matrix of such functions in $\mathbb{R}^n$, considered by J.-B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen. Various properties of these subdifferentials are proved. Second-order optimality conditions (necessary, sufficient) for constrained minimization problems with $C^{1,1}$ data are obtained.
Mathematics Subject Classifications (1991): 49J52, 49K30.
Key words: second-order subdifferentials, Clarke's subdifferential, twice strict differentiability, sufficient optimality conditions, necessary optimality conditions.
$^*$This work was partially supported by the National Foundation for Scientific Investigations in Bulgaria under contract No. MM-406/1994.
0 Introduction
One of the motivations of this paper is the article of J.-B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen [8]. They defined the so-called generalized Hessian matrix of $C^{1,1}$ functions in $\mathbb{R}^n$ and investigated many of its properties, including necessary second-order optimality conditions for minimization problems with $C^{1,1}$ data.
Our goal in this paper is to define similar notions in infinite dimensional
Banach spaces and to investigate their properties.
The Rademacher theorem plays a crucial role in defining the generalized Hessian matrix of $C^{1,1}$ functions in $\mathbb{R}^n$. In infinite dimensional Banach spaces we use a weaker generalization of it, due to J.P.R. Christensen [3, Theorem 7.5], valid for $C^{1,1}$ functions defined on separable Banach spaces with separable duals, and we define an extension of the generalized Hessian matrix, called here the second-order subdifferential. Many of the properties of the generalized Hessian matrix remain valid in our setting.
Let us note that the above mentioned result of Christensen was used by L. Thibault in [16], where he extended the notion of Clarke's subdifferential to mappings acting from separable Banach spaces to reflexive spaces.
We obtain a necessary condition and sufficient conditions for constrained
minimization problems with C 1,1 data. The necessary condition cannot be
proved by the method in [8], which does not work in infinite dimensional
Banach spaces.
Our approach is based on the method described by V.M. Alekseev, V.M. Tikhomirov and S.V. Fomin [1] for obtaining necessary and sufficient conditions for constrained minimization problems defined by twice Fréchet differentiable functions.
We refer to the following (incomplete) list of publications, and the references therein, concerning the recent development of second-order derivatives and optimality conditions in non-smooth analysis: [2, 10, 11, 12, 14, 15, 17].
Some of the results in this paper are announced in [6].
1 Basic definitions and properties
Let $(E, \|\cdot\|)$ be a real separable Banach space with separable dual $(E^*, \|\cdot\|_*)$ and let $G$ be an open subset of $E$.
Consider the class $C^{1,1}(G)$ of all functions $f: G \to \mathbb{R}$ whose first Gâteaux derivatives are locally Lipschitz (then, by the mean value theorem, $f$ is strictly Fréchet differentiable on $G$). Having in mind that every separable dual space $E^*$ has the Radon-Nikodym property (see [13]), it follows from a theorem of J.P.R. Christensen [3, Theorem 7.5] that for every $f \in C^{1,1}(G)$, $f'$ is Gâteaux differentiable on a dense subset $G(f)$ of $G$; in fact $G \setminus G(f)$ is a Haar-null set (see [3]). We shall say that $f$ is twice Gâteaux differentiable on $G(f)$ and shall denote by $f''(x)$ the Gâteaux derivative of $f'$ at $x \in G(f)$.
We shall denote by $x^*[h]$ the value of the linear functional $x^* \in E^*$ at the element $h \in E$, and by $L[h_1, h_2]$ the value of the bilinear functional $L$ defined on $E \times E$ at the pair of elements $h_1, h_2 \in E$.
Let $L(E \times E)$ be the Banach space of all bilinear continuous functionals $L: E \times E \to \mathbb{R}$ with the norm
$$\|L\| = \sup_{\|h_1\| = \|h_2\| = 1} |L[h_1, h_2]|,$$
and let $L(E, E^*)$ be the Banach space of all linear continuous mappings $L: E \to E^*$ with the norm $\|L\| = \sup_{\|h\| = 1} \|L(h)\|_*$.
It is well-known that L(E ×E) and L(E, E ∗ ) are isometrically isomorphic
(see [1], section 2.2.5). So, in the sequel we shall identify L(E × E) and
L(E, E ∗ ).
In the sequel we shall suppose that the function $f$ belongs to the class $C^{1,1}(G)$.
Definition 1.1 For every $x \in G$, $h_1, h_2 \in E$ we define
$$f''(x; h_1, h_2) := \limsup_{\substack{y \to x \\ t \downarrow 0}} \frac{f'(y + t h_1)[h_2] - f'(y)[h_2]}{t},$$
$$d^2 f(x; h_1, h_2) := \limsup_{G(f) \ni z \to x} f''(z)[h_1, h_2].$$
Proposition 1.2 For every $x \in G$, $h_1, h_2 \in E$ we have
$$f''(x; h_1, h_2) = d^2 f(x; h_1, h_2).$$
Proof. From [16, Proposition 2.2] we have
$$f''(x; h_1, h_2) = \langle f'(\cdot), h_2 \rangle^0(x; h_1) = \limsup_{G(f) \ni z \to x} \langle f'(\cdot), h_2 \rangle'(z; h_1) = \limsup_{G(f) \ni z \to x} f''(z)[h_1, h_2].$$
Proposition 1.3 The function $f''(\cdot; h_1, h_2)$ is upper semicontinuous for every $h_1, h_2 \in E$ and
$$|f''(x; h_1, h_2)| \le l_x \|h_1\| \, \|h_2\|,$$
where $l_x$ is a Lipschitz constant of $f'$ on a neighbourhood of $x$.
Proof. Since $f''(x; h_1, h_2) = \langle f'(\cdot), h_2 \rangle^0(x; h_1)$, the assertion follows from the properties of the Clarke derivative (see [4, Proposition 2.1.1]).
Having in mind Proposition 1.2, we conclude that $d^2 f(x; h_1, h_2)$ has the same properties.
$L(E, E^*)$ is a conjugate space (see Holmes [7, Chapter 23B]); the $w^*$-topology of $L(E, E^*)$ is called the weak$^*$-operator topology. The predual of $L(E, E^*)$ is the linear hull $V$ of all functionals $l_{h_1,h_2} \in (L(E, E^*))^*$, $h_1, h_2 \in E$, of the form $l_{h_1,h_2}(L) := L[h_1, h_2]$, equipped with the norm of $(L(E, E^*))^*$. Then $V$ is a separable normed space, and therefore we can note the following.
Remark 1.4 Every w∗ - compact subset in L(E × E) is metrizable (see
Holmes [7, Chapter 12F]).
We have that a sequence {Ln } ⊂ L(E × E) converges in the w∗ -topology
to some L0 ∈ L(E × E) iff Ln [h1 , h2 ] → L0 [h1 , h2 ] for every h1 , h2 ∈ E.
There are two natural ways to define second-order subdifferentials of $f \in C^{1,1}$, by analogy with the first-order Clarke subdifferential (see [4, p. 27] and [4, Theorem 2.5.1]).
Definition 1.5
$$\partial_c^2 f(x) := \{ L \in L(E \times E) : L[h_1, h_2] \le f''(x; h_1, h_2), \ \forall (h_1, h_2) \in E \times E \}.$$
Definition 1.6
$$\partial^2 f(x) := \overline{\operatorname{co}}^* \Big\{ L \in L(E \times E) : L = w^*\text{-}\lim_{G(f) \ni z \to x} f''(z) \Big\}.$$
An analogue of Definition 1.6 was introduced by L. Thibault in [16] in
order to extend the notion of Clarke’s subdifferential to locally Lipschitz
mappings, acting from a separable Banach space to a reflexive space.
It is easy to see that $\partial^2 f(x) \subset \partial_c^2 f(x)$ for every $x \in G$. Indeed, let $L = w^*\text{-}\lim_{G(f) \ni z_n \to x} f''(z_n)$. Since $f''(z_n)[h_1, h_2] \le f''(z_n; h_1, h_2)$ for every $h_1, h_2 \in E$, by Proposition 1.3 we have
$$L[h_1, h_2] = \lim_{n \to \infty} f''(z_n)[h_1, h_2] \le \limsup_{n \to \infty} f''(z_n; h_1, h_2) \le f''(x; h_1, h_2).$$
Hence $L \in \partial_c^2 f(x)$, and since $\partial_c^2 f(x)$ is obviously convex and $w^*$-closed, we obtain $\partial^2 f(x) \subset \partial_c^2 f(x)$.
Also, it is easy to see that
$$d^2 f(x; h_1, h_2) = \sup \{ L[h_1, h_2] : L \in \partial^2 f(x) \}. \qquad (1.1)$$
In the case when $E = \mathbb{R}^n$, Definition 1.6 was considered and used by J.-B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen [8] ($\partial^2 f(x)$ was called there the generalized Hessian matrix). They used Rademacher's theorem (instead of Christensen's); then $f$ is twice Fréchet differentiable almost everywhere. Therefore $f''(z)$ (when it exists) is a symmetric matrix (see [1], Section 2.2.5), and hence $\partial^2 f(x)$ consists of symmetric matrices. In our infinite dimensional case we cannot say that $\partial^2 f(x)$ consists of symmetric bilinear functionals.
Note that in $\mathbb{R}^n$, $\partial_c^2 f(x)$ is in fact the plenary hull of $\partial^2 f(x)$ in the terminology of [9]. $\partial_c^2 f(x)$ can be essentially bigger than $\partial^2 f(x)$, but they coincide when $\partial^2 f(x)$ is a singleton (see Corollary 1.11).
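To make Definition 1.6 and formula (1.1) concrete, here is a small numerical sketch (our own illustration, not part of the original development) with the one-dimensional $C^{1,1}$ function $f(x) = \frac{1}{2}x|x|$: its derivative $f'(x) = |x|$ is Lipschitz but not differentiable at $0$, $f''(z) = \operatorname{sign}(z)$ for $z \ne 0$, so the limits of $f''(z)$ as $z \to 0$ are $\{-1, +1\}$ and their convex hull, $\partial^2 f(0)$, is the interval $[-1, 1]$.

```python
# Numerical sketch (illustrative example of our own choosing): the
# generalized second-order subdifferential of f(x) = 0.5 * x * |x| at 0.
# f'(x) = |x| is Lipschitz; f''(z) = sign(z) for z != 0, so the limits
# of f''(z) as z -> 0 cluster at -1 and +1, and Definition 1.6 gives
# the interval [-1, 1].

def fprime(x):
    return abs(x)

def second_diff(z, t=1e-8):
    # difference quotient (f'(z + t) - f'(z)) / t approximating f''(z)
    return (fprime(z + t) - fprime(z)) / t

# sample f'' along sequences z_n -> 0 from both sides
samples = [second_diff(s * 10.0 ** (-k)) for s in (+1.0, -1.0) for k in range(1, 6)]

lo, hi = min(samples), max(samples)
print(f"cluster values of f''(z), z -> 0: about [{lo:.6f}, {hi:.6f}]")

# (1.1): d^2 f(0; 1, 1) is the support function of the interval, i.e. its max
print("d2f(0;1,1) ~", round(hi, 6))
```

The support-function identity (1.1) is visible here: $d^2 f(0; 1, 1) = \sup\{L : L \in [-1,1]\} = 1$.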
Proposition 1.7 For every $x \in G$ the sets $\partial^2 f(x)$ and $\partial_c^2 f(x)$ are non-empty, convex and $w^*$-compact. The multivalued mappings $\partial^2 f$ and $\partial_c^2 f$ are locally norm bounded in $L(E \times E)$.
Proof. Let $x_n \in G(f)$ and $x_n \to x$ as $n \to \infty$. Since, for some $\nu$, the set $D := \{f''(x_n) : n > \nu\}$ is norm bounded in $L(E \times E)$ by a Lipschitz constant of $f'$ at $x$, we can apply the Alaoglu-Bourbaki theorem. Thus the limit of a $w^*$-convergent subsequence of $D$ belongs to $\partial^2 f(x)$. Since $\partial^2 f(x) \subset \partial_c^2 f(x)$, then $\partial_c^2 f(x) \ne \emptyset$ too. The convexity and $w^*$-closedness of $\partial^2 f$ and $\partial_c^2 f$ follow directly from the definitions. The local boundedness follows from the fact that $f'$ is locally Lipschitz.
Again by the Alaoglu-Bourbaki theorem we obtain that $\partial^2 f(x)$ and $\partial_c^2 f(x)$ are $w^*$-compact.
Definition 1.8 The function $f: G \to \mathbb{R}$ is said to be twice strictly Gâteaux differentiable at $x \in G$ if there exists $D_s^2 f(x) \in L(E \times E)$ such that
$$\lim_{\substack{y \to x \\ t \downarrow 0}} \frac{f'(y + t h_1)[h_2] - f'(y)[h_2]}{t} = D_s^2 f(x)[h_1, h_2], \quad \forall h_1, h_2 \in E.$$
The following three propositions are analogues of the corresponding ones
concerning the first-order case (see Clarke [4]).
Proposition 1.9 If $f \in C^{1,1}$ is twice strictly Gâteaux differentiable at $x \in G$, then $\partial_c^2 f(x) = \{D_s^2 f(x)\}$.
Proof. By definition, $f''(x; h_1, h_2) = D_s^2 f(x)[h_1, h_2]$, and then $L[h_1, h_2] \le D_s^2 f(x)[h_1, h_2]$ for all $h_1, h_2 \in E$ and $L \in \partial_c^2 f(x)$. Applying this inequality with $-h_1$ in place of $h_1$ gives the reverse inequality; hence $L[h_1, h_2] = D_s^2 f(x)[h_1, h_2]$ and therefore $L = D_s^2 f(x)$.
Proposition 1.10 Let $f \in C^{1,1}(G)$. Then $\partial^2 f(x)$ is a singleton if and only if $f$ is twice strictly Gâteaux differentiable at $x \in G$.
The idea of the proof is from [4].
Proof. Let $\partial^2 f(x) = \{L\}$. Then $L = w^*\text{-}\lim_{G(f) \ni z \to x} f''(z)$ and from Proposition 1.2 and (1.1),
$$f''(x; h_1, h_2) = \limsup_{G(f) \ni z \to x} f''(z)[h_1, h_2] = L[h_1, h_2].$$
We may write:
$$\liminf_{\substack{x' \to x \\ t \downarrow 0}} \frac{(f'(x' + t h_1) - f'(x'))[h_2]}{t} = -\limsup_{\substack{x' \to x \\ t \downarrow 0}} \frac{(f'(x') - f'(x' + t h_1))[h_2]}{t} =$$
$$= -\limsup_{\substack{x' \to x \\ t \downarrow 0}} \frac{(f'(x' + t h_1 - t h_1) - f'(x' + t h_1))[h_2]}{t} = -f''(x; -h_1, h_2) = -L[-h_1, h_2] = L[h_1, h_2] = f''(x; h_1, h_2),$$
whence $f$ is twice strictly Gâteaux differentiable.
The other direction follows from the inclusion $\partial_c^2 f(x) \supset \partial^2 f(x)$ and from Proposition 1.9.
By Propositions 1.9 and 1.10 we obtain immediately the following.
Corollary 1.11 For a C 1,1 function f : G → R, ∂c2 f (x) is a singleton
(x ∈ G) if and only if ∂ 2 f (x) is a singleton.
The arguments for proving the following proposition are now classical
(similar to the first order case) and the proof is omitted.
Proposition 1.12 The multivalued mappings $\partial_c^2 f$ and $\partial^2 f : (G, \|\cdot\|) \to (L(E \times E), w^*)$ are upper semicontinuous.
Let $\psi: \mathbb{R} \to G$ be an affine function and $\phi: G \to \mathbb{R}$ be a $C^{1,1}$ function. It is clear that $\phi \circ \psi \in C^{1,1}(\mathbb{R})$.
It is easy to derive, in the same way as in [8, Theorem 2.2], the following.
Proposition 1.13 For all $x_0, u, v \in \mathbb{R}$
$$(\phi \circ \psi)''(x_0; u, v) = \phi''(\psi(x_0); \psi'(x_0) u, \psi'(x_0) v).$$
From Proposition 1.2 and Proposition 1.13 we have that $d^2(\phi \circ \psi)(x_0; u, v) = d^2 \phi(\psi(x_0); \psi'(x_0) u, \psi'(x_0) v)$. Since $d^2 f(x_0; h_1, \cdot)$ is by (1.1) the support function of $\partial^2 f(x_0)[h_1]$, we derive the following:
$$\partial^2(\phi \circ \psi)(x_0) = \{ L[\psi'(x_0) \cdot, \psi'(x_0) \cdot] : L \in \partial^2 \phi(\psi(x_0)) \}. \qquad (1.2)$$
The following proposition is stated in [8] without proof. Here we include the proof for completeness.
Proposition 1.14 Let $I$ be an open interval containing $[0,1]$ and let $\phi \in C^{1,1}(I)$. Then
$$\phi(1) - \phi(0) - \phi'(0) \in \tfrac{1}{2} \partial^2 \phi(t)$$
for some $t \in (0, 1)$.
Proof. Define:
$$h(t) = \phi(1) - \phi(t) - \phi'(t)(1 - t) - (1 - t)^2 \lambda, \quad t \in [0, 1],$$
where $\lambda = \phi(1) - \phi(0) - \phi'(0)$. So we have $h(1) = h(0) = 0$. Obviously $h$ is locally Lipschitz on $[0,1]$. There exists $\xi \in (0,1)$ such that either
1) $\xi$ is a minimum of $h$ over $[0,1]$, or
2) $\xi$ is a maximum of $h$ over $[0,1]$.
Let 1) be fulfilled. Then by the necessary condition for a local minimum (see Clarke [4])
$$0 \in \partial h(\xi) = -\partial \phi'(\xi)(1 - \xi) + 2(1 - \xi)\lambda.$$
Hence $\lambda \in \frac{1}{2} \partial \phi'(\xi)$. But by Clarke [4, Theorem 2.5.1] and by the definition of $\partial^2 \phi$ we have $\partial \phi'(\xi) = \partial^2 \phi(\xi)$.
Let 2) be fulfilled. Then $\xi$ is a minimum point of the function $-h$ over $[0,1]$, and since $\partial(-h)(\xi) = -\partial h(\xi)$ (Clarke [4]), we have $0 \in \partial(-h)(\xi) = -\partial h(\xi)$, so $0 \in \partial h(\xi)$ and as in case 1) we obtain $\lambda \in \frac{1}{2} \partial^2 \phi(\xi)$.
Using (1.2) and Proposition 1.14 for $\phi(x) = f(x)$ and $\psi(t) = a + t(b - a)$, we immediately obtain the second-order expansion (the same as in [8, Theorem 2.3]).
Proposition 1.15 Let $f \in C^{1,1}(G)$. Then for every $a, b \in G$ with $[a, b] \subset G$ there exist $c \in (a, b)$ and $L_c \in \partial^2 f(c)$ such that
$$f(b) = f(a) + f'(a)[b - a] + \tfrac{1}{2} L_c[b - a, b - a].$$
We shall use this proposition essentially in the sequel.
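The expansion of Proposition 1.15 can be checked numerically in one dimension. In the following sketch (our own illustrative choice of function and interval) we take $f(x) = \frac{1}{2}x|x|$, $a = -1$, $b = 1$, and solve the expansion for $L_c$; away from the kink $f''$ is $\pm 1$, while $\partial^2 f(0) = [-1, 1]$, so the computed value $L_c = -\frac12$ can only be attained at the mean point $c = 0$.

```python
# Check of the second-order expansion (Proposition 1.15) on the real line
# for the C^{1,1} function f(x) = 0.5 * x * |x| (our illustrative choice):
#   f(b) = f(a) + f'(a)(b - a) + 0.5 * L_c * (b - a)^2,  some c in (a, b),
# with L_c in the second-order subdifferential of f at c.

def f(x):
    return 0.5 * x * abs(x)

def fprime(x):
    return abs(x)

a, b = -1.0, 1.0
# solve the expansion for L_c
L_c = 2.0 * (f(b) - f(a) - fprime(a) * (b - a)) / (b - a) ** 2
print("L_c =", L_c)   # -0.5

# f''(c) = -1 for c < 0 and +1 for c > 0, while the second-order
# subdifferential at the kink c = 0 is the whole interval [-1, 1];
# L_c = -0.5 lies in that interval only, so the mean point is c = 0.
assert -1.0 <= L_c <= 1.0
```

This shows why the multivalued subdifferential is essential in the statement: no single second derivative value of $f$ equals $-\frac12$.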
2 Necessary and sufficient optimality conditions
We consider now the following constrained minimization problem:
$$P(E): \quad f_0(x) \to \min_{x \in E}, \quad f_i(x) \le 0, \ 1 \le i \le m, \quad F(x) = 0,$$
where $F(x) = (g_1(x), \dots, g_k(x))^T$ and the functions $f_0$, $f_i$, $1 \le i \le m$, $g_j$, $1 \le j \le k$, are $C^{1,1}(E)$ functions. The Lagrangian function for $P(E)$ is
$$\mathcal{L}(x; \lambda, \mu) = \sum_{i=0}^m \lambda_i f_i(x) + \sum_{j=1}^k \mu_j g_j(x),$$
where $(\lambda, \mu) := (\lambda_0, \dots, \lambda_m, \mu_1, \dots, \mu_k) \in \mathbb{R}^{m+1} \times \mathbb{R}^k$ are the Lagrange multipliers.
Denote by $\operatorname{locmin} P(E)$ the set of all points of local minimum of $P(E)$ and define
$$\Lambda(x) = \Big\{ (\lambda, \mu) \in \mathbb{R}^{m+1} \times \mathbb{R}^k : \sum_{i=0}^m \lambda_i f_i'(x) + \sum_{j=1}^k \mu_j g_j'(x) = 0, \ \lambda_i f_i(x) = 0, \ 1 \le i \le m, \ \lambda_i \ge 0, \ 0 \le i \le m, \ \sum_{i=0}^m \lambda_i = 1 \Big\};$$
$$K(x) = \{ h \in E : f_i'(x)[h] \le 0, \ 0 \le i \le m, \ g_j'(x)[h] = 0, \ 1 \le j \le k \}.$$
Further we need the following facts.
Proposition 2.1 (Necessary condition, [1, Section 3.4.2]) Let $x_0 \in \operatorname{locmin} P(E)$ and $\operatorname{Im} F'(x_0) = \mathbb{R}^k$. Then $\Lambda(x_0)$ is a non-empty convex compact set.
Lemma 2.2 (minimax, [1, Section 3.3.4]) Let $A: E \to \mathbb{R}^k$ be a linear continuous surjective operator ($AE = \mathbb{R}^k$), $x_i^* \in E^*$, $0 \le i \le m$, $y \in \mathbb{R}^k$, $a \in \mathbb{R}^{m+1}$, and
$$\max_{0 \le i \le m} x_i^*[x] \ge 0 \quad \forall x \in \operatorname{Ker} A. \qquad (2.1)$$
Denote
$$S(a, y) = \inf_{Ax + y = 0} \ \max_{0 \le i \le m} (a_i + x_i^*[x]). \qquad (2.2)$$
Then
a)
$$S(a, y) = \sup_{(\lambda, \mu) \in \Lambda} \Big( \sum_{i=0}^m \lambda_i a_i + \sum_{j=1}^k \mu_j y_j \Big), \qquad (2.3)$$
where
$$\Lambda = \Big\{ (\lambda, \mu) \in \mathbb{R}^{m+1} \times \mathbb{R}^k : \lambda_i \ge 0, \ \sum_{i=0}^m \lambda_i = 1, \ \sum_{i=0}^m \lambda_i x_i^* + A^* \mu = 0 \Big\}.$$
b) The inf in (2.2) and the sup in (2.3) are attained.
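A toy finite-dimensional instance (entirely our own, chosen so everything can be computed by hand) illustrates the equality of the primal value (2.2) and the dual value (2.3): take $E = \mathbb{R}$, $k = 1$, $A = \mathrm{id}$ (so $\operatorname{Ker} A = \{0\}$ and (2.1) holds trivially), $m = 1$, $x_0^* = 1$, $x_1^* = -1$.

```python
# Toy instance of the minimax lemma (Lemma 2.2), our own small example:
# E = R, k = 1, A = identity, m = 1, x0* = 1, x1* = -1.  The only point
# with Ax + y = 0 is x = -y, so the primal value S(a, y) is computed
# directly; the dual supremum over Lambda is scanned on a grid of
# lambda_0 in [0, 1] (lambda_1 = 1 - lambda_0, mu determined by Lambda).

a = (2.0, -1.0)          # a_0, a_1
y = 0.3
xs = (1.0, -1.0)         # x_0*, x_1*

# primal (2.2): inf over {x : Ax + y = 0} = {-y} of max_i (a_i + x_i* x)
x = -y
primal = max(a[i] + xs[i] * x for i in range(2))

# dual (2.3): sup over lambda_i >= 0, sum = 1, with
# mu = -(lambda_0 x_0* + lambda_1 x_1*) forced by the constraint in Lambda
dual = max(
    a[0] * l0 + a[1] * (1 - l0) + (-(xs[0] * l0 + xs[1] * (1 - l0))) * y
    for l0 in [k / 1000 for k in range(1001)]
)

print(primal, dual)
assert abs(primal - dual) < 1e-9
```

Here the dual objective is linear in $\lambda_0$, so its supremum sits at an endpoint of $[0,1]$, matching the primal maximum over the two indices.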
Let $x_0 \in \operatorname{locmin} P(E)$. Without loss of generality we may assume in the sequel that $f_i(x_0) = 0$, $0 \le i \le m$ (see the reduction in [1, Section 3.2.3]).
Consider the problem
$$(P') \quad f(x) := \max\{f_0(x), \dots, f_m(x)\} \to \min; \quad F(x) = 0.$$
It is easy to see the validity of the following.
Lemma 2.3 ([1, Section 3.4.2]) $x_0 \in \operatorname{locmin}(P')$.
Now we can state the necessary optimality condition for $P(E)$. The proof uses ideas of the proof of [5, Theorem 9.1.2] (see also [1, Theorem 3.4.2]), where the functions are assumed to be of class $C^2$.
Theorem 2.4 Let in $P(E)$, $\operatorname{Im} F'(x_0) = \mathbb{R}^k$. If $x_0 \in \operatorname{locmin} P(E)$, then for every $h \in K(x_0)$ there exist $L_i \in \partial^2 f_i(x_0)$, $0 \le i \le m$, $M_j \in \partial^2 g_j(x_0)$, $1 \le j \le k$, such that
$$\max_{(\lambda, \mu) \in \Lambda(x_0)} \Big( \sum_{i=0}^m \lambda_i L_i[h, h] + \sum_{j=1}^k \mu_j M_j[h, h] \Big) \ge 0.$$
Proof. By Proposition 2.1, the set Λ(x0 ) is non-empty.
Let us assume the contrary, i.e. there exists $h \in K(x_0)$ such that for every $L_i \in \partial^2 f_i(x_0)$, $0 \le i \le m$, and every $M_j \in \partial^2 g_j(x_0)$, $1 \le j \le k$, we have
$$\max_{(\lambda, \mu) \in \Lambda(x_0)} \Big( \sum_{i=0}^m \lambda_i L_i[h, h] + \sum_{j=1}^k \mu_j M_j[h, h] \Big) < 0. \qquad (2.4)$$
It is clear that $\|h\| \ne 0$. Let $0 < t_n \to 0$ as $n \to \infty$. From Proposition 1.15 we have
$$g_j(x_0 + t_n h) = g_j(x_0) + t_n g_j'(x_0)[h] + \frac{t_n^2}{2} M_{j,n}[h, h] = \frac{t_n^2}{2} M_{j,n}[h, h],$$
where $M_{j,n} \in \partial^2 g_j(x_0 + \gamma_{j,n} t_n h)$ and $\gamma_{j,n} \in (0, 1)$.
Since $x_0 + \gamma_{j,n} t_n h \to x_0$ as $n \to \infty$, and the $\partial^2 g_j$ are locally bounded and have $w^*$-closed graphs, we can choose $w^*$-convergent subsequences from $\{M_{j,n}\}_{n \ge 1}$, whose $w^*$-limits $M_j$ are in $\partial^2 g_j(x_0)$ for $1 \le j \le k$.
Analogously, having in mind that $f_i(x_0) = 0$ (see the remark before Lemma 2.3),
$$f_i(x_0 + t_n h) = f_i(x_0) + t_n f_i'(x_0)[h] + \frac{t_n^2}{2} L_{i,n}[h, h] = t_n f_i'(x_0)[h] + \frac{t_n^2}{2} L_{i,n}[h, h],$$
where $L_{i,n} \in \partial^2 f_i(x_0 + \eta_{i,n} t_n h)$, $\eta_{i,n} \in (0, 1)$, and we choose $w^*$-convergent subsequences from $\{L_{i,n}\}_{n \ge 1}$, whose $w^*$-limits $L_i$ are in $\partial^2 f_i(x_0)$ for $0 \le i \le m$.
For every $\varepsilon > 0$ there exists an integer $N_1$ such that for every $n \ge N_1$ the following inequalities are fulfilled:
$$|M_{j,n}[h, h] - M_j[h, h]| < 2\varepsilon, \qquad |L_{i,n}[h, h] - L_i[h, h]| < 2\varepsilon.$$
Hence for $n \ge N_1$ we have:
$$g_j(x_0 + t_n h) = \frac{t_n^2}{2} M_j[h, h] + \phi_{j,n}(\varepsilon), \qquad (2.5)$$
where
$$|\phi_{j,n}(\varepsilon)| < t_n^2 \varepsilon \quad \text{for } 1 \le j \le k, \ n \ge N_1, \qquad (2.6)$$
and
$$f_i(x_0 + t_n h) = t_n f_i'(x_0)[h] + \frac{t_n^2}{2} L_i[h, h] + \psi_{i,n}(\varepsilon), \qquad (2.7)$$
where
$$|\psi_{i,n}(\varepsilon)| < t_n^2 \varepsilon \quad \text{for } 0 \le i \le m, \ n \ge N_1. \qquad (2.8)$$
Denote $x_i^* = f_i'(x_0)$, $a_i = \frac{1}{2} L_i[h, h]$, $0 \le i \le m$, $y_j = \frac{1}{2} M_j[h, h]$, $1 \le j \le k$, $y = (y_1, \dots, y_k)$, $A = F'(x_0)$. From [1, Section 3.2.4, Lemma 1], condition (2.1) of Lemma 2.2 is fulfilled. Applying Lemma 2.2, we can find $\xi = \xi(h) \in E$ such that
$$F'(x_0)\xi + y = 0 \qquad (2.9)$$
and
$$\max_{0 \le i \le m} (a_i + f_i'(x_0)[\xi]) = \max_{(\lambda, \mu) \in \Lambda(x_0)} \Big( \sum_{i=0}^m \lambda_i a_i + \sum_{j=1}^k \mu_j y_j \Big) =: \Psi(h), \qquad (2.10)$$
where $\Psi(h) < 0$ by (2.4).
Let $l$ be the maximum of the Lipschitz constants of the $g_j'$ and $f_i'$ in a neighbourhood $U_1$ of $x_0$. There exist a neighbourhood $U_2 \subset U_1$ of $x_0$ and a constant $s$ such that the $g_j'$ and $f_i'$ are norm bounded by $s$ there.
Using the mean value theorem, (2.5) and (2.9), for large $n$ we have:
$$g_j(x_0 + t_n h + t_n^2 \xi) = g_j(x_0 + t_n h + t_n^2 \xi) - g_j(x_0 + t_n h) + g_j(x_0 + t_n h)$$
$$= t_n^2 g_j'(x_0 + t_n h + \kappa_{j,n} t_n^2 \xi)[\xi] + \frac{t_n^2}{2} M_j[h, h] + \phi_{j,n}(\varepsilon)$$
$$= t_n^2 \big( g_j'(x_0 + t_n h + \kappa_{j,n} t_n^2 \xi) - g_j'(x_0) \big)[\xi] + \phi_{j,n}(\varepsilon),$$
whence
$$|g_j(x_0 + t_n h + t_n^2 \xi)| \le t_n^2 l \|t_n h + \kappa_{j,n} t_n^2 \xi\| \, \|\xi\| + |\phi_{j,n}(\varepsilon)| \le t_n^3 l (\|h\| + t_n \|\xi\|) \|\xi\| + |\phi_{j,n}(\varepsilon)| =: \theta_{j,n}(\varepsilon),$$
where $\kappa_{j,n} \in (0, 1)$.
Using (2.6) we obtain
$$\theta_{j,n}(\varepsilon) \le o(t_n^2) + t_n^2 \varepsilon, \quad \forall j = 1, \dots, k, \qquad (2.11)$$
where $o(t_n^2)/t_n^2 \to 0$.
Now we apply a generalization of the implicit function theorem (see [1, Theorem 2.3.1]) and obtain that there exist a constant $q$ and a map $\Phi: U \to E$, where $U \subset U_2$ is a neighbourhood of the point $x_0$, such that:
$$F(x + \Phi(x)) = 0, \qquad \|\Phi(x)\| \le q \|F(x)\|, \qquad \forall x \in U.$$
Set $r(t_n) = \Phi(x_0 + t_n h + t_n^2 \xi)$. Then for $n$ sufficiently large, $x_0 + t_n h + t_n^2 \xi \in U$, $g_j(x_0 + t_n h + t_n^2 \xi + r(t_n)) = 0$ for $1 \le j \le k$, and by (2.11) we have
$$\|r(t_n)\| \le q \|F(x_0 + t_n h + t_n^2 \xi)\| = q \Big( \sum_{j=1}^k g_j^2(x_0 + t_n h + t_n^2 \xi) \Big)^{1/2} \le q \sqrt{k} \, (o(t_n^2) + t_n^2 \varepsilon). \qquad (2.12)$$
Again using the mean value theorem, (2.7) and the fact that $h \in K(x_0)$, for large $n$ we obtain
$$f_i(x_0 + t_n h + t_n^2 \xi + r(t_n)) = f_i(x_0 + t_n h + t_n^2 \xi + r(t_n)) - f_i(x_0 + t_n h) + f_i(x_0 + t_n h)$$
$$= f_i'(x_0 + t_n h + \nu_{i,n}(t_n^2 \xi + r(t_n)))[t_n^2 \xi + r(t_n)] + t_n f_i'(x_0)[h] + \frac{t_n^2}{2} L_i[h, h] + \psi_{i,n}(\varepsilon)$$
$$\le t_n^2 l \|t_n h + \nu_{i,n}(t_n^2 \xi + r(t_n))\| \, \|\xi\| + s \|r(t_n)\| + t_n^2 f_i'(x_0)[\xi] + \frac{t_n^2}{2} L_i[h, h] + \psi_{i,n}(\varepsilon),$$
where $\nu_{i,n} \in (0, 1)$. Hence from (2.8), (2.10) and (2.12) we obtain
$$f(x_0 + t_n h + t_n^2 \xi + r(t_n)) := \max_{0 \le i \le m} f_i(x_0 + t_n h + t_n^2 \xi + r(t_n))$$
$$\le \max_{0 \le i \le m} \Big[ t_n^2 l \|t_n h + \nu_{i,n}(t_n^2 \xi + r(t_n))\| \, \|\xi\| + s \|r(t_n)\| + t_n^2 f_i'(x_0)[\xi] + \frac{t_n^2}{2} L_i[h, h] + \psi_{i,n}(\varepsilon) \Big]$$
$$\le o_1(t_n^2) + (t_n^2 l \|\xi\| + s) \|r(t_n)\| + t_n^2 \Psi(h) + t_n^2 \varepsilon$$
$$\le o_1(t_n^2) + (t_n^2 l \|\xi\| + s) q \sqrt{k} \, [o(t_n^2) + t_n^2 \varepsilon] + t_n^2 \Psi(h) + t_n^2 \varepsilon$$
$$= t_n^2 \Big[ \frac{o_1(t_n^2)}{t_n^2} + (t_n^2 l \|\xi\| + s) q \sqrt{k} \Big( \frac{o(t_n^2)}{t_n^2} + \varepsilon \Big) + \Psi(h) + \varepsilon \Big],$$
where $o_1(t_n^2)/t_n^2 \to 0$.
Now, since $\Psi(h) < 0$, it is clear that if we choose $\varepsilon$ to be sufficiently small, then for sufficiently large $n$ we have
$$f(x_0 + t_n h + t_n^2 \xi + r(t_n)) < 0,$$
which is in contradiction with Lemma 2.3.
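The necessary condition of Theorem 2.4 can be exercised on a one-dimensional toy problem (our own construction, with $k = 0$ so the equality constraints are absent): minimize $f_0(x) = \frac12 x|x|$ subject to $f_1(x) = -x \le 0$, with solution $x_0 = 0$. Here $\Lambda(x_0) = \{(1, 0)\}$, $K(x_0) = [0, +\infty)$, $\partial^2 f_0(0) = [-1, 1]$ and $\partial^2 f_1(0) = \{0\}$, and the condition is satisfied by picking $L_0 = 1$, the upper end of the generalized Hessian interval.

```python
# Theorem 2.4 on a one-dimensional C^{1,1} toy problem (our own example):
#   minimize f0(x) = 0.5 * x * |x|   subject to   f1(x) = -x <= 0,
# with solution x0 = 0.  Lambda(x0) = {(1, 0)}, K(x0) = [0, +inf),
# and the generalized Hessian interval of f0 at 0 is [-1, 1].

d2_f0 = (-1.0, 1.0)      # second-order subdifferential of f0 at 0
d2_f1 = (0.0, 0.0)       # f1 is linear, so its second subdifferential is {0}
Lambda = [(1.0, 0.0)]    # the only multiplier pair (lambda_0, lambda_1)

for h in [0.0, 0.5, 1.0, 3.0]:          # sample directions in K(x0)
    # the theorem asks for SOME L0, L1 in the subdifferentials making the
    # max over Lambda nonnegative; take the largest elements L0 = 1, L1 = 0
    best = max(l0 * d2_f0[1] * h ** 2 + l1 * d2_f1[1] * h ** 2
               for l0, l1 in Lambda)
    assert best >= 0.0

print("necessary condition of Theorem 2.4 verified on the sample directions")
```

Note that the lower element $L_0 = -1$ would give a negative value; the condition only requires existence of suitable $L_i$, which the multivalued subdifferential provides.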
The second-order sufficient condition for the problem P(E) is the following.
Theorem 2.5 Let in $P(E)$, $f_i(x_0) = 0$, $0 \le i \le m$, $\operatorname{Im} F'(x_0) = \mathbb{R}^k$, $\Lambda(x_0) \ne \emptyset$, and let there exist a constant $\alpha > 0$ such that for every $L_i \in \partial^2 f_i(x_0)$, $0 \le i \le m$, $M_j \in \partial^2 g_j(x_0)$, $1 \le j \le k$, we have
$$\max_{(\lambda, \mu) \in \Lambda(x_0)} \Big( \sum_{i=0}^m \lambda_i L_i[h, h] + \sum_{j=1}^k \mu_j M_j[h, h] \Big) \ge \alpha \|h\|^2 \quad \forall h \in K(x_0).$$
Then $x_0$ is a strict local minimum (i.e. the unique minimum in a neighbourhood of $x_0$) of the problem $P(X)$ for every finite-dimensional subspace $X \ni x_0$ of $E$. If, in addition, the functions $f_i$, $0 \le i \le m$, are convex and the functions $g_j$, $1 \le j \le k$, are affine, then the problem $P(X)$ has the unique solution $x_0$; in this case the problem $P(E)$ also has the unique solution $x_0$.
Proof. We use ideas of [5, Theorem 10.1.1], where the functions are assumed to be $C^2$.
We shall show that for every finite-dimensional subspace $X \ni x_0$ there exists $\delta > 0$ such that the conditions $h \in \delta B \cap X$, $h \ne 0$, and
$$f_i(x_0 + h) \le 0, \ 0 \le i \le m, \quad F(x_0 + h) = 0, \qquad (2.13)$$
where $B$ is the unit ball in $E$, are inconsistent. From this we will obtain the desired conclusion immediately.
Let us assume the contrary: there exists a finite-dimensional subspace $X \ni x_0$ such that for every $\delta > 0$ there exists a nonzero $h_\delta \in \delta B \cap X$ such that the conditions (2.13) are fulfilled.
Denote
$$H = \bigcup_{\delta > 0} \{h_\delta\}.$$
Let $h \in H$. From Proposition 1.15 we have:
$$f_i(x_0 + h) = f_i'(x_0)[h] + \tfrac{1}{2} L_i(x_0 + \eta_i h)[h, h], \qquad (2.14)$$
where $\eta_i \in (0, 1)$, $L_i(x_0 + \eta_i h) \in \partial^2 f_i(x_0 + \eta_i h)$, $0 \le i \le m$;
$$g_j(x_0 + h) = g_j'(x_0)[h] + \tfrac{1}{2} M_j(x_0 + \gamma_j h)[h, h], \qquad (2.15)$$
where $\gamma_j \in (0, 1)$, $M_j(x_0 + \gamma_j h) \in \partial^2 g_j(x_0 + \gamma_j h)$, $1 \le j \le k$.
Since $\Lambda(x_0) \ne \emptyset$, it follows from the proof of the theorem in [1, Section 3.4.2] that it is compact and, therefore, there exists a constant $c_1$ such that for every $(\lambda, \mu) \in \Lambda(x_0)$ we have $\sum_{j=1}^k |\mu_j| \le c_1$, where $\mu = (\mu_1, \dots, \mu_k)$.
Substitute $x_i^* = f_i'(x_0)$, $a_i = \frac{1}{2} L_i(x_0 + \eta_i h)[h, h]$, $0 \le i \le m$, $A = F'(x_0)$ (recall that $F(x) = (g_1(x), \dots, g_k(x))$), $y_j = \frac{1}{2} M_j(x_0 + \gamma_j h)[h, h]$, $1 \le j \le k$, $y = (y_1, \dots, y_k)$, and $f(x) = \max_{0 \le i \le m} f_i(x)$.
From (2.14) and (2.15) we obtain that (2.13) is equivalent to
$$x_i^*[h] + a_i = f_i(x_0 + h) \le 0, \ 0 \le i \le m, \quad A h + y = 0. \qquad (2.16)$$
Since $\partial^2 f_i$ and $\partial^2 g_j$ are locally bounded, we have
$$\exists c_2 > 0, \ \exists \delta_1 > 0: \ x \in B(x_0, \delta_1), \ L_i \in \partial^2 f_i(x), \ 0 \le i \le m, \ M_j \in \partial^2 g_j(x), \ 1 \le j \le k \ \Rightarrow \ \|L_i\| < c_2, \ \|M_j\| < c_2, \qquad (2.17)$$
and if we define $x_i^*[h]_+ := \max\{x_i^*[h], 0\}$, then
$$\|h\| < \delta_1 \ \Rightarrow \ x_i^*[h]_+ \le |a_i| \le \frac{c_2}{2} \|h\|^2, \quad \|A h\| = \|y\| \le \frac{c_2}{2} \|h\|^2. \qquad (2.18)$$
For $(\lambda, \mu) \in \Lambda(x_0)$ we have
$$\sum_{i=0}^m \lambda_i x_i^*[x] + \sum_{j=1}^k \mu_j g_j'(x_0)[x] = 0 \quad \forall x \in E.$$
Hence $\sum_{i=0}^m \lambda_i x_i^*[x] = 0$ for all $x \in \operatorname{Ker} A$, and therefore
$$\max_{0 \le i \le m} x_i^*[x] \ge 0, \quad \forall x \in \operatorname{Ker} A,$$
which is condition (2.1) of Lemma 2.2. From (2.16) and Lemma 2.2 we obtain
$$\max_{0 \le i \le m} f_i(x_0 + h) = \max_{0 \le i \le m} (x_i^*[h] + a_i) \ge \min_{Ax + y = 0} \ \max_{0 \le i \le m} (x_i^*[x] + a_i)$$
$$= \max_{(\lambda, \mu) \in \Lambda(x_0)} \frac{1}{2} \Big( \sum_{i=0}^m \lambda_i L_i(x_0 + \eta_i h)[h, h] + \sum_{j=1}^k \mu_j M_j(x_0 + \gamma_j h)[h, h] \Big). \qquad (2.19)$$
We estimate the distance from $h$ to the cone $K(x_0)$ by Hoffman's lemma ([1, Section 3.3.4]) and after that by (2.18):
$$d(h, K(x_0)) := \inf\{\|h - y\| : y \in K(x_0)\} \le c \Big( \sum_{i=0}^m x_i^*[h]_+ + \|A h\| \Big) < c_3 \|h\|^2$$
for some constant $c$ which does not depend on $h$; $c_3 := c \cdot c_2$.
Hence $h$ can be represented in the form $h = h' + h''$, where
$$h' \in K(x_0), \quad \|h''\| < c_3 \|h\|^2. \qquad (2.20)$$
If $\|h\| < \frac{1}{2 c_3}$, then
$$\frac{1}{2} \|h\| < (1 - c_3 \|h\|) \|h\| \le \|h'\| \le \|h\| (1 + c_3 \|h\|) \le 2 \|h\|. \qquad (2.21)$$
Since $\partial^2 f_i$ and $\partial^2 g_j$ are $w^*$-upper semicontinuous and locally bounded, and since the $w^*$-compact sets in separable dual spaces are sequentially $w^*$-compact (from Remark 1.4), we can find (applying the Alaoglu-Bourbaki theorem) a sequence $\{h_n\} \subset H$, $\|h_n\| \to 0$, and elements $L_i \in \partial^2 f_i(x_0)$, $M_j \in \partial^2 g_j(x_0)$ such that
$$L_i(x_0 + \eta_i h_n) \xrightarrow{w^*} L_i, \qquad M_j(x_0 + \gamma_j h_n) \xrightarrow{w^*} M_j.$$
Hence there exists $\nu$ such that for every $h \in \{h_n\}_{n \ge \nu}$ we have
$$|L_i(x_0 + \eta_i h)[h, h] - L_i[h, h]| < \frac{\alpha \|h\|^2}{16}, \qquad (2.22)$$
$$|M_j(x_0 + \gamma_j h)[h, h] - M_j[h, h]| < \frac{\alpha \|h\|^2}{16 c_1} \qquad (2.23)$$
(here we use the fact that $X$ is finite-dimensional, i.e. the restrictions of $L_i(x_0 + \eta_i h_n)$ and $M_j(x_0 + \gamma_j h_n)$ to $X \times X$ converge in the norm topology to the restrictions of $L_i$ and $M_j$ to $X \times X$, respectively, as $n \to \infty$).
Then for every $h \in \{h_n\}_{n \ge \nu}$ with $\|h\| < \min\{\frac{1}{2 c_3}, \delta_1\}$, having in mind that $h = h' + h''$, where $h' \in K(x_0)$ and $h''$ satisfies (2.20) and (2.21), and using (2.19), (2.22) and (2.23), we obtain:
$$0 \ge f(x_0 + h) \ge \max_{(\lambda, \mu) \in \Lambda(x_0)} \frac{1}{2} \Big( \sum_{i=0}^m \lambda_i L_i(x_0 + \eta_i h)[h, h] + \sum_{j=1}^k \mu_j M_j(x_0 + \gamma_j h)[h, h] \Big)$$
$$\ge -\frac{\alpha \|h\|^2}{16} + \max_{(\lambda, \mu) \in \Lambda(x_0)} \frac{1}{2} \Big( \sum_{i=0}^m \lambda_i L_i[h, h] + \sum_{j=1}^k \mu_j M_j[h, h] \Big)$$
$$\ge -\frac{\alpha \|h\|^2}{16} + \max_{(\lambda, \mu) \in \Lambda(x_0)} \frac{1}{2} \Big( \sum_{i=0}^m \lambda_i L_i[h', h'] + \sum_{j=1}^k \mu_j M_j[h', h'] + \sum_{i=0}^m \lambda_i L_i[h', h''] + \sum_{j=1}^k \mu_j M_j[h', h''] + \sum_{i=0}^m \lambda_i L_i[h'', h'] + \sum_{j=1}^k \mu_j M_j[h'', h'] + \sum_{i=0}^m \lambda_i L_i[h'', h''] + \sum_{j=1}^k \mu_j M_j[h'', h''] \Big)$$
$$\ge -\frac{\alpha \|h\|^2}{16} + \frac{\alpha}{2} \|h'\|^2 - \sum_{i=0}^m \lambda_i \|L_i\| \|h'\| \|h''\| - \frac{1}{2} \sum_{i=0}^m \lambda_i \|L_i\| \|h''\|^2 - \sum_{j=1}^k |\mu_j| \|M_j\| \|h'\| \|h''\| - \frac{1}{2} \sum_{j=1}^k |\mu_j| \|M_j\| \|h''\|^2$$
$$\ge -\frac{\alpha \|h\|^2}{16} + \frac{\alpha}{8} \|h\|^2 - c_2 \|h'\| \|h''\| (1 + c_1) - \frac{1}{2} c_2 \|h''\|^2 (1 + c_1)$$
$$\ge \|h\|^2 \Big[ \frac{\alpha}{16} - 2 c_2 c_3 (1 + c_1) \|h\| - \frac{1}{2} c_2 c_3^2 (1 + c_1) \|h\|^2 \Big].$$
The last expression is positive when $\|h\|$ is sufficiently small. This is a contradiction.
Therefore $x_0$ is a strict local minimum of $P(X)$. When $f_i$, $0 \le i \le m$, are convex and $g_j$, $1 \le j \le k$, are affine functions, the Lagrangian function is convex and the admissible set is convex; therefore the local minimum is global. Obviously $x_0$ is then a strict global minimum of the problem $P(E)$ too.
The following sufficient condition is a modification of the one in [1, Section 3.4.3].
Theorem 2.6 Let in $P(E)$, $f_i(x_0) = 0$, $1 \le i \le m$, $\operatorname{Im} F'(x_0) = \mathbb{R}^k$, and let there exist a number $\alpha > 0$ and Lagrange multipliers $(\lambda, \mu) \in \mathbb{R}^{m+1} \times \mathbb{R}^k$ such that $\lambda_0 = 1$, $\lambda_i > 0$, $1 \le i \le m$,
$$\mathcal{L}'_x(x_0; \lambda, \mu) = f_0'(x_0) + \sum_{i=1}^m \lambda_i f_i'(x_0) + \sum_{j=1}^k \mu_j g_j'(x_0) = 0 \qquad (2.24)$$
and for all $L \in \partial^2 \mathcal{L}(x_0; \lambda, \mu)$
$$L[h, h] \ge 2 \alpha \|h\|^2, \quad \forall h \in C(x_0), \qquad (2.25)$$
where $C(x_0) = \{h \in E : f_i'(x_0)[h] = 0, \ 1 \le i \le m, \ F'(x_0)[h] = 0\}$.
Then $x_0$ is a local minimum of the problem $P(X)$ for every finite-dimensional subspace $X \ni x_0$.
Proof. We shall follow the proof of the theorem in [1, Section 3.4.3]; instead of the usual Taylor expansion, we use Proposition 1.15.
Assume the contrary, i.e. there exists a finite-dimensional subspace $X \ni x_0$ such that for every $\delta > 0$ there exists $h_\delta \in X \cap \delta B$ ($B$ is the closed unit ball in $E$) such that
$$f_i(x_0 + h_\delta) \le 0, \ 1 \le i \le m, \quad F(x_0 + h_\delta) = 0, \quad f_0(x_0 + h_\delta) < f_0(x_0).$$
Denote $H := \bigcup_{\delta > 0} \{h_\delta\}$ and let $h \in H$, $h \ne 0$.
Since $\sum_{i=1}^m \lambda_i f_i(x_0 + h) \le 0$ and $F(x_0 + h) = 0$, from Proposition 1.15 applied to the Lagrangian function $\mathcal{L}$ we have:
$$f_0(x_0 + h) \ge \mathcal{L}(x_0; \lambda, \mu) + \tfrac{1}{2} L(x_0 + \eta_h h)[h, h],$$
where $L(x_0 + \eta_h h) \in \partial^2 \mathcal{L}(x_0 + \eta_h h; \lambda, \mu)$, $\eta_h \in (0, 1)$. The local boundedness of $\partial^2 \mathcal{L}(\cdot; \lambda, \mu)$ allows us to find a sequence $\{h_n\} \subset H$, $\|h_n\| \to 0$, such that $L_n := L(x_0 + \eta_{h_n} h_n)$ is $w^*$-convergent to some $L \in \partial^2 \mathcal{L}(x_0; \lambda, \mu)$ (by the $w^*$-upper semicontinuity of $\partial^2 \mathcal{L}(\cdot; \lambda, \mu)$). Since $X$ is finite-dimensional, the restrictions of $L_n$ to $X \times X$ converge to the restriction of $L$ to $X \times X$ in the norm topology.
So we have
$$f_0(x_0 + h_n) \ge f_0(x_0) + \tfrac{1}{2} L[h_n, h_n] - \tfrac{\alpha}{2} \|h_n\|^2, \quad \forall n > \nu, \qquad (2.26)$$
for some $\nu$.
From here on the proof is the same as the proof of the theorem in [1, Section 3.4.3]; the only difference is that inequality (7) from [1, p. 191] is replaced by (2.26). In this way, for $n > \nu$, we have $f_0(x_0 + h_n) \ge f_0(x_0)$, which is a contradiction.
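The sufficient condition (2.25) can be checked numerically on an unconstrained toy problem (our own example; with $m = 0$, $k = 0$ the Lagrangian is just $f_0$ and $C(x_0) = E$): take $E = \mathbb{R}^2$, $f_0(x) = x_1^2 + x_2^2 + \frac12 x_2|x_2|$, $x_0 = (0, 0)$. The second-order subdifferential of $f_0$ at $0$ consists of the diagonal forms $\operatorname{diag}(2, 2 + s)$, $s \in [-1, 1]$, so $L[h, h] \ge 2 h_1^2 + h_2^2 \ge \|h\|^2$ and (2.25) holds with $2\alpha = 1$.

```python
# Theorem 2.6 on an unconstrained C^{1,1} toy problem (our own example,
# m = 0, k = 0, so the Lagrangian is f0 and C(x0) = E = R^2):
#   f0(x) = x1^2 + x2^2 + 0.5 * x2 * |x2|,   x0 = (0, 0).
# Every L in the second-order subdifferential of f0 at 0 is
# diag(2, 2 + s) with s in [-1, 1], hence L[h, h] >= ||h||^2.

def f0(x1, x2):
    return x1 ** 2 + x2 ** 2 + 0.5 * x2 * abs(x2)

# (2.25) with 2*alpha = 1: smallest eigenvalue of diag(2, 2 + s) over the
# parameter range s in [-1, 1]
min_eig = min(min(2.0, 2.0 + s) for s in [-1 + k / 500 for k in range(1001)])
assert min_eig >= 1.0

# sanity check: f0 is indeed positive around the origin except at 0
vals = [f0(0.01 * i, 0.01 * j) for i in range(-5, 6) for j in range(-5, 6)
        if (i, j) != (0, 0)]
print("min sampled value near 0:", min(vals))   # positive
```

Had we dropped the $x_2^2$ term, the subdifferential would contain $\operatorname{diag}(2, s)$ with $s = -1$, (2.25) would fail, and indeed $0$ would no longer be a local minimum; the condition is sensitive to the whole multivalued set.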
The authors are very grateful to the referees for their valuable remarks and suggestions, and especially for pointing out the new proof of Proposition 1.2.
References
[1] Alekseev, V.M., Tikhomirov, V.M. and Fomin, S.V.: Optimal Control, Contemporary Soviet Mathematics, R. Gamkrelidze (ed.), Consultants Bureau, New York and London, 1987.
[2] Aubin, J.-P. and Frankowska, H.: Set-Valued Analysis, Birkhäuser Verlag, Boston, 1990.
[3] Christensen, J.P.R.: Topology and Borel Structure, North-Holland Mathematics Studies 10, Notas de Matemática (51), L. Nachbin (ed.), North-Holland / American Elsevier, Amsterdam, London, New York, 1974.
[4] Clarke, F.H.: Optimization and Nonsmooth Analysis, John Wiley and
Sons, New York, 1983.
[5] Galeev, E.M. and Tikhomirov, V.M.: A Brief Course of the Theory of Extremal Problems, Moscow State Univ., 1989 (in Russian).
[6] Georgiev, P.G. and Zlateva, N.P.: Second subdifferentials of $C^{1,1}$ functions: optimality conditions and well posedness, Compt. Rend. Acad. Bulg. Sci. 46, 11 (1993), 25-28.
[7] Holmes, R.B.: Geometric Functional Analysis and its Applications, Graduate Texts in Mathematics 24, Springer-Verlag, New York, 1975.
[8] Hiriart-Urruty, J.-B., Strodiot, J.J. and Nguyen, V.H.: Generalized Hessian matrix and second-order optimality conditions for problems with $C^{1,1}$ data, Appl. Math. Optim. 11 (1984), 43-56.
[9] Hiriart-Urruty, J.B.: Characterizations of the plenary hull of the generalized Jacobian matrix, Math. Programming Study, 17 (1982), 1-12.
[10] Ioffe, A.D.: On some recent developments in the theory of second order optimality conditions, in S. Dolecki (ed.), Optimization, Proc. 5th French-German Conf., Varetz (France) 1988, Lect. Notes Math. 1405, Springer-Verlag, Berlin, 1989, pp. 55-68.
[11] Ioffe, A.D.: Composite optimization: second order conditions, value functions and sensitivity, in Analysis and Optimization of Systems, Proc. 9th Int. Conf., Antibes (France) 1990, Lecture Notes Control Inf. Sci. 144, 1990, pp. 442-451.
[12] Penot, J.-P. and Michel, P.: Second-order moderate derivatives, Nonlinear Analysis: Theory, Methods and Applications 22 (1994), 809-823.
[13] Phelps, R.R.: Convex Functions, Monotone Operators and Differentiability, Lecture Notes in Math. 1364, Springer-Verlag, Berlin, 2nd ed.,
1993.
[14] Rockafellar, R.T.: First- and second-order epi-differentiability in nonlinear programming, Trans. Amer. Math. Soc. 307 (1988), 75-108.
[15] Rockafellar, R.T.: Generalized second derivatives of convex functions and saddle functions, Trans. Amer. Math. Soc. 322 (1990), 51-77.
[16] Thibault, L.: On generalized differentials and subdifferentials of Lipschitz vector-valued functions, Nonlinear Analysis Theory, Meth. and
Appl. 6 (1982), 1037-1053.
[17] Yang, X.: Second-order conditions in C 1,1 optimization with applications, Numer. Funct. Anal. Optim. 14 (1993), 621-633.