
ON A THEOREM OF DANSKIN
WITH AN APPLICATION TO A
THEOREM OF VON NEUMANN-SION
Pierre BERNHARD and Alain RAPAPORT
INRIA Sophia Antipolis
August 1992
Abstract. Several versions of Danskin's theorem, which deals with the derivative (or subdifferential) of the upper envelope $\bar J(u) = \sup_{v \in V} J(u,v)$ of a family of functions, are given. Some versions do not require compactness of the set over which the maximizing variable $v$ ranges. It is then shown that the theorem of Von Neumann-Sion can be seen as a simple consequence of a convex-concave version of Danskin's theorem.
0. Introduction
0.1. The problem considered and related work
In a book published in 1967 [7], Danskin proves the following theorem.
Hypotheses. Let $V$ be a compact topological space, and $J$ a map from $\mathbb{R}^n \times V$ into $\mathbb{R}$, assumed to be jointly continuous and $C^1$ w.r.t. the first variable. Let
\[ \bar J(u) = \max_{v \in V} J(u,v), \]
and
\[ \hat V(u) = \{ v \in V \mid J(u,v) = \bar J(u) \}. \]
The theorem is as follows:
Theorem 0. (Danskin) The function $\bar J$ has, for every $u$ and $h$ in $\mathbb{R}^n$, a directional derivative at $u$ in the direction $h$ given by
\[ D\bar J(u;h) = \max_{v \in \hat V(u)} \sum_{i=1}^n h_i J_i(u,v), \]
where $J_i$ stands for the partial derivative w.r.t. the component $u_i$ of $u$.
Let $D_1 J(u,v;h)$ denote the directional partial derivative of $J$ w.r.t. its first variable in the direction $h$; the above formula can then be written
\[ D\bar J(u;h) = \max_{v \in \hat V(u)} D_1 J(u,v;h). \tag{0} \]
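To fix ideas, here is a classical illustration of formula (0); the example is ours, not Danskin's. Take $n = 1$, $V = [-1,1]$, and $J(u,v) = uv$, so that
\[ \bar J(u) = \max_{v \in [-1,1]} uv = |u|, \qquad \hat V(0) = [-1,1], \]
and formula (0) yields
\[ D\bar J(0;h) = \max_{v \in [-1,1]} vh = |h|, \]
which is indeed the directional derivative of $|\cdot|$ at the origin, although $\bar J$ is not differentiable there.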
Since 1967, much work has been devoted to improving this result, or to related ones. There have been two main directions of research, one in the domain of convex analysis, and the other for non-convex non-differentiable functions.
Early work in the first area is described in Valadier's contribution [15]. A recent account can be found in [2] or [1]. This last reference, for instance, contains, as its Theorem 4.4, p. 53, exactly our theorem C1 below. Most of this literature is set in infinite dimension, as is our work. However, little work has been carried out without the compactness assumption. Valadier's work is a notable exception, and although rarely quoted, supersedes many later accounts. We apply his formula to the Von Neumann-Sion theorem. While M. Valadier, like the work on generalized subdifferentials quoted in the next paragraph, needed to look at the subdifferential at neighboring points, we propose instead a set of hypotheses with more regularity (mainly uniformity). (Furthermore, the simple "convex-concave" result we obtain, somewhat ad hoc for the application to the Von Neumann problem, does not seem to have been pointed out before.)
Although early work needed the differentiability hypothesis (a form can be found, for instance, in [8], lemma 15.1, p. 53), later work concentrated on the use of generalized (sub)differentials, such as Clarke's. A typical case of such results can be found in [11]. These results are always in finite dimensional spaces, and lead to estimates of a generalized subdifferential (generally Clarke's subdifferential) of the upper envelope, i.e. supersets, while we concentrate on exact expressions, in infinite dimensional spaces. More importantly, this work proves the existence of a generalized subdifferential, while our results give the existence of ordinary directional derivatives. The price to be paid is that our results require more regularity (again, uniformity), and in particular do not deal in detail with infinite slopes and singular subdifferentials. Again, we give results without compactness, particularly useful in the context of infinite dimensional spaces, that do not seem to have been considered before.
Related work on the sensitivity analysis of optimization problems, where what is sought is the derivative of a constrained max (or min) with respect to a variable occurring in the constraint of the optimization problem, should also be mentioned. Typically, one seeks the derivative or subdifferential of $\hat J(u)$ defined by
\[ \tilde V(u) = \{ v \in V \mid A(u,v) \le 0 \}, \]
or, more specifically,
\[ \tilde V(u) = \{ v \in V \mid A(v) \le u \}, \]
and
\[ \hat J(u) = \max_{v \in \tilde V(u)} J(v). \]
The two problems are very closely connected: one simple way of observing this is to rewrite the latter as follows. Let
\[ \tilde J(u,v) = J(v) - \chi_{\tilde V(u)}(v), \qquad \hat J(u) = \max_{v \in V} \tilde J(u,v), \]
where $\chi_C$ denotes the indicator function of the set $C$ (zero on $C$, $+\infty$ outside), which is the form of problem considered here. (Although this identification yields results in convex analysis, not in differentiable analysis.)
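As a minimal illustration of this rewriting (our example, chosen for simplicity): take $V = \mathbb{R}$, $J(v) = v$ and $A(v) = v$, so that $\tilde V(u) = \{ v \mid v \le u \}$ and $\hat J(u) = u$. Then
\[ \tilde J(u,v) = v - \chi_{(-\infty,u]}(v) = \begin{cases} v & \text{if } v \le u, \\ -\infty & \text{otherwise,} \end{cases} \]
and indeed $\sup_{v \in \mathbb{R}} \tilde J(u,v) = u = \hat J(u)$, with the constraint now absorbed into the function being maximized.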
Results on this problem most often fall into the second category above, typical examples being [6] and [9]. Let us also quote the earlier paper [10], which dealt with the more general problem where the variable $u$ enters both the function to be maximized and the constraint. This paper falls into the category mentioned above: finite dimension and inclusions, very much in the spirit of the later papers quoted, which improve its results by weakening the regularity hypotheses and by treating the singular subdifferential.
A somewhat different issue was taken up by [5], who was able to relax all qualification hypotheses for the constraint, at the cost of having directional derivatives in certain
directions only. This is still in finite dimension.
Finally, let us quote the notable exception of [14], which deals with the same general problem as [10], obtaining estimates of generalized subdifferentials, including the singular case, but in infinite dimension. However, his results depend on a hypothesis on the regularity of the solution of the optimization problem, and the only sufficient conditions we know of that ensure this hypothesis have been stated in a finite dimensional space. (We could not check one of his references, to Dolecki, "to appear".)
Lastly, the link we show between these results and the Von Neumann-Sion theorem seems to be new.
This paper is based on a previous internal report [4]. A specialized version of the
“differentiable” theorem can be found in [3], chapter 9.
0.2 General framework
The following framework holds throughout the paper, and will not be repeated in the
sequel.
$U$ and $V$ are subsets of a Banach space $\mathcal{U}$ and of a topological space $\mathcal{V}$ respectively. $J$ is a mapping from $U \times V$ into $\mathbb{R}$. The directional derivative of $u \mapsto J(u,v)$ in a direction $h$ of $\mathcal{U}$ is denoted by $D_1 J(u,v;h)$, and its subdifferential, in the case of a convex function, by $\partial_1 J(u,v)$. Define also
\[ \bar J(u) = \sup_{v \in V} J(u,v), \tag{1} \]
and, when it exists,
\[ \hat V(u) = \{ v \in V \mid J(u,v) = \bar J(u) \}. \]
We shall also need to consider
\[ W(u) = \left\{ \{v_n\} \mid J(u,v_n) \to \bar J(u) \text{ as } n \to \infty \right\}, \tag{2} \]
the set of maximizing sequences $\{v_n\}$ at $u$.
Our aim is to characterize the directional derivatives $D\bar J(u;h)$ of $\bar J$, or, when it is convex, its subdifferential $\partial \bar J(u)$.
1. The differentiable case
1.1 V compact
Firstly, a slightly improved version of theorem 0 above is stated.
Hypotheses D1.
D1.0 $V$ is compact.
D1.1 $\forall v \in V$, the map $(t,v) \mapsto J(u+th, v)$ is upper semi-continuous (u.s.c.) at $(0,v)$.
D1.2 $\forall v \in V$ and $\forall t$ in a right neighborhood of 0, there exists a bounded directional derivative
\[ D_1 J(u+th, v; h) = \lim_{\tau \to 0^+} \frac{1}{\tau} \left[ J(u+(t+\tau)h, v) - J(u+th, v) \right], \]
D1.3 moreover, the map $(t,v) \mapsto D_1 J(u+th, v; h)$ is upper semi-continuous at $(0,v)$.
Theorem D1. Under hypotheses D1, the function $\bar J$ has a directional derivative at $u$ in the direction $h$, given by the formula
\[ D\bar J(u;h) = \max_{v \in \hat V(u)} D_1 J(u,v;h). \]
Proof. It should be noticed that, by assumption D1.3, the map $v \mapsto D_1 J(u,v;h)$ is u.s.c., so that, $V$ being compact, the maximum is reached.
Let, for convenience,
\[ \Delta(t) = \frac{1}{t} \left[ \bar J(u+th) - \bar J(u) \right]. \tag{1.1} \]
Proposition 1. One has
\[ \liminf_{t \to 0} \Delta(t) \ge \max_{v \in \hat V(u)} D_1 J(u,v;h). \]
Proof of the proposition. Let $\hat v \in \hat V(u)$. By definition, $\bar J(u) = J(u,\hat v)$, and $\bar J(u+th) \ge J(u+th, \hat v)$. Thus
\[ \Delta(t) \ge \frac{1}{t} \left[ J(u+th, \hat v) - J(u,\hat v) \right]. \]
Taking the liminf,
\[ \liminf_{t \to 0} \Delta(t) \ge D_1 J(u,\hat v; h), \]
and since this holds for any $\hat v$ in $\hat V(u)$, the proposition is proved.
Proposition 2. Let $\{t_n\}$ be a sequence of positive real numbers going to zero, and, for all $n$, $v_n \in \hat V(u+t_n h)$. Then
\[ v_n \to \hat V(u), \]
and
\[ J(u+t_n h, v_n) \to \bar J(u). \]
(The map $t \mapsto \hat V(u+th)$ is said to be u.s.c. at 0.)
Proof of the proposition. Proposition 1 implies that $\Delta(t_n)$ is bounded below: $\exists a$ such that, for $n$ large enough, $\Delta(t_n) \ge a$. Thus also
\[ \bar J(u+t_n h) \ge \bar J(u) + a t_n. \]
Therefore $\liminf \bar J(u+t_n h) \ge \bar J(u)$. Now, $V$ is compact. Let therefore $\bar v$ be a cluster point of the sequence $\{v_n\}$. One has
\[ \bar J(u) \ge J(u,\bar v) \ge \limsup J(u+t_n h, v_n) \ge \liminf J(u+t_n h, v_n) \ge \bar J(u). \]
The first inequality follows from the definition of $\bar J$, the second one from hypothesis D1.1, and the last one, since $J(u+t_n h, v_n) = \bar J(u+t_n h)$, from what has just been shown. Thus all inequalities are equalities, from which it can be concluded that $\bar v \in \hat V(u)$, and that the limit $\lim J(u+t_n h, v_n) = \bar J(u)$ exists.
Proposition 3.
\[ \limsup_{t \to 0} \Delta(t) \le \max_{v \in \hat V(u)} D_1 J(u,v;h). \]
Proof of the proposition. With the same notations as in proposition 2, one has
\[ \Delta(t_n) = \frac{1}{t_n} \left[ J(u+t_n h, v_n) - J(u,v_n) \right] + \frac{1}{t_n} \left[ J(u,v_n) - \bar J(u) \right]. \]
By definition of $\bar J$, the second term is nonpositive, hence
\[ \Delta(t_n) \le \frac{1}{t_n} \left[ J(u+t_n h, v_n) - J(u,v_n) \right]. \]
The function $t \mapsto J(u+th, v_n)$ having for all $t \in [0,t_n]$ a bounded directional derivative, it is absolutely continuous, and there exists $t'_n \in [0,t_n]$ such that
\[ D_1 J(u+t'_n h, v_n; h) \ge \frac{1}{t_n} \left[ J(u+t_n h, v_n) - J(u,v_n) \right], \]
hence $\Delta(t_n) \le D_1 J(u+t'_n h, v_n; h)$. Due to hypothesis D1.3, taking the limsup,
\[ \limsup \Delta(t_n) \le D_1 J(u,\bar v; h), \]
where $\bar v$ is any cluster point of the sequence $\{v_n\}$. Using proposition 2, $\bar v \in \hat V(u)$, and thus a fortiori the result claimed.
Finally, propositions 1 and 3 together prove the theorem.
Corollary. If $u \mapsto J(u,v)$ has a Gâteaux derivative $J'_u$, and if the maximizer is unique, $\hat V(u) = \{\hat v\}$, then $\bar J$ has a Gâteaux derivative $\bar J'(u)$ given by the simple formula
\[ \bar J'(u) = J'_u(u,\hat v). \]
Proof. It follows from theorem D1 that, since $D_1 J(u,v;h) = J'_u(u,v) \cdot h$,
\[ D\bar J(u;h) = J'_u(u,\hat v) \cdot h. \]
This equality proves the claim.
1.2 Uniform case
The compactness hypothesis on $V$ can be traded for more regularity on $J$, for instance in the following way ($u$ and $h$ are as in hypotheses D1).
Hypotheses D2.
D2.1 The map $u \mapsto J(u,v)$ is uniformly directionally differentiable in the following sense:
\[ \forall \epsilon > 0, \exists \tau > 0 : \forall t \in (0,\tau), \forall v \in V, \quad \left| \frac{1}{t} \left[ J(u+th, v) - J(u,v) \right] - D_1 J(u,v;h) \right| \le \epsilon. \]
D2.2 The directional derivative $D_1 J(u+th, v; h)$ is bounded in a right neighborhood of 0 in $t$, uniformly in $v \in V$.
D2.3 The map $t \mapsto D_1 J(u+th, v; h)$ is u.s.c. at 0, uniformly in $v \in V$.
Remark. Hypotheses D2.1 and D2.3 may be lumped into either of the following two stronger hypotheses:
D2.a The map $u \mapsto J(u,v)$ is uniformly directionally differentiable in the following stronger sense: for $\lambda > 0$, write $u_\lambda = u + \lambda h$. The hypothesis reads
\[ \exists \theta > 0 : \forall \epsilon > 0, \exists \tau > 0 : \forall t \in (0,\tau), \forall \lambda < \theta, \forall v \in V, \quad \left| \frac{1}{t} \left[ J(u_\lambda + th, v) - J(u_\lambda, v) \right] - D_1 J(u_\lambda, v; h) \right| \le \epsilon. \]
D2.b At the point $u$, $J$ has a second directional derivative with respect to its first variable in the direction $h$, uniformly bounded in $v$.
Theorem D2. Under hypotheses D2, for all $t$ in a right neighborhood of 0 one has $\bar J(u+th) < \infty$, and $\bar J$ has a directional derivative in the direction $h$, given by
\[ D\bar J(u;h) = \sup_{\{v_k\} \in W(u)} \limsup_{k \to \infty} D_1 J(u,v_k;h). \]
Remark. One could agree, with no ambiguity, to simply write the r.h.s. above as
\[ D\bar J(u;h) = \limsup_{\{v_k\} \in W(u)} D_1 J(u,v_k;h). \]
Proof. Let us call $D$ the r.h.s. of the above equality, and let us define $\Delta(t)$ as in (1.1). In the sequel, two sequences $\{t_n\}$ and $\{\epsilon_n\}$ of positive numbers are selected such that $t_n \to 0$ and $\epsilon_n / t_n \to 0$ as $n \to \infty$ (say, e.g., $\epsilon_n = t_n^2$).
Proposition 1. $\liminf \Delta(t_n) \ge D$.
Proof of the proposition. Let $\delta$ be a positive number. Choose $N$ such that $\forall n > N$,
\[ \frac{\epsilon_n}{t_n} < \frac{\delta}{3}, \tag{i} \]
and
\[ \forall v \in V, \quad \frac{1}{t_n} \left[ J(u+t_n h, v) - J(u,v) \right] \ge D_1 J(u,v;h) - \frac{\delta}{3}. \tag{ii} \]
This is possible due to hypothesis D2.1. Let also $\{v_k\} \in W(u)$ be a maximizing sequence at $u$:
\[ \forall n, \exists K_n : \forall k > K_n, \quad J(u,v_k) \ge \bar J(u) - \epsilon_n. \]
Hence, $\forall n > N, \forall k > K_n$,
\[ \Delta(t_n) \ge \frac{1}{t_n} \left[ \bar J(u+t_n h) - J(u,v_k) \right] - \frac{\epsilon_n}{t_n} \ge \frac{1}{t_n} \left[ J(u+t_n h, v_k) - J(u,v_k) \right] - \frac{\delta}{3}. \]
By (ii), $\forall k > K_n$, $\Delta(t_n) \ge D_1 J(u,v_k;h) - 2\delta/3$. Let $k$ go to infinity to conclude that
\[ \Delta(t_n) \ge \limsup_{k} D_1 J(u,v_k;h) - 2\delta/3. \]
But since $\{v_k\}$ is an arbitrary maximizing sequence, it may be chosen such that
\[ \limsup_{k} D_1 J(u,v_k;h) \ge D - \delta/3. \]
This gives, $\forall n > N$, $\Delta(t_n) \ge D - \delta$, which proves the proposition, $\delta$ being an arbitrary positive number.
Proposition 2. Let $\{v_n\}$ be a sequence in $V$ such that
\[ \forall n > 0, \quad J(u+t_n h, v_n) \ge \bar J(u+t_n h) - \epsilon_n. \tag{1.2} \]
Then $\{v_n\} \in W(u)$.
Proof of the proposition. One has
\[ \bar J(u) \ge J(u,v_n) \ge J(u+t_n h, v_n) - t_n D_1 J(u,v_n;h) - t_n \eta_n, \]
where $\eta_n \to 0$ by hypothesis D2.1. $D_1 J$ being bounded by hypothesis D2.2, it follows that
\[ \bar J(u) \ge J(u,v_n) \ge J(u+t_n h, v_n) - \delta_n, \qquad \delta_n \to 0. \]
Making use of the definition (1.2) of the sequence $\{v_n\}$, and of proposition 1, which implies that $\bar J(u+t_n h) \ge \bar J(u) + \gamma_n$ where $\gamma_n \to 0$, one finally gets
\[ \bar J(u) \ge J(u,v_n) \ge \bar J(u) - \delta_n - \epsilon_n + \gamma_n \to \bar J(u), \]
which proves the proposition.
Proposition 3. $\limsup \Delta(t_n) \le D$.
Proof of the proposition. The sequence $\{v_n\}$ is still as in (1.2). By definition, one has
\[ \Delta(t_n) \le \frac{1}{t_n} \left[ J(u+t_n h, v_n) - \bar J(u) \right] + \frac{\epsilon_n}{t_n} \le \frac{1}{t_n} \left[ J(u+t_n h, v_n) - J(u,v_n) \right] + \frac{\epsilon_n}{t_n}. \tag{1.3} \]
As in the proof of theorem D1, by hypothesis D2.2, there exists $t'_n \in [0,t_n]$ such that
\[ D_1 J(u+t'_n h, v_n; h) \ge \frac{1}{t_n} \left[ J(u+t_n h, v_n) - J(u,v_n) \right]. \]
Moreover, making use of hypothesis D2.3, for $n$ large enough,
\[ D_1 J(u+t'_n h, v_n; h) \le D_1 J(u,v_n;h) + \eta_n, \qquad \eta_n \to 0, \]
so that, making further use of (1.3),
\[ \Delta(t_n) \le D_1 J(u,v_n;h) + \frac{\epsilon_n}{t_n} + \eta_n, \]
and taking a limsup,
\[ \limsup \Delta(t_n) \le \limsup D_1 J(u,v_n;h) \le D. \]
This proves the proposition, because, due to proposition 2, $\{v_n\} \in W(u)$.
Finally, propositions 1 and 3 together prove the theorem.
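The following example, which is ours, shows how theorem D2 operates when no maximum is attained. Take $U = \mathcal{U} = \mathbb{R}$, $V = (0,1)$ (not compact), and $J(u,v) = uv$. Hypotheses D2 hold trivially, the difference quotient being exactly $D_1 J(u,v;h) = vh$, uniformly in $v$. Here
\[ \bar J(u) = \sup_{v \in (0,1)} uv = \max(u, 0), \]
and at $u = 0$ every sequence in $(0,1)$ is maximizing, so that
\[ D\bar J(0;h) = \sup_{\{v_k\} \in W(0)} \limsup_{k} v_k h = \max(h, 0), \]
the supremum being attained by sequences $v_k \to 1$ if $h > 0$ and $v_k \to 0$ if $h < 0$. This agrees with the directional derivative of $u \mapsto \max(u,0)$ at 0; note that no single $v \in \hat V(0) = (0,1)$ attains the value $\max(h,0)$ when $h > 0$, so the maximum over $\hat V(u)$ of theorem D1 would not make sense here.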
2. The convex case
Versions in convex analysis of the preceding two theorems are now given. They are closely connected to them by the remark that, for a convex function $f$, the map $h \mapsto Df(u;h)$ is the support function of its subdifferential $\partial f(u)$. Thus the two theorems with compactness have identical conclusions. However, slight differences in the regularity requirements seem to prevent the "convex" theorems from being strict corollaries of the "differentiable" ones.
As has been pointed out, the first theorem below is not new. (See [15], [1].) The proof given below is not as elegant as in these references. It has been chosen, on the one hand, to parallel the proofs in the differentiable case, and on the other hand to set the stage for the proof of the theorem without compactness, which seems to be original.
2.1 V compact
Hypotheses C.
C0 $V$ is (sequentially) compact in a topology for which, $\forall u \in U$, the map $v \mapsto J(u,v)$ is u.s.c.
C1 $U$ is convex and, $\forall v \in V$, the function $u \mapsto J(u,v)$ is convex. Its subdifferential is denoted by $\partial_1 J(u,v)$.
C2 There exist $u_0 \in U$, a neighborhood $\tilde U$ of $u_0$ and a real number $a$ such that $\forall (u,v) \in \tilde U \times V$, $J(u,v) \le a$.
Notice that hypothesis C2 implies that, $\forall \tilde u \in \tilde U$, $\bar J(\tilde u) \le a$. The following definition is therefore introduced:
Definition. Let $U_0$ be the interior of the subset of $U$ where $\bar J$ is finite.
Lemma 1. In the presence of hypothesis C1, hypothesis C2 is equivalent to the following hypothesis C2a:
C2a Let $u_0 \in U$. There exist a (bounded) neighborhood $\tilde U$ of $u_0$ and a real number $b$ such that
\[ \forall \tilde u \in \tilde U, \forall v \in V, \quad \exists \tilde p \in \partial_1 J(\tilde u, v) \text{ with } \|\tilde p\| \le b. \]
Proof of the lemma. Let us show that hypothesis C2a and $\bar J(u_0) < \infty$ imply C2. Let $\eta$ be such that $\tilde u \in \tilde U$ implies $\|\tilde u - u_0\| \le \eta$. One has, $\forall (\tilde u, v) \in \tilde U \times V$, and with $\tilde p \in \partial_1 J(\tilde u, v)$ chosen such that $\|\tilde p\| \le b$,
\[ \bar J(u_0) \ge J(u_0, v) \ge J(\tilde u, v) - (\tilde p, \tilde u - u_0), \]
hence
\[ J(\tilde u, v) \le \bar J(u_0) + (\tilde p, \tilde u - u_0) \le \bar J(u_0) + b\eta. \]
The converse is elementary, taking $\tilde U$ in C2a strictly included in $\tilde U$ in C2.
Theorem C1. Under hypotheses C, the function $\bar J$ is convex and continuous over $U_0$, and its subdifferential at $u \in U_0$ is given by the formula
\[ \partial \bar J(u) = \mathrm{co} \bigcup_{v \in \hat V(u)} \partial_1 J(u,v). \]
Proof. Let us first notice that, being the upper envelope of a family of convex functions, $\bar J$ is itself convex. According to C2, it is bounded above in a neighborhood of $u_0$, and thus also near every point of $U_0$, providing a uniform upper bound of $J(u,v)$ in the neighborhood of every point of $U_0$. Thus $\partial \bar J$ and $\partial_1 J$ exist over that set, and by compactness of $V$, $\hat V(u)$ is nonempty, so that the above formula has a meaning.
Notice also that the classical proof of the continuity of a locally bounded convex function also proves the uniformity in $v$ of the continuity of $u \mapsto J(u,v)$, since the upper bound is uniform. It then easily follows, making use of C0, that the map $(u,v) \mapsto J(u,v)$ is u.s.c.
Proposition 1. One has
\[ \partial \bar J(u) \supset \mathrm{co} \bigcup_{v \in \hat V(u)} \partial_1 J(u,v). \]
Proof of proposition 1. Let $\hat v \in \hat V(u)$, and $p \in \partial_1 J(u,\hat v)$. Then
\[ \forall w \in U, \quad J(w,\hat v) \ge J(u,\hat v) + (p, w-u) = \bar J(u) + (p, w-u), \]
and thus
\[ \forall w \in U, \quad \bar J(w) \ge \bar J(u) + (p, w-u), \]
i.e., $p \in \partial \bar J(u)$. Since $\hat v$ was arbitrary in $\hat V(u)$, and $p$ arbitrary in $\partial_1 J(u,\hat v)$, it follows that $\partial \bar J(u)$ includes the union of the subdifferentials $\partial_1 J$. Finally, a subdifferential being convex, this proves the proposition.
Proposition 2. Let $h \in U - u$ and $t_n \to 0^+$ (or $t_n \searrow 0$) as $n \to \infty$, and $v_n \in \hat V(u+t_n h)$. Then $v_n \to \hat V(u)$.
Proof of proposition 2. Since $V$ is compact, the sequence $\{v_n\}$ has at least one cluster point $\bar v$. Let $\hat v \in \hat V(u)$. This gives:
\[ J(u,\bar v) \ge \limsup J(u+t_n h, v_n) \ge \limsup J(u+t_n h, \hat v) = J(u,\hat v) = \bar J(u). \]
The first inequality holds because of the upper semicontinuity of $J$, the second by the definition of $v_n$. The continuity of $J$ in $u$ and the definition of $\hat v$ give the two equalities. Therefore $J(u,\bar v) = \bar J(u)$, and this proves the proposition.
Proposition 3. Let $h$, $t_n$ and $v_n$ be as above, and $p_n \in \partial_1 J(u+t_n h, v_n)$. There exists $\hat v \in \hat V(u)$ such that
\[ \limsup (p_n, h) \le \sup_{p \in \partial_1 J(u,\hat v)} (p,h) = D_1 J(u,\hat v; h). \]
Proof of the proposition. Let $L = \limsup (p_n,h)$, and $\{p_m\}$ be a subsequence such that $(p_m,h) \to L$. Let also $v_m \in \hat V(u+t_m h)$, and extract again a subsequence, indexed by $k$, such that $v_k \to \hat v \in \hat V(u)$. Let us write $D = D_1 J(u,\hat v;h)$.
Let $\epsilon > 0$ be fixed. The slope $[J(u+th,\hat v) - J(u,\hat v)]/t$ being, for a convex function, decreasing as $t$ decreases to 0, this function has a directional derivative, and
\[ \exists \tau > 0 : \forall t < \tau, \quad J(u+th,\hat v) < J(u,\hat v) + t(D+\epsilon). \tag{2.1} \]
On the other hand, one always has
\[ \forall t, \quad J(u+t_k h + th, v_k) \ge J(u+t_k h, v_k) + t(p_k,h) \ge J(u+t_k h, \hat v) + t(p_k,h). \]
Taking the limsup, and taking into account the fact that $J$ is u.s.c.,
\[ J(u+th,\hat v) \ge J(u,\hat v) + t \limsup (p_k,h) = J(u,\hat v) + tL. \]
Comparing this last inequality with (2.1), for $t < \tau$, one gets $L < D + \epsilon$, and this proves the proposition.
Proposition 4.
\[ \sup_{\bar p \in \partial \bar J(u)} (\bar p, h) \le \sup_{\substack{v \in \hat V(u) \\ p \in \partial_1 J(u,v)}} (p,h). \]
Proof of the proposition. The subdifferential is a monotone operator:
\[ \forall \bar p_n \in \partial \bar J(u+t_n h), \forall \bar p \in \partial \bar J(u), \quad (\bar p_n, h) \ge (\bar p, h). \]
Specifically,
\[ \inf_{\bar p_n \in \partial \bar J(u+t_n h)} (\bar p_n, h) \ge \sup_{\bar p \in \partial \bar J(u)} (\bar p, h). \]
Moreover, making use of proposition 1 at $u+t_n h$, one gets, with the same notations as above,
\[ \inf_{\bar p_n \in \partial \bar J(u+t_n h)} (\bar p_n, h) \le \inf_{p_n \in \partial_1 J(u+t_n h, v_n)} (p_n, h). \]
Regrouping the two inequalities, we obtain
\[ \sup_{\bar p \in \partial \bar J(u)} (\bar p, h) \le (p_n, h) \qquad \forall p_n \in \partial_1 J(u+t_n h, v_n). \]
Making use of proposition 3, it can be inferred that there exists $\hat v \in \hat V(u)$ such that
\[ \sup_{\bar p \in \partial \bar J(u)} (\bar p,h) \le \sup_{p \in \partial_1 J(u,\hat v)} (p,h), \]
and a fortiori the inequality claimed.
Now, proposition 4 implies the inclusion opposite to that proved in proposition 1, and the two together prove the equality claimed in the theorem.
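A one-dimensional illustration of theorem C1, with our own data: take $U = \mathbb{R}$, $V = \{-1, 1\}$ (compact), and $J(u,v) = uv$, convex (indeed linear) in $u$ and u.s.c. in $v$. Then $\bar J(u) = |u|$, $\hat V(0) = \{-1,1\}$, $\partial_1 J(0,v) = \{v\}$, and the formula gives
\[ \partial \bar J(0) = \mathrm{co}\,(\{-1\} \cup \{1\}) = [-1,1], \]
the well-known subdifferential of the absolute value at the origin. The convex hull operation is essential here: the bare union $\{-1,1\}$ misses all the intermediate subgradients.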
2.2 The case without compactness
Let us adopt hypotheses C1, C2, and D2.1, where it is recalled that the directional derivative can be seen as the support function of the subdifferential. It will be seen further what can be said without hypothesis D2.1, which is not very natural in this context.
Lemma 1 still holds, with the following precisions:
Lemma 2. Hypothesis C2 is implied by C2a and the hypothesis that $\bar J(u_0) < \infty$. Moreover, hypotheses C1 and C2 imply that, if $u_n \to u \in U_0$, $\{v_n\} \in W(u)$ and $p_n \in \partial_1 J(u_n, v_n)$, then $\limsup \|p_n\| < \infty$.
Proof of the lemma. The first claim has been proved in lemma 1. Let now $u$ be fixed in $U$. Recall that, by C2, $J$ is continuous at $u$, uniformly in $v$. Let $\rho > 0$ be such that the ball $B(u, 2\rho)$ is included in $\tilde U$. Let $h \in \mathcal{U}$, with $\|h\| = \rho$. For $n$ large enough, $u_n + h \in \tilde U$, and thus
\[ a \ge J(u_n + h, v_n) \ge J(u_n, v_n) + (p_n, h) \ge J(u, v_n) - \epsilon_n + (p_n, h), \]
where $\epsilon_n$ goes to zero independently of $v_n$ due to the uniform continuity in $u$. Then, taking into account the fact that, by hypothesis, $\{v_n\} \in W(u)$,
\[ a \ge \bar J(u) - \eta_n - \epsilon_n + (p_n, h), \]
where again $\eta_n \to 0$. Therefore
\[ (p_n, h) \le a - \bar J(u) + \epsilon_n + \eta_n, \]
whence
\[ \limsup \|p_n\| \le \frac{a - \bar J(u)}{\rho}. \]
Indeed, otherwise one could choose $\delta > 0$ such that $\limsup \|p_n\| > (a - \bar J(u))/\rho + 2\delta$, then $N$ such that for $n > N$, $\epsilon_n + \eta_n < \delta\rho$, and $k > N$ such that $\|p_k\| > (a - \bar J(u))/\rho + \delta$. Taking then $\bar u_k \in \mathcal{U}$ of unit norm such that $(p_k, \bar u_k) = \|p_k\|$, and $h = \rho \bar u_k$, one obtains a contradiction with the above inequality.
In order to simplify the statement of the next theorem, let us introduce the following natural definition.
Definition. Let $\mathcal{U}'$ be the topological dual space of $\mathcal{U}$, and $\{D_n\}$ a sequence of subsets of $\mathcal{U}'$. Define $\limsup D_n$ as the set of all limits in the weak-star topology of sequences $\{d_n\}$ of elements of $D_n$:
\[ \limsup_{n \to \infty} D_n = \left\{ d \mid \exists d_n \in D_n : d_n \stackrel{\star}{\rightharpoonup} d \right\} = \bigcap_{k=1}^{\infty} \overline{\bigcup_{n=k}^{\infty} D_n}. \]
(The closure operator in the last expression being in the sense of the weak-star topology.)
The next theorem can now be stated:
Theorem C2. Under hypotheses C1, C2, and D2.1, the function $\bar J$ has at every $u \in U_0$ a subdifferential given by the following formula:
\[ \partial \bar J(u) = \mathrm{co} \bigcup_{\{v_n\} \in W(u)} \limsup_{n \to \infty} \partial_1 J(u,v_n), \]
or equivalently
\[ \partial \bar J(u) = \mathrm{co} \bigcup_{\{v_n\} \in W(u)} \bigcap_{k=1}^{\infty} \overline{\bigcup_{n=k}^{\infty} \partial_1 J(u,v_n)}. \]
(See another formulation after the proof.)
Proof. As in theorem C1, $\bar J$ is convex and continuous over $U_0$. Let us use the notation
\[ D = \bigcup_{\{v_n\} \in W(u)} \limsup_{n \to \infty} \partial_1 J(u,v_n). \]
According to lemma 2, $D$ is bounded.
Proposition 1. $\partial \bar J(u) \supset \mathrm{co}\, D$.
Proof of the proposition. Let $\tilde p \in D$. By definition, there exist a maximizing sequence $\{v_k\}$ and a sequence $p_k \in \partial_1 J(u,v_k)$ such that $p_k \stackrel{\star}{\rightharpoonup} \tilde p$. This gives, $\forall h$,
\[ \bar J(u+h) \ge J(u+h, v_k) \ge J(u,v_k) + (p_k, h) = \bar J(u) - \epsilon_k + (p_k,h), \]
where $\epsilon_k \to 0$, whence, taking the limit,
\[ \bar J(u+h) \ge \bar J(u) + (\tilde p, h). \]
Thus $D \subset \partial \bar J(u)$; the latter being convex, the proposition is proved.
Proposition 2. Let $t_n \searrow 0$, $\epsilon_n \searrow 0$, and $v_n$ be such that $J(u+t_n h, v_n) \ge \bar J(u+t_n h) - \epsilon_n$, and finally $p_n \in \partial_1 J(u+t_n h, v_n)$. Then $\{v_n\} \in W(u)$.
Proof of the proposition. One has
\[ \bar J(u) \ge J(u,v_n) \ge J(u+t_n h, v_n) - t_n (p_n, h) \ge \bar J(u+t_n h) - \epsilon_n - t_n (p_n, h). \]
Let $p \in \partial \bar J(u)$ (it has been seen that it is not empty). Using it to bound from below the last occurrence of $\bar J$ above, one easily obtains
\[ \bar J(u) \ge J(u,v_n) \ge \bar J(u) + t_n (p, h) - \epsilon_n - t_n (p_n, h). \]
By lemma 2, $p_n$ is bounded, hence the proposition.
Proposition 3. Let $t_n$, $\epsilon_n$, $v_n$ be as in proposition 2. Let furthermore, with a slight abuse of notation,
\[ D = \sup_{\tilde p \in D} (\tilde p, h) \quad \text{and} \quad D_n = \sup_{p_n \in \partial_1 J(u,v_n)} (p_n, h). \]
Then $\limsup D_n \le D$.
Proof of the proposition. For all $n$, one can choose $\hat p_n \in \partial_1 J(u,v_n)$ such that $D_n \ge (\hat p_n, h) \ge D_n - \epsilon_n$. Thus
\[ \limsup D_n = \limsup (\hat p_n, h). \]
Extracting a subsequence $\{\hat p_k\}$ of $\{\hat p_n\}$ such that $(\hat p_k, h) \to \limsup (\hat p_n, h)$, and again a weak-star convergent subsequence, converging to, say, $\tilde p \in D$, one gets
\[ \limsup (\hat p_n, h) = (\tilde p, h) \le D, \]
which proves the proposition.
Proposition 4. Let $t_n \searrow 0$, and for each $n$, $\{v_n^k\}_k \in W(u+t_n h)$. Let
\[ D_n = \limsup_{k \to \infty} \partial_1 J(u+t_n h, v_n^k). \]
Then, if $\tilde p_n \in D_n$, one has
\[ \limsup (\tilde p_n, h) \le D = \sup_{\tilde p \in D} (\tilde p, h). \]
Proof of the proposition. Let $\tilde p_n \in D_n$. Let us choose $\{v_n^k\}_k \in W(u+t_n h)$ and $p_n^k \in \partial_1 J(u+t_n h, v_n^k)$ such that $p_n^k \stackrel{\star}{\rightharpoonup} \tilde p_n$. Let us also choose $k_n$ such that, for a fixed sequence $\epsilon_n \searrow 0$, with the notations $v_n^{k_n} = v_n$ and $p_n^{k_n} = p_n$, the following holds:
\[ J(u+t_n h, v_n) \ge \bar J(u+t_n h) - \epsilon_n \]
and
\[ |(p_n - \tilde p_n, h)| \le \epsilon_n. \]
The sequence $\{v_n\}$ is as in proposition 2, and in particular belongs to $W(u)$. Moreover, for all $\alpha > 0$,
\[ J(u+t_n h+\alpha h, v_n) \ge J(u+t_n h, v_n) + \alpha (p_n, h) \ge \bar J(u+t_n h) + \alpha (\tilde p_n, h) - 2\epsilon_n. \]
Picking $\tilde p \in D$, which belongs to $\partial \bar J(u)$ according to proposition 1, one has $\bar J(u+t_n h) \ge \bar J(u) + t_n (\tilde p, h)$, hence
\[ J(u+t_n h+\alpha h, v_n) \ge \bar J(u) + t_n (\tilde p, h) - 2\epsilon_n + \alpha (\tilde p_n, h). \]
On the other hand, let us set
\[ \hat D_n = \sup_{\hat p_n \in \partial_1 J(u,v_n)} (\hat p_n, h) \]
(this is the quantity denoted $D_n$ in proposition 3). For every positive $\eta$, there exists a positive $\alpha_0$ such that, for every positive $\alpha$ smaller than or equal to $\alpha_0$,
\[ J(u+\alpha h, v_n) < J(u,v_n) + \alpha (\hat D_n + \eta) \le \bar J(u) + \alpha (\hat D_n + \eta). \]
Moreover, due to hypothesis D2.1, $\alpha_0$ may be picked independently of $v_n$ (i.e., fixed as $n \to \infty$). Since $J$ is continuous in $u$, uniformly in $v$, for $n$ large enough one has
\[ J(u+t_n h+\alpha h, v_n) \le J(u+\alpha h, v_n) + \epsilon_n, \]
whence, regrouping the last three inequalities,
\[ \exists \alpha_0 > 0 : \forall \alpha \in (0, \alpha_0], \quad \alpha (\hat D_n + \eta) \ge \alpha (\tilde p_n, h) - 3\epsilon_n. \]
Take the limit, using proposition 3, to derive
\[ D + \eta \ge \limsup (\tilde p_n, h), \]
which proves the proposition, since $\eta$ was arbitrary.
Proposition 5. $\partial \bar J(u) \subset \mathrm{co}\, D$.
Proof of the proposition. Let $\bar p \in \partial \bar J(u)$. Since $\partial \bar J$ is a monotone operator,
\[ \forall \bar p_n \in \partial \bar J(u+t_n h), \quad (\bar p, h) \le (\bar p_n, h). \]
Therefore, making use of proposition 1:
\[ (\bar p, h) \le \inf_{\bar p_n \in \partial \bar J(u+t_n h)} (\bar p_n, h) \le \inf_{\tilde p_n \in D_n} (\tilde p_n, h). \]
Finally, taking the limsup and making use of proposition 4,
\[ \forall \bar p \in \partial \bar J(u), \quad (\bar p, h) \le \sup_{\tilde p \in D} (\tilde p, h). \]
Thus $\bar p \in \mathrm{co}\, D$, which proves the proposition.
Finally, propositions 1 and 5 together prove the theorem.
It is useful, at this point, to give an alternate form of the formula of theorem C2. Define the level sets $V_\epsilon$ at $u$ in the following way.
Definition. Let $\epsilon$ be a positive number; define
\[ V_\epsilon(u) = \left\{ v \in V \mid J(u,v) \ge \bar J(u) - \epsilon \right\}. \]
These sets increase with $\epsilon$. When it exists, $\hat V(u)$ is just $V_0(u)$. In terms of these sets, the formula of theorem C2 may be rewritten as follows:
\[ \partial \bar J(u) = \bigcap_{\epsilon > 0} \mathrm{co} \bigcup_{v \in V_\epsilon} \partial_1 J(u,v). \]
The above formulation is the natural one to state the result without the uniformity hypothesis D2.1. The following theorem is proved in [15].
Theorem C3. (Valadier) Under hypotheses C1 and C2, one has
\[ \partial \bar J(u) = \bigcap_{\substack{\epsilon > 0 \\ \Omega}} \mathrm{co} \bigcup_{\substack{v \in V_\epsilon \\ \tilde u \in \Omega}} \partial_1 J(\tilde u, v), \]
where $\Omega$ ranges over a complete set of neighborhoods of $u$.
3. The convex-concave case
The following additional hypothesis will be made.
Hypothesis CC. $V$ is a convex subset of a Banach space $\mathcal{V}$, and $\forall u \in U$, $v \mapsto J(u,v)$ is concave.
Remark. In this case, if furthermore $\mathcal{V}$ is reflexive, the compactness of $V$ in hypothesis C0 may be replaced by $V$ closed and bounded, because, $v \mapsto J(u,v)$ being concave, its upper semicontinuity is preserved in the weak topology.
The previous two theorems can be simplified in the following way.
Theorem CC1. Under hypotheses C and CC, the subdifferential of $\bar J$ is given at any point $u$ in $U_0$ by the formula
\[ \partial \bar J(u) = \bigcup_{v \in \hat V(u)} \partial_1 J(u,v). \]
Proof. According to theorem C1, it suffices to prove the following proposition:
Proposition. $D = \bigcup_{v \in \hat V(u)} \partial_1 J(u,v)$ is convex.
Proof. For $i = 1, 2$, let $v_i \in \hat V(u)$, and $p_i \in \partial_1 J(u,v_i)$. We know that $\hat V(u)$ is convex (as the set of maximizers of the concave function $J(u,\cdot)$ over the convex set $V$), and thus $\forall \lambda \in [0,1]$, $w = \lambda v_1 + (1-\lambda) v_2 \in \hat V(u)$. Let us also set $q = \lambda p_1 + (1-\lambda) p_2$. Let $h \in U - u$. Making use of hypothesis CC,
\[ J(u+h, w) \ge \lambda J(u+h, v_1) + (1-\lambda) J(u+h, v_2) \ge \lambda J(u,v_1) + (1-\lambda) J(u,v_2) + (q,h). \]
And since, by definition, $J(u,v_i) = \bar J(u)$,
\[ J(u+h, w) \ge \bar J(u) + (q,h) = J(u,w) + (q,h). \]
Thus $q \in \partial_1 J(u,w)$, where $w \in \hat V(u)$.
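For contrast with the example given after theorem C1 (again our own data): keep $U = \mathbb{R}$ and $J(u,v) = uv$, but now take $V = [-1,1]$, which is convex and compact, and on which $v \mapsto uv$ is concave (linear). Theorem CC1 then gives directly
\[ \partial \bar J(0) = \bigcup_{v \in [-1,1]} \{v\} = [-1,1], \]
with no convex hull needed: the concavity in $v$ makes the union of subdifferentials already convex.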
Theorem CC2. Under hypotheses C1, C2, D2.1, and CC, the subdifferential of $\bar J$ is given at any point $u$ in $U_0$ by the formula
\[ \partial \bar J(u) = \bigcup_{\{v_n\} \in W(u)} \limsup_{n \to \infty} \partial_1 J(u,v_n). \]
Proof. Again, it suffices to prove that
\[ D = \bigcup_{\{v_n\} \in W(u)} \limsup_{n \to \infty} \partial_1 J(u,v_n) \]
is convex.
For the sequel of this proof, the following notations will be used. For $i = 1, 2$, let $p^i \in D$. There exist $\{v_k^i\}_k \in W(u)$ and $p_k^i \in \partial_1 J(u,v_k^i)$ such that $p_k^i \stackrel{\star}{\rightharpoonup} p^i$. For $\lambda \in [0,1]$, let
\[ w_k = \lambda v_k^1 + (1-\lambda) v_k^2, \qquad q = \lambda p^1 + (1-\lambda) p^2, \qquad q_k = \lambda p_k^1 + (1-\lambda) p_k^2. \]
Proposition 1. $\{w_k\}_k \in W(u)$.
Proof of the proposition. By concavity, one has
\[ J(u,w_k) \ge \lambda J(u,v_k^1) + (1-\lambda) J(u,v_k^2) \to \bar J(u). \]
And since, by definition of $\bar J$, $J(u,w_k) \le \bar J(u)$, it follows that $J(u,w_k) \to \bar J(u)$, which is the definition of $\{w_k\}_k \in W(u)$.
Proposition 2. Let $\{C_k\}$ be a sequence of convex subsets of $\mathcal{U}'$, and $C = \limsup C_k$. Let $D_k = \sup_{p_k \in C_k} (p_k, h)$ and $D = \sup_{p \in C} (p,h)$. Then
i) $C$ is convex,
ii) $\limsup D_k \le D$.
Proof of the proposition. The first item is elementary. The second one is proposition 3 of the proof of theorem C2.
Proposition 3. With the notations introduced for this proof, let $C_k = \partial_1 J(u,w_k)$, and $C = \limsup C_k$. Then $q \in C$.
Proof of the proposition. As in the previous theorem, $\forall \alpha > 0$,
\[ J(u+\alpha h, w_k) \ge \lambda J(u+\alpha h, v_k^1) + (1-\lambda) J(u+\alpha h, v_k^2), \]
i.e.
\[ \forall \alpha, \quad J(u+\alpha h, w_k) \ge \lambda J(u,v_k^1) + (1-\lambda) J(u,v_k^2) + \alpha (q_k, h). \]
Making use of the uniform continuity of $J$, it can be inferred that
\[ \forall \alpha, \quad J(u+\alpha h, w_k) \ge \bar J(u) - \epsilon_k + \alpha (q_k, h), \]
where $\{\epsilon_k\}$ is a sequence decreasing to zero independently of $\alpha$ and $h$.
On the other hand, $\forall h, \forall \eta > 0, \exists \alpha_0 : \forall \alpha \in (0, \alpha_0)$,
\[ J(u+\alpha h, w_k) \le J(u,w_k) + \alpha (D_k + \eta) \le \bar J(u) + \alpha (D_k + \eta). \]
Whence, comparing the last two inequalities, for $\alpha \le \alpha_0$,
\[ (q_k, h) \le D_k + \eta + \frac{\epsilon_k}{\alpha}, \]
and making use of proposition 2,
\[ (q,h) = \lim (q_k, h) \le D + \eta. \]
Since $\eta$ was arbitrary, it can be concluded that $(q,h) \le D$, and since, according to proposition 2, $C$ is convex, the proposition is proved.
Finally, since $C \subset D$, $q = \lambda p^1 + (1-\lambda) p^2 \in D$, and the theorem is proved.
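The example given after theorem D2 also illustrates theorem CC2 (our data again): $U = \mathbb{R}$, $V = (0,1)$, $J(u,v) = uv$, which is linear, hence concave, in $v$, and for which D2.1 holds exactly. At $u = 0$, $\bar J(u) = \max(u,0)$, every sequence in $(0,1)$ is maximizing, and $\partial_1 J(0,v_n) = \{v_n\}$, so that
\[ \partial \bar J(0) = \bigcup_{\{v_n\} \in W(0)} \limsup_{n} \{v_n\} = [0,1], \]
the subdifferential of $u \mapsto \max(u,0)$ at 0: the sequences $v_n \to 0$ and $v_n \to 1$ recover the endpoints 0 and 1, which no fixed $v \in V$ provides. This is where the absence of compactness makes maximizing sequences unavoidable.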
4. Application to the Von Neumann-Sion theorem
It is shown here that classical theorems on the existence of a saddle point, or at least of a value (infsup = supinf), for a convex-concave function are simple consequences of the above theorems.
The first theorem below is often called "Von Neumann's theorem", although Von Neumann himself [16] only treated the case needed for matrix games, i.e. where $U$ and $V$ are simplices in Euclidean space, and $J$ is bilinear. Sion [13] credits Shiffman for a more general form. The second theorem below is often called "Sion's theorem", although Sion credits Kneser and Fan for it. In [13], Sion gives a rather complete, and more general, treatment of that question. An elegant theory can be found in [1].
Our hypotheses are similar to those of the previous section. We state them anew, adapted to the present aim.
Hypotheses VN.
VN1. $U$ is convex and compact, contained in an open subset $\tilde U \subset \mathcal{U}$, and $\forall v \in V$, the function $u \mapsto J(u,v)$ is convex and l.s.c. from $\tilde U$ into $\mathbb{R}$. Furthermore, $J$ is bounded above, uniformly in $v$, in a neighborhood of any point of $U$ in $\tilde U$.
VN2. $V$ is convex, and $\forall u \in \tilde U$, the function $v \mapsto J(u,v)$ is concave.
VN3. $V$ is (sequentially) compact, and $\forall u \in \tilde U$, the function $v \mapsto J(u,v)$ is u.s.c.
Theorem VN1. Under hypotheses VN1 to VN3, the function $J$ has a saddle point over $U \times V$, i.e., there exist $\hat u \in U$ and $\hat v \in V$ such that
\[ \forall (u,v) \in U \times V, \quad J(\hat u, v) \le J(\hat u, \hat v) \le J(u, \hat v). \]
Remark. The existence of a saddle point implies that
\[ \min_{u \in U} \max_{v \in V} J(u,v) = \max_{v \in V} \min_{u \in U} J(u,v) = J(\hat u, \hat v). \]
Proof. Theorem CC1 applies. In particular, hypothesis VN1 ensures that $U \subset U_0$, where $\bar J$ is continuous, therefore l.s.c. (even in the weak topology if necessary). It reaches its minimum at a point $\hat u \in U$. There exists thus $\hat p \in \partial \bar J(\hat u)$ such that
\[ \forall u \in U, \quad (\hat p, u - \hat u) \ge 0. \]
Making use of theorem CC1, there exists $\hat v \in \hat V(\hat u)$ such that $\hat p \in \partial_1 J(\hat u, \hat v)$. Whence
\[ J(\hat u, \hat v) \le J(u, \hat v) - (\hat p, u - \hat u). \]
Remembering that $\hat v \in \hat V(\hat u)$, one gets the left-hand inequality of the saddle point, and, with the above two inequalities, the right-hand one.
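The historical prototype is the matrix game already mentioned; here is the standard "matching pennies" instance, reproduced as a sanity check rather than taken from the text. Let $U = V = \{(x_1,x_2) \ge 0 \mid x_1 + x_2 = 1\}$ be the simplex of mixed strategies and
\[ J(u,v) = u^{\top} A v, \qquad A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}. \]
All hypotheses VN1 to VN3 hold ($J$ is bilinear and continuous, $U$ and $V$ convex compact). The pair $\hat u = \hat v = (1/2, 1/2)$ is a saddle point: $J(\hat u, v) = 0$ for all $v$ and $J(u, \hat v) = 0$ for all $u$, so the value of the game is 0, although $J$ has no saddle point in pure strategies.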
Theorem VN2. Under hypotheses VN1 and VN2, there exists $\hat u \in U$ such that
\[ \sup_{v \in V} J(\hat u, v) = \min_{u \in U} \sup_{v \in V} J(u,v) = \sup_{v \in V} \min_{u \in U} J(u,v). \]
Proof. The proof makes use of Valadier's formula.
Let us first notice that, again, hypothesis VN1 ensures the existence of the minima in $u$. In particular, $\bar J$ has a minimum at a point $\hat u \in U$. Let
\[ \hat J = \bar J(\hat u) = \min_{u \in U} \sup_{v \in V} J(u,v). \]
$\bar J$ being convex, there exists $\hat p \in \partial \bar J(\hat u)$ such that
\[ \forall u \in U, \quad (\hat p, u - \hat u) \ge 0. \]
We shall exhibit a sequence $\{w_k\}$ in $V$ such that:
\[ \forall \epsilon > 0, \exists N : \quad \min_{u \in U} J(u, w_N) \ge \hat J - \epsilon. \]
Then one can conclude:
\[ \sup_{v \in V} \min_{u \in U} J(u,v) \ge \hat J. \]
Let $\epsilon_k$ be a decreasing sequence of positive numbers, and consider $\Omega_k \subset U$ and $V_k \subset V$ such that:
\[ \forall u \in \Omega_k, \forall v \in V, \quad J(u,v) \ge J(\hat u, v) - \epsilon_k, \]
\[ V_k = V_{\epsilon_k}(\hat u) = \{ v \in V \mid J(\hat u, v) \ge \hat J - \epsilon_k \}. \]
Such $\Omega_k$'s exist because $u \mapsto J(u,v)$ is l.s.c. uniformly in $v$; they may be chosen so that any sequence $\{u_k\}$ with $u_k \in \Omega_k$ converges to $\hat u$. Due to the definition of the level sets, any sequence $\{v_k\}$ with $v_k \in V_k$ belongs to $W(\hat u)$. Let us also define the sequence of sets of subgradients:
\[ P_k = \bigcup_{\substack{u \in \Omega_k \\ v \in V_k}} \partial_1 J(u,v). \]
According to Valadier's formula, there exists a sequence of finite barycenters over $P_k$ (see [12], T. 2, XIX, 2;2, for finiteness even in infinite dimension) such that:
\[ q_k = \sum_{i=0}^{n_k} \lambda_k^i p_k^i \stackrel{\star}{\rightharpoonup} \hat p, \]
and of course
\[ \forall k \ge 0, \quad \lambda_k^i \ge 0, \quad \sum_{i=0}^{n_k} \lambda_k^i = 1. \]
Then, for each $k$, by definition of $P_k$, we can define two maps $u_k : P_k \to \Omega_k$ and $v_k : P_k \to V_k$ such that:
\[ \forall p \in P_k, \quad p \in \partial_1 J(u_k(p), v_k(p)). \]
For all $u \in U$, one has:
\[ J(u, v_k(p)) \ge J(u_k(p), v_k(p)) + (p, u - u_k(p)). \]
Because $u_k(p) \in \Omega_k$,
\[ \forall u \in U, \quad J(u, v_k(p)) \ge J(\hat u, v_k(p)) + (p, u - u_k(p)) - \epsilon_k. \]
Due to the concavity of $v \mapsto J(u,v)$ and the convexity of $V$, one can take the convex combination of all the inequalities in $p_k^i$, and obtain:
\[ \forall u \in U, \quad J(u, w_k) \ge \sum_{i=0}^{n_k} \lambda_k^i \left\{ J(\hat u, v_k(p_k^i)) + (p_k^i, u - u_k(p_k^i)) \right\} - \epsilon_k, \]
where
\[ w_k = \sum_{i=0}^{n_k} \lambda_k^i v_k(p_k^i) \in V. \]
It has been seen that the $v_k(p)$, $p \in P_k$, form maximizing sequences at $\hat u$, so:
\[ \exists K_1 : \forall k \ge K_1, \quad \sum_{i=0}^{n_k} \lambda_k^i J(\hat u, v_k(p_k^i)) \ge \hat J - \frac{\epsilon}{4}. \]
Due to $u_k(p) \to \hat u$, $p \in P_k$, and to lemma 2, which provides the fact that the elements of $P_k$ are bounded,
\[ \exists K_2 : \forall u \in U, \forall k \ge K_2, \forall p \in P_k, \quad (p, u - u_k(p)) \ge (p, u - \hat u) - \frac{\epsilon}{4}. \]
For all $u \in U$, there exist an open neighborhood $O(u)$ and an integer $n$ such that
\[ \forall \tilde u \in O(u), \forall k \ge n, \quad (q_k, \tilde u - \hat u) \ge (q_k, u - \hat u) - \frac{\epsilon}{8} \ge (\hat p, u - \hat u) - 2\,\frac{\epsilon}{8} \ge -\frac{\epsilon}{4}. \]
As $U$ is compact, a finite covering by the $O(u)$'s can be extracted. Let $K_3$ be the maximum of the corresponding $n$'s. Then
\[ \forall u \in U, \forall k \ge K_3, \quad (q_k, u - \hat u) \ge -\frac{\epsilon}{4}, \]
since every $u$ belongs to one of the $O(u)$'s selected in the finite covering.
Thus,
\[ \forall u \in U, \forall k \ge \max(K_1, K_2, K_3), \quad J(u, w_k) \ge \hat J - 3\,\frac{\epsilon}{4} - \epsilon_k. \]
Then an $N \ge \max(K_1, K_2, K_3)$ can be chosen such that $\epsilon_N \le \frac{\epsilon}{4}$, and the claim is proved.
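As an illustration of theorem VN2 when no saddle point exists (the data are ours): take $U = [-1,1]$, $V = (0,1)$, and $J(u,v) = u^2 - 2uv$, convex in $u$ and linear, hence concave, in $v$; $V$ is convex and bounded but not compact, so theorem VN1 does not apply. One computes
\[ \bar J(u) = \sup_{v \in (0,1)} (u^2 - 2uv) = \begin{cases} u^2 & \text{if } u \ge 0, \\ u^2 - 2u & \text{if } u < 0, \end{cases} \qquad \min_{u \in U} \bar J(u) = \bar J(0) = 0, \]
while, for fixed $v \in (0,1)$, $\min_{u \in [-1,1]} (u^2 - 2uv) = -v^2$, so that $\sup_{v} \min_{u} J(u,v) = 0$ as well, the supremum being approached as $v \to 0$ but attained by no $v$: the value exists, with $\hat u = 0$, but there is no saddle point.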
A trivial, but possibly useful, corollary is as follows.
Corollary. Under hypotheses VN1 and VN2, if $\hat U \subset U$,
\[ \sup_{v \in V} \inf_{u \in \hat U} J(u,v) = \inf_{u \in \hat U} \sup_{v \in V} J(u,v). \]
Proof. It suffices to make use of the continuity of $J$ and $\bar J$ and to apply the previous theorem to the closure $\bar U$ of $\hat U$.
5. References
[1] Aubin J.P. L'analyse non linéaire et ses motivations économiques, Masson, Paris (1984).
[2] Aubin J.P. and Ekeland I. Applied Nonlinear Analysis, Wiley-Interscience, New York (1984).
[3] Başar T. and Bernhard P. H∞-Optimal Control and Related Minimax Design Problems: a Dynamic Game Approach, Birkhäuser, Boston (1991).
[4] Bernhard P. Variations sur un thème de Danskin avec une coda sur un thème de Von Neumann, Rapport de recherche INRIA 1238, Sophia-Antipolis, France (1990).
[5] Bonnans J.F. Directional derivatives of optimal solutions in smooth nonlinear programming, Rapport de recherche INRIA 1006, Rocquencourt, France (1989).
[6] Clarke F.H. Optimization and Nonsmooth Analysis, Wiley-Interscience (1983).
[7] Danskin J.M. The Theory of Max Min, Springer, Berlin (1967).
[8] Fleming W.H. and Rishel R.W. Deterministic and Stochastic Optimal Control, Springer, New York (1975).
[9] Gauvin J. The generalized gradient of a marginal function in mathematical programming, Math. Operat. Res. 4, 458-463 (1979).
[10] Hiriart-Urruty J.B. Gradients généralisés de fonctions marginales, SIAM J. Control and Optimization 16, 301-316 (1978).
[11] Rockafellar R.T. Extensions of subgradient calculus with applications to optimization, Nonlinear Analysis 9, 665-698 (1985).
[12] Schwartz L. Analyse: Topologie générale et analyse fonctionnelle, Hermann, Paris (1970).
[13] Sion M. On general minimax theorems, Pacific Journal of Mathematics 8, 171-176 (1958).
[14] Thibault L. On subdifferentials of optimal value functions, SIAM J. Control and Optimization 29, 1019-1036 (1991).
[15] Valadier M. Contribution à l'analyse convexe, Thèse d'état, Paris (1970).
[16] Von Neumann J. and Morgenstern O. Theory of Games and Economic Behavior, Princeton University Press, Princeton (1947).