Approximation to Optimization Problems: An Elementary Review∗
Peter Kall
(Published: Mathematics of Operations Research 11 (1986) 9–18)
During the last two decades the concept of epi-convergence was introduced and then was
used in various investigations in optimization and related areas. The aim of this review is
to show in an elementary way how closely the arguments in the epi-convergence approach
are related to those of the classical theory of convergence of functions.
1 Introduction.
In mathematical programming, problems of the type
$$\inf\{\varphi(x) \mid x \in \Gamma\} \tag{O}$$
have to be solved, where $\Gamma \subset \mathbb{R}^n$ and $\varphi : \Gamma \to \mathbb{R}$ are given.
In designing solution methods for O it is quite common to replace the original problem
by a sequence of “approximating” problems
$$\inf\{\varphi_\nu(x) \mid x \in \Gamma_\nu\} \tag{O$_\nu$}$$
which are supposed to be easier to solve than O.
To give some examples, we just mention cutting plane methods, penalty methods and
solution methods for stochastic programming problems.
To simplify the presentation we restate the above problems in the usual way by defining
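$$f(x) = \begin{cases} \varphi(x) & \text{if } x \in \Gamma, \\ +\infty & \text{else,} \end{cases} \qquad \text{and} \qquad f_\nu(x) = \begin{cases} \varphi_\nu(x) & \text{if } x \in \Gamma_\nu, \\ +\infty & \text{else.} \end{cases}$$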
Then obviously O and O$_\nu$ are equivalent to
$$\inf_{\mathbb{R}^n} f(x) \tag{P}$$
and
$$\inf_{\mathbb{R}^n} f_\nu(x) \tag{P$_\nu$}$$
respectively.
∗ Received September 4, 1984; revised November 5, 1984. AMS 1980 subject classification. Primary: 90C30. Secondary: 65K10. IAOR 1973 subject classification. Main: Programming: nonlinear. OR/MS Index 1978 subject classification. 657 Programming: nonlinear/theory. Keywords: approximation, nonlinear, optimization, epigraphs, epi-convergence.
In order to assure that the optimal values inf fν and the solutions x̂ν of Pν — if they
exist — approximate in some reasonable sense the optimal value and solution of P, we
need to know in which appropriate way the functions fν should approximate f .
Since solutions x̂ν to problems like Pν are in many cases not unique, we cannot expect
that the x̂ν converge. Hence the only reasonable requirement with respect to a meaningful
approximating procedure is that every accumulation point of {x̂ν } be a solution of P. To
assure this statement the classical type of assumption is uniform convergence of fν to f
on each compact subset of Rn together with some continuity of f .
During the last two decades the concept of epi-convergence was introduced and then
was used in various investigations as an alternative type of assumption to prove — among
other things — the accumulation point statement just mentioned.
The aim of this review is to show in an elementary way how closely the arguments in the
epi-convergence approach are related to those of the classical theory: Restricting ourselves
to those consequences of uniform convergence really needed in proving the accumulation
point statement we end up with the epi-convergence assumption.
In what follows we restrict ourselves to finitely valued functions fν and f instead of
extended real valued ones. The only reason for this is to avoid in the various convergence
concepts purely technical additional assumptions which keep the machinery running but
do not give any insight.
Needless to say, all statements can be found or composed in an obvious way from
the references. Nevertheless our way of introducing epi-convergence as an appropriate
concept for optimization problems has apparently never been chosen before, although it
might facilitate access to this area for people working in OR but not in functional analysis.
2 Classical concepts of convergence.
We consider functions $f : \mathbb{R}^n \to \mathbb{R}$ and $f_\nu : \mathbb{R}^n \to \mathbb{R}$, $\nu \in \mathbb{N}$. In this section we refer to
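(U) Uniform convergence to a continuous limit. $f$ is continuous and $f_\nu \to f$ uniformly on any compact set $D \subset \mathbb{R}^n$, i.e. given the compact set $D$,
$$\forall \varepsilon > 0 \;\; \exists N(\varepsilon) : \; |f_\nu(x) - f(x)| < \varepsilon \quad \forall x \in D,\ \nu \ge N(\varepsilon). \tag{1}$$
(C) Continuous convergence. $f_\nu \to f$ continuously on $\mathbb{R}^n$, i.e. for any $x \in \mathbb{R}^n$,
$$\forall \{x_\nu \to x\} : \; f_\nu(x_\nu) \to f(x). \tag{2}$$
(EQ) Convergence of (almost) equicontinuous functions. $f_\nu \to f$ pointwise on $\mathbb{R}^n$ and the $f_\nu$ are (almost) equicontinuous, i.e. for any $x \in \mathbb{R}^n$ and $\varepsilon > 0$ there exist a neighbourhood $U(x, \varepsilon)$ of $x$ and a number $N(x, \varepsilon)$ such that
$$|f_\nu(y) - f_\nu(x)| < \varepsilon \quad \forall y \in U(x, \varepsilon),\ \nu \ge N(x, \varepsilon). \tag{3}$$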
REMARK. Obviously EQ is a relaxation of the classical concept. There $\{f_\nu\}$ is assumed to be an equicontinuous sequence converging pointwise to $f$, which also implies that any $f_\nu$ is continuous. The assumption in EQ of the $f_\nu$ being (almost) equicontinuous does however not imply continuity in general. Consider for instance
$$f_\nu(x) = \begin{cases} \dfrac{1}{\nu}\,x & \text{if } x \text{ rational,} \\ 0 & \text{else.} \end{cases}$$
If we choose for any $\varepsilon > 0$
$$U(x, \varepsilon) = \{y \mid |y - x| < \varepsilon/2\}, \qquad N(x, \varepsilon) > 2|x|/\varepsilon,$$
we have for $y \in U(x, \varepsilon)$ and $\nu \ge N(x, \varepsilon)$
$$|f_\nu(x) - f_\nu(y)| < \varepsilon.$$
Hence the $f_\nu$ are (almost) equicontinuous, but obviously none of the $f_\nu$ is continuous.
THEOREM 1. Any one of the convergence Types U, C and EQ implies the two others.
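PROOF. U ⇒ C. For some $x \in \mathbb{R}^n$ let an arbitrary sequence $\{x_\nu \to x\}$ be chosen. Then there is a compact set $D$ containing $\{x_\nu\}$ and $x$. Since under U the limit $f$ is continuous, it is uniformly continuous on $D$, i.e.,
$$\forall \varepsilon > 0 \;\; \exists \delta(\varepsilon) > 0 : \; |f(x) - f(y)| < \varepsilon \quad \forall x, y \in D : |x - y| < \delta(\varepsilon).$$
Furthermore, by uniform convergence of $\{f_\nu\}$ on $D$ and $x_\nu \in D \ \forall \nu$,
$$\forall \varepsilon > 0 \;\; \exists N(\varepsilon) : \; |f_\nu(x_\nu) - f(x_\nu)| < \varepsilon \quad \forall \nu > N(\varepsilon),$$
and hence, since $x_\nu \to x$ implies the existence of some $M(\varepsilon)$ such that $|x_\nu - x| < \delta(\varepsilon) \ \forall \nu > M(\varepsilon)$,
$$|f(x) - f_\nu(x_\nu)| \le |f(x) - f(x_\nu)| + |f(x_\nu) - f_\nu(x_\nu)| < \varepsilon + \varepsilon \quad \text{for } \nu > \max[M(\varepsilon), N(\varepsilon)],$$
i.e. $f_\nu(x_\nu) \to f(x)$.
[Figure 1: Relations between convergence types (U, EQ, C).]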
C ⇒ EQ. Given C, choosing $x_\nu = x \ \forall \nu$ yields $f_\nu(x) \to f(x)$, i.e. pointwise convergence. Assume, contrary to EQ, that there are some $x$ and some $\varepsilon > 0$ such that for every $n = 1, 2, \ldots$ there exist $y_n \in U(x, 1/n)$ and an index $\nu_n$, with $\{\nu_n\}$ strictly increasing, such that $|f_{\nu_n}(y_n) - f_{\nu_n}(x)| \ge \varepsilon$. This contradicts C, since $y_n \to x$ and hence
$$\lim_{n \to \infty} f_{\nu_n}(y_n) = f(x) = \lim_{n \to \infty} f_{\nu_n}(x).$$
EQ ⇒ U. For any $x$ in a compact set $D$ and any $\varepsilon > 0$, by EQ there exist a neighbourhood $U(x, \varepsilon)$ and a number $N(x, \varepsilon)$ such that $|f_\nu(y) - f_\nu(x)| < \varepsilon \ \forall y \in U(x, \varepsilon),\ \nu \ge N(x, \varepsilon)$, and hence, by pointwise convergence, $|f(y) - f(x)| \le \varepsilon \ \forall y \in U(x, \varepsilon)$, i.e. $f$ is continuous. Since $D$ is compact, there is a finite subset $\{x_1, \ldots, x_k\} \subset D$ such that $D \subset \bigcup_{i=1}^{k} U(x_i, \varepsilon)$. So, if $y \in D$, for some $j \in \{1, \ldots, k\}$ it is true that $y \in U(x_j, \varepsilon)$, and therefore
$$|f(y) - f_\nu(y)| \le |f(y) - f(x_j)| + |f(x_j) - f_\nu(x_j)| + |f_\nu(x_j) - f_\nu(y)|,$$
where
$|f(y) - f(x_j)| \le \varepsilon$ by the above shown continuity of $f$,
$|f(x_j) - f_\nu(x_j)| < \varepsilon \ \forall \nu \ge M_j$ by the pointwise convergence of $\{f_\nu\}$,
$|f_\nu(x_j) - f_\nu(y)| < \varepsilon \ \forall \nu \ge N(x_j, \varepsilon)$ by the equicontinuity of $\{f_\nu\}$.
Hence
$$|f(y) - f_\nu(y)| < 3\varepsilon \quad \forall \nu \ge \max_{1 \le i \le k} [N(x_i, \varepsilon); M_i],$$
yielding the uniform convergence $f_\nu \to f$ on $D$. This theorem may be illustrated by Figure 1.
REMARK. The most familiar of the above-mentioned equivalences is U ⇔ EQ. It
usually comes up in the preparation of the Arzelà-Ascoli theorem. But also U ⇔ C is a
classical statement which may be found for instance as an exercise in [A1].
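The role of the continuity assumptions in Theorem 1 can be seen on a small numerical sketch. The functions below are not taken from the paper: as an illustrative assumption we take $f_\nu(x) = x^\nu$ on $[0, 1]$, whose pointwise limit is discontinuous at $x = 1$, so U fails. Along the sequence $x_\nu = 1 - 1/\nu \to 1$ the values $f_\nu(x_\nu)$ tend to $e^{-1}$ rather than to $f(1) = 1$, so C fails as well, as the equivalence predicts.

```python
import math

def f_nu(x: float, nu: int) -> float:
    """Assumed illustrative sequence f_nu(x) = x**nu on [0, 1] (not from the paper)."""
    return x ** nu

def f(x: float) -> float:
    """Pointwise limit of f_nu on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x >= 1.0 else 0.0

def diagonal_values(nus):
    """Values f_nu(x_nu) along the sequence x_nu = 1 - 1/nu, which converges to 1."""
    return [f_nu(1.0 - 1.0 / nu, nu) for nu in nus]

if __name__ == "__main__":
    for nu, value in zip((10, 100, 10_000), diagonal_values((10, 100, 10_000))):
        # The values approach exp(-1), not f(1) = 1: continuous convergence fails.
        print(nu, value, math.exp(-1.0), f(1.0))
```

The sequence $x_\nu$ is chosen to approach the point where the limit jumps; any sequence staying away from $x = 1$ would not detect the failure.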
3 Approximating minimization problems.
Let us consider the following minimization problems: Find
$$\inf_{\mathbb{R}^n} f(x) \tag{P}$$
and
$$\inf_{\mathbb{R}^n} f_\nu(x) \tag{P$_\nu$}$$
where we assume that the $f_\nu$ converge in some appropriate sense to $f$. What may we expect about the behavior of solutions $\hat{x}_\nu$ of P$_\nu$ (if they exist) and of the optimal values $\inf f_\nu$ relative to the respective results of P? First, even under the convergence assumption U (see previous section) we may expect neither that $\hat{x}_\nu \to \hat{x}$ nor that $\inf f_\nu \to \inf f$, as the following trivial example shows:
EXAMPLE 1. Let us choose for $x \in \mathbb{R}$
$$f(x) = |x| \qquad \text{and} \qquad f_\nu(x) = \begin{cases} |x| & \text{if } |x| \le \nu, \\ 2\nu - |x| & \text{if } \nu < |x| \le 3\nu, \\ |x| - 4\nu & \text{if } |x| > 3\nu, \end{cases}$$
which is shown in Figure 2.
[Figure 2: Counterexample 1 (graphs of $f(x)$ and $f_\nu(x)$ over $x$).]
Obviously we have
$$\hat{x} = 0 \ \text{ with } \ \inf f(x) = 0 \qquad \text{and} \qquad \hat{x}_\nu = \pm 3\nu \ \text{ with } \ \inf f_\nu(x) = -\nu.$$
Hence the set of “approximating” solutions {x̂ν } has no accumulation point and inf fν
does not converge to inf f , although obviously fν → f uniformly on every compact set.
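The claims of Example 1 can be checked numerically with a minimal sketch. The helper names (`grid`, `sup_gap_on_compact`, `grid_minimum`), the grid step, and the search window $[-5\nu, 5\nu]$ are assumptions made for illustration; the window is wide enough because $f_\nu(x) = |x| - 4\nu > \nu$ outside it.

```python
def f(x: float) -> float:
    """Limit function of Example 1."""
    return abs(x)

def f_nu(x: float, nu: int) -> float:
    """Approximating function of Example 1."""
    a = abs(x)
    if a <= nu:
        return a
    if a <= 3 * nu:
        return 2 * nu - a
    return a - 4 * nu

def grid(lo: float, hi: float, step: float):
    """Equally spaced points lo, lo + step, ..., hi (hypothetical helper)."""
    n = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n + 1)]

def sup_gap_on_compact(nu: int, radius: float, step: float = 0.25) -> float:
    """Largest |f_nu - f| on a grid over the compact set [-radius, radius]."""
    return max(abs(f_nu(x, nu) - f(x)) for x in grid(-radius, radius, step))

def grid_minimum(nu: int, step: float = 0.25):
    """Minimal value of f_nu and its minimizers on a grid over [-5 nu, 5 nu]."""
    pts = grid(-5.0 * nu, 5.0 * nu, step)
    best = min(f_nu(x, nu) for x in pts)
    return best, [x for x in pts if f_nu(x, nu) == best]

if __name__ == "__main__":
    for nu in (1, 5, 20):
        # The gap on [-5, 5] vanishes once nu >= 5, yet the minimizers run off to infinity.
        print(nu, sup_gap_on_compact(nu, 5.0), grid_minimum(nu))
```

On any fixed compact set the gap is exactly zero for large $\nu$, while the minimal value $-\nu$ and the minimizers $\pm 3\nu$ diverge, which is the behavior described in the example.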
However the following statement is well known. Nevertheless we repeat it here to point
out those parts (or consequences) of the assumptions which are essential in its proof.
THEOREM 2. Assume U. Then
(a) $\limsup_\nu \inf f_\nu \le \inf f$, and
(b) if $\hat{x}_\nu$ is a solution to P$_\nu$, $\nu = 1, 2, \ldots$, and if $\hat{x}$ is an accumulation point of $\{\hat{x}_\nu\}$, then $\hat{x}$ is a solution to P and, $\{\hat{x}_{\nu_k}\}$ being a subsequence of $\{\hat{x}_\nu\}$ converging to $\hat{x}$, we have $\lim_{k \to \infty} (\inf f_{\nu_k}) = \inf f$.
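PROOF. We use C instead of U since by Theorem 1 they are equivalent. Let $\varphi = \inf f$, $\varphi_\nu = \inf f_\nu$.
For (a), let $x \in \mathbb{R}^n$ and suppose $x_\nu \to x$. By C we have
$$f(x) = \lim_\nu f_\nu(x_\nu) = \limsup_\nu f_\nu(x_\nu) \ge \limsup_\nu \varphi_\nu, \quad \text{i.e.} \quad f(x) \ge \limsup_\nu \varphi_\nu. \tag{4}$$
Take the infimum in $x$ to get $\varphi \ge \limsup_\nu \varphi_\nu$.
Observe that instead of C to derive (4) it would have been sufficient to assume that
$$\forall x \;\; \exists \{z_\nu\} : \; z_\nu \to x \ \text{ and } \ \limsup_\nu f_\nu(z_\nu) \le f(x). \tag{5}$$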
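For (b) we use C to get
$$f(\hat{x}) \le \liminf_k f_{\nu_k}(\hat{x}_{\nu_k}) = \liminf_k \varphi_{\nu_k}. \tag{6}$$
Now, using the proof of (a), the trivial inequalities $\liminf_k \varphi_{\nu_k} \le \limsup_k \varphi_{\nu_k} \le \limsup_\nu \varphi_\nu$ and the definition of $\varphi$, we obtain the chain $\varphi \le f(\hat{x}) \le \liminf_k \varphi_{\nu_k} \le \limsup_k \varphi_{\nu_k} \le \limsup_\nu \varphi_\nu \le \varphi$, and hence
$$f(\hat{x}) = \varphi = \lim_k \varphi_{\nu_k} = \limsup_\nu \varphi_\nu. \tag{7}$$
Observe that to derive (7) it would be sufficient to assure (a) by assuming (5) and, furthermore, to guarantee (6) by the assumption
$$\forall \{x_\nu \to x\} : \; f(x) \le \liminf_\nu f_\nu(x_\nu). \tag{8}$$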
Considering this proof it becomes obvious that instead of U we only need the following
type of convergence to prove assertions (a) and (b) of Theorem 2:
(EP) For any $x \in \mathbb{R}^n$
$$\text{there exists a sequence } \{y_\nu \to x\} \text{ such that } \limsup_\nu f_\nu(y_\nu) \le f(x), \tag{9}$$
$$\text{and for all sequences } \{x_\nu \to x\} \text{ holds } f(x) \le \liminf_\nu f_\nu(x_\nu). \tag{10}$$
This is the so-called epi-convergence. Hence we have
COROLLARY 3. Assume EP. Then
(a) $\limsup_\nu \inf f_\nu \le \inf f$, and
(b) if $\hat{x}_\nu$ solves P$_\nu$, $\nu = 1, 2, \ldots$, and if $\hat{x}$ is an accumulation point of $\{\hat{x}_\nu\}$, the subsequence $\{\hat{x}_{\nu_k}\}$ being convergent to $\hat{x}$, then $f(\hat{x}) = \inf f = \lim_{k \to \infty} \inf f_{\nu_k}$.
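The accumulation point statement in (b) can be illustrated by a small sketch. The functions are an assumption chosen for illustration and do not appear in the paper: $f(x) = (x^2 - 1)^2$ with the two solutions $\pm 1$, and $f_\nu(x) = f(x) + (-1)^\nu x/\nu$, which converges to $f$ uniformly on compact sets. The minimizers $\hat{x}_\nu$ alternate between neighbourhoods of $-1$ and $+1$, so $\{\hat{x}_\nu\}$ does not converge, yet both accumulation points solve the limit problem. The grid search over $[-2, 2]$ is a crude stand-in for an exact solver.

```python
def f(x: float) -> float:
    """Assumed limit function with the two minimizers -1 and +1 (not from the paper)."""
    return (x * x - 1.0) ** 2

def f_nu(x: float, nu: int) -> float:
    """Assumed approximation: a small linear tilt whose sign alternates with nu."""
    return f(x) + ((-1) ** nu) * x / nu

def grid_argmin(func, lo: float = -2.0, hi: float = 2.0, n: int = 4000) -> float:
    """Grid point in [lo, hi] with the smallest value of func (hypothetical helper)."""
    pts = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return min(pts, key=func)

def approximate_solution(nu: int) -> float:
    """Approximate minimizer of f_nu obtained by grid search."""
    return grid_argmin(lambda x: f_nu(x, nu))

if __name__ == "__main__":
    for nu in (10, 11, 100, 101, 1000, 1001):
        x_hat = approximate_solution(nu)
        # Even nu: x_hat near -1; odd nu: x_hat near +1; optimal values tend to inf f = 0.
        print(nu, x_hat, f_nu(x_hat, nu), f(x_hat))
```

Along the even and the odd subsequence separately, the computed minimizers converge and the optimal values tend to $\inf f = 0$, in line with the corollary.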
REMARK. The assumption U for statements like Theorem 2 is classical and has been
made in various papers, e.g. [B1] - [B3]. As we have seen, this assumption is used indirectly by its equivalence to C, which may in turn be weakened to EP for the conclusions
to be drawn in the proof of Theorem 2.
In the publications [C1]- [C9] dealing with epi-convergence the emphasis is rather on
the relation to convergence of sets (applied to epigraphs) than on the relaxation of C.
But the equivalence to (9) and (10) in the above definition of EP is shown and used at
different times.
4 Modified concepts of convergence.
Whereas the continuous convergence C implies pointwise convergence to the same limit, epi-convergence does not in general have the same implication, as the following simple example shows.
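EXAMPLE 2. Assume that the sequence $\{f_\nu\}$ is defined for $\nu = 1, 2, 3, \ldots$ as
$$f_{2\nu}(x) = \begin{cases} 1/\nu & \text{if } x \text{ is rational,} \\ 1 - 1/\nu & \text{else,} \end{cases} \qquad f_{2\nu+1}(x) = 1 - f_{2\nu}(x) = \begin{cases} 1 - 1/\nu & \text{if } x \text{ is rational,} \\ 1/\nu & \text{else.} \end{cases}$$
Then we have the subsequences
$$\{f_{2\nu}\} \text{ converging to } g(x) = \begin{cases} 0, & x \text{ rational,} \\ 1, & \text{else,} \end{cases} \qquad \text{and} \qquad \{f_{2\nu+1}\} \text{ converging to } h(x) = \begin{cases} 1, & x \text{ rational,} \\ 0, & \text{else.} \end{cases}$$
Since $g(x) \ne h(x) \ \forall x \in \mathbb{R}$, the sequence $\{f_\mu\}$ does not converge pointwise. On the other hand, this sequence is easily shown to be epi-convergent with the epi-limit $f(x) \equiv 0$, by choosing for any arbitrary $x$ a sequence $\{y_\mu\}$ according to
$y_{2\nu}$ rational such that $|y_{2\nu} - x| < 1/\nu$,
$y_{2\nu+1}$ irrational such that $|y_{2\nu+1} - x| < 1/\nu$;
then obviously $y_\mu \to x$ and $f_\mu(y_\mu) = 1/\lfloor \mu/2 \rfloor \to 0$ and hence $\limsup_\mu f_\mu(y_\mu) \le f(x) = 0$, whereas $0 = f(x) \le \liminf_\mu f_\mu(x_\mu)$ is trivially satisfied for any sequence $\{x_\mu\}$ converging to $x$.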
Similarly we can find examples of pointwise converging sequences which at least do not epi-converge to the pointwise limit. One of the reasons is given by
LEMMA 4. If $\{f_\nu\}$ epi-converges to $f$, then $f$ is lower semicontinuous (l.s.c.).
PROOF. Assume that at some $x$ the limit $f$ is not l.s.c. Then there exists an $\varepsilon > 0$ such that in any neighbourhood $U(x, 1/n) = \{z \mid \|z - x\| < 1/n\}$ we can find an element $z_n \in U(x, 1/n)$ satisfying $f(z_n) < f(x) - \varepsilon$.
Since $f_\nu$ epi-converges to $f$, we have
$$\forall z_n \;\; \exists \zeta_{\nu_n} : \; \|\zeta_{\nu_n} - z_n\| < \frac{1}{n} \ \text{ and } \ f_{\nu_n}(\zeta_{\nu_n}) < f(z_n) + \frac{1}{n},$$
where $\{\nu_n\}$ can be chosen to be strictly increasing. Hence we have a sequence $\{\zeta_{\nu_n}\}$ converging to $x$ with
$$f_{\nu_n}(\zeta_{\nu_n}) < f(z_n) + \frac{1}{n} < f(x) - \varepsilon + \frac{1}{n},$$
yielding $\liminf_n f_{\nu_n}(\zeta_{\nu_n}) \le f(x) - \varepsilon$, in contradiction to $f(x) \le \liminf_n f_n(\zeta_n) \ \forall \{\zeta_n \to x\}$ according to the assumed epi-convergence.
By construction of approximation schemes we usually know that we have pointwise
convergence fν → f. So the question is, which assumptions have to be satisfied in addition
to the lower semicontinuity of f due to Lemma 4 to guarantee also the epi-convergence
of fν to f . Since epi-convergence EP according to the remarks in the proof of Theorem 2
may be considered as a certain relaxation of continuous convergence C, which in turn was
equivalent to the convergence types U and EQ, it seems natural to ask for corresponding
relaxations of U and EQ which, under the assumption of pointwise convergence, imply EP.
Defining the convergence types
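(LU) Lower uniform convergence to a l.s.c. limit. $f$ is l.s.c., $f_\nu \to f$ pointwise and lower uniformly on any compact set $D \subset \mathbb{R}^n$, i.e. given the compact set $D$,
$$\forall \varepsilon > 0 \;\; \exists N(\varepsilon) : \; f(x) - f_\nu(x) < \varepsilon \quad \forall x \in D,\ \nu \ge N(\varepsilon). \tag{11}$$
(EQL) Convergence of (almost) equi-l.s.c. functions. $f_\nu \to f$ pointwise and the $f_\nu$ are (almost) equi-l.s.c., i.e. for any $x \in \mathbb{R}^n$ and $\varepsilon > 0$ there exist a neighbourhood $U(x, \varepsilon)$ and a number $N(x, \varepsilon)$ such that
$$f_\nu(y) > f_\nu(x) - \varepsilon \quad \forall y \in U(x, \varepsilon),\ \nu \ge N(x, \varepsilon). \tag{12}$$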
The following statement holds:
THEOREM 5. (a) LU implies EP;
(b) If fν → f pointwise, then EP and EQL are equivalent.
PROOF. (a) Let $\{x_\nu\} \subset \mathbb{R}^n$ be an arbitrary sequence converging to some $x$. Hence $\{x_\nu\}$ is contained in some compact set $D \subset \mathbb{R}^n$. By LU we have for any $\varepsilon > 0$ a number $N(\varepsilon)$ such that
$$f(y) - f_\nu(y) < \varepsilon \quad \forall y \in D,\ \nu \ge N(\varepsilon)$$
and hence
$$f(x_\nu) < f_\nu(x_\nu) + \varepsilon \quad \forall \nu \ge N(\varepsilon).$$
Since, by LU, $f$ is l.s.c. and $x_\nu \to x$, we have
$$\liminf_\nu f(x_\nu) \ge f(x)$$
and therefore
$$f(x) \le \liminf_\nu f_\nu(x_\nu) + \varepsilon$$
for any $\varepsilon > 0$, implying
$$f(x) \le \liminf_\nu f_\nu(x_\nu). \tag{13}$$
On the other hand, from the pointwise convergence due to LU follows trivially the existence of a sequence $y_\nu \to x$, namely $y_\nu = x \ \forall \nu$, such that
$$\limsup_\nu f_\nu(y_\nu) \le f(x). \tag{14}$$
(13) and (14) yield EP.
(b) Assume that EQL is not satisfied, i.e. that the $f_\nu$ are not (almost) equi-l.s.c. Then, for at least one $x \in \mathbb{R}^n$, $\exists \varepsilon > 0 : \forall U(x, 1/n) \ \exists y_n \in U(x, 1/n)$ and an increasing sequence $\nu_n$ such that $f_{\nu_n}(y_n) \le f_{\nu_n}(x) - \varepsilon$, yielding
$$\liminf_n f_{\nu_n}(y_n) \le \lim_n f_{\nu_n}(x) - \varepsilon = f(x) - \varepsilon$$
by the assumed pointwise convergence, which contradicts $f(x) \le \liminf_\nu f_\nu(x_\nu) \ \forall \{x_\nu \to x\}$ under EP.
Hence pointwise together with epi-convergence imply EQL. On the other hand, if we assume EQL, we have pointwise convergence and, as in (14), $\limsup_\nu f_\nu(y_\nu) \le f(x)$ for the trivial sequence $y_\nu = x$.
If $x_\nu \to x$, then for any $\varepsilon > 0$ we have $x_\nu \in U(x, \varepsilon)$ from some $\nu$ on and hence, by EQL,
$$f_\nu(x_\nu) > f_\nu(x) - \varepsilon \quad \text{for } x_\nu \in U(x, \varepsilon) \text{ and } \nu \ge N(x, \varepsilon),$$
yielding together with the pointwise convergence included in EQL
$$f(x) = \lim_\nu f_\nu(x) \le \liminf_\nu f_\nu(x_\nu) + \varepsilon.$$
Since this holds for all $\varepsilon > 0$, we have EP as a consequence of EQL.
By this theorem we now have Figure 3, where EP+ stands for epi- and pointwise convergence.
[Figure 3: Relations between modified convergence types (LU, EQL, EP+).]
The following example shows that in general, differing from the complete symmetry of the diagram in Figure 1, EP+ and EQL do not imply LU.
EXAMPLE 3. Let for $x \in \mathbb{R}$
$$f_n(x) = \begin{cases} 1, & x < 0, \\ 1 - x^n, & 0 \le x \le 1, \\ 0, & x > 1. \end{cases}$$
Then obviously $f_n$ converges pointwise to the l.s.c. function
$$f(x) = \begin{cases} 1, & x < 1, \\ 0, & x \ge 1. \end{cases}$$
Since
$$f_n'(x) = \begin{cases} 0, & x < 0 \text{ and } x > 1, \\ -n x^{n-1}, & 0 \le x < 1, \end{cases}$$
we have that $f_n'(x) < 0$ and $f_n'$ is strictly decreasing on $(0, 1)$. Hence for any $x \in (0, 1)$ and $z_x = \tfrac{1}{2} x + \tfrac{1}{2}$ $(< 1)$ it follows for $x < y < z_x$:
$$f_n(y) \ge f_n(x) + (y - x) f_n'(z_x) = f_n(x) + (y - x)\,[-n z_x^{\,n-1}].$$
Since $n z_x^{\,n-1} \to 0$ as $n \to \infty$, there is a number $N(x, \varepsilon)$ such that $n z_x^{\,n-1} < \varepsilon \ \forall n \ge N(x, \varepsilon)$. Using that $0 < y - x < 1$, we get $f_n(y) > f_n(x) - \varepsilon \ \forall n \ge N(x, \varepsilon)$, whereas for $0 \le y \le x$: $f_n(y) \ge f_n(x) \ \forall n$.
Therefore the $f_n$ are equi-l.s.c., hence we have EQL. But this does not imply LU. To see this, consider $x \in (0, 1)$. For LU we should have for a compact set $D$, e.g. $D = [0, 1]$, $f(x) - f_n(x) < \varepsilon \ \forall x \in D,\ n \ge N(\varepsilon)$, i.e. $x^n < \varepsilon \ \forall x \in (0, 1) \subset D,\ n \ge N(\varepsilon)$. From this inequality we get by taking logarithms
$$n \log x < \log \varepsilon \ \Rightarrow \ n > \log \varepsilon / \log x,$$
showing that for $\varepsilon \in (0, 1)$ and $x$ approaching 1 we would need $n \to \infty$, or in other words: there is no universal $N(\varepsilon)$ for $D$.¹
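The failure of LU in Example 3 can be made concrete with a short sketch. The helper names (`smallest_n`, `lower_gap`) and the sample points are assumptions made for illustration. The first helper finds, for fixed $x \in (0, 1)$ and $\varepsilon \in (0, 1)$, the smallest $n$ with $f(x) - f_n(x) = x^n < \varepsilon$; the second evaluates the largest gap $f - f_n$ over a finite set of points.

```python
def f_n(x: float, n: int) -> float:
    """Approximating function of Example 3."""
    if x < 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return 1.0 - x ** n

def f(x: float) -> float:
    """Pointwise (l.s.c.) limit of Example 3."""
    return 1.0 if x < 1.0 else 0.0

def smallest_n(x: float, eps: float) -> int:
    """Smallest n >= 1 with x**n < eps, assuming 0 < x < 1 and 0 < eps < 1."""
    n = 1
    while x ** n >= eps:
        n += 1
    return n

def lower_gap(n: int, points) -> float:
    """Largest value of f(x) - f_n(x) over the given points."""
    return max(f(x) - f_n(x, n) for x in points)

if __name__ == "__main__":
    for x in (0.5, 0.9, 0.99, 0.999):
        # The required index grows without bound as x approaches 1.
        print(x, smallest_n(x, 0.1))
    near_one = [1.0 - 10.0 ** (-k) for k in range(1, 8)]
    # Even for n = 1000 the gap on points close to 1 stays close to 1.
    print(lower_gap(1000, near_one))
```

The linear search in `smallest_n` is deliberately naive; it mirrors the bound $n > \log \varepsilon / \log x$ without relying on floating-point logarithms.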
According to this example we need some further assumptions in addition to EP + or
EQL to achieve LU. One possibility is given in
COROLLARY 6. If the limit f is continuous, then EQL implies LU.
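PROOF. For any compact set $D$ and $y \in D$ we have, analogous to the proof of EQ ⇒ U in Theorem 1, that $y \in U(x_j, \varepsilon)$ for some $j$ and
$$f(y) - f_\nu(y) = [f(y) - f(x_j)] + [f(x_j) - f_\nu(x_j)] + [f_\nu(x_j) - f_\nu(y)], \tag{15}$$
where $D \subset \bigcup_{i=1}^{k} U(x_i, \varepsilon)$ and now $U(x_i, \varepsilon) = V(x_i, \varepsilon) \cap W(x_i, \varepsilon)$ such that $|f(y) - f(x_i)| < \varepsilon \ \forall y \in V(x_i, \varepsilon)$ due to the assumed continuity of $f$, and $f_\nu(y) - f_\nu(x_i) > -\varepsilon \ \forall y \in W(x_i, \varepsilon),\ \nu \ge N(x_i, \varepsilon)$ due to EQL. Hence we get in (15)
$f(y) - f(x_j) < \varepsilon$, since $y \in U(x_j, \varepsilon) \subset V(x_j, \varepsilon)$,
$f(x_j) - f_\nu(x_j) < \varepsilon \ \forall \nu \ge M_j$ by the pointwise convergence of $\{f_\nu\}$ due to EQL,
$f_\nu(x_j) - f_\nu(y) < \varepsilon$ since $y \in U(x_j, \varepsilon) \subset W(x_j, \varepsilon)$, $\forall \nu \ge N(x_j, \varepsilon)$ due to EQL,
implying
$$f(y) - f_\nu(y) < 3\varepsilon \quad \forall \nu \ge \max_{1 \le i \le k} [N(x_i, \varepsilon); M_i].$$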
Example 3 shows that it would not be sufficient just to assume a monotonically increasing
sequence of l.s.c. functions fν converging to a l.s.c. limit f to assure LU. However it can
be helpful to recall the classical
COROLLARY 7. If the monotonically increasing sequence {fν } of l.s.c. functions
converges pointwise to the continuous function f , this implies U.
PROOF. According to the assumptions $h_\nu = f_\nu - f$, $\nu = 1, 2, \ldots$, is a monotonically increasing sequence of l.s.c. functions converging to $h(x) \equiv 0$. Hence Dini's theorem asserts uniform convergence on any compact set $D$.
COROLLARY 8. If the monotonically increasing sequence $\{f_\nu\}$ converges pointwise to $f$, and if either $f_\nu$, $\nu = 1, 2, \ldots$, or $f$ are l.s.c. functions, this implies EP.
PROOF. From the assumed monotonicity follows for the epigraphs $\operatorname{epi} f_{\nu+1} \subset \operatorname{epi} f_\nu$, $\nu = 1, 2, \ldots$, which in turn implies (with cl indicating closure) $\lim_\nu \operatorname{epi} f_\nu = \operatorname{cl}\{\bigcap_{\nu=1}^{\infty} \operatorname{epi} f_\nu\}$. Furthermore the assumed monotonicity and convergence imply $\operatorname{epi} f = \bigcap_{\nu=1}^{\infty} \operatorname{epi} f_\nu$. Finally the assumed lower semicontinuity implies either $\operatorname{epi} f$ or $\operatorname{epi} f_\nu$, $\nu = 1, 2, \ldots$ (and hence $\bigcap_{\nu=1}^{\infty} \operatorname{epi} f_\nu$) to be closed and therefore $\operatorname{epi} f = \lim_\nu \operatorname{epi} f_\nu$, i.e. EP according to the remark following Corollary 3.
¹ For Example 3 I am indebted to H. Ammann.
Summing up, pointwise convergence together with EQL or LU or the additional assumptions of Corollaries 7 and 8 yield EP and hence the applicability of Corollary 3.
5 Concluding remarks.
As we have seen there are close relations between the epi-convergence concept and the
classical uniform convergence setup equivalent to the continuous convergence concept
leading—by a slight relaxation appropriate for dealing with optimization problems—to
epi-convergence again.
Nevertheless epi-convergence appears as the suitable concept in dealing with the approximation of optimization problems. And in some cases, e.g. penalties fν with barrier functions failing to converge pointwise to f on the boundary of the feasible set, epi-convergence is the only one of the discussed concepts that is satisfied.
But being aware of the relations to a classical area of analysis familiar to us for a long
time might be helpful for a better understanding and feeling in applying a new concept
like epi-convergence.
Acknowledgement I owe thanks to the referee for helpful improvements of a first
draft of this paper.
References
[A] Classical Analysis Concepts
[A1] Royden, H.L. (1963). Real Analysis. Macmillan, New York
[B] Approximation of Optimization Problems
[B1] Kanniappan, P. and Sastry, S.M.A. (1983). Uniform Convergence of Convex Optimization Problems. J. Math. Anal. Appl. 96 1-12.
[B2] Kosmol, P. (1974). Algorithmen zur konvexen Optimierung. Methods Oper. Res. 18
176-186.
[B3] Römisch, W. (1981). An Approximation Method for Stochastic Optimization and
Control. Preprint, Humboldt-Univ., Berlin.
[C] Approximation of Optimization Problems
[C1] Attouch, H. and Wets, R.J.-B. (1981). Approximation and Convergence in Nonlinear
Optimization. NLP 4. Mangasarian, Meyer, and Robinson (Eds.), Academic Press,
New York, 367-394.
[C2] — and —. (1983) A Convergence Theory for Saddle Functions. Trans. Amer. Math.
Soc. 280 1-41.
[C3] Birge, J. and Wets, R.J.-B. (1983). Designing Approximation Schemes for Stochastic
Optimization Problems, in Particular for Stochastic Programs with Recourse. IIASA Working Paper WP-83-111.
[C4] Mosco, U. (1969). Convergence of Convex Sets and of Solutions of Variational Inequalities. Adv. in Math 3 510-585.
[C5] Salinetti, G. and Wets, R.J.-B. (1977). On the Relations between Two Types of
Convergence for Convex Functions. J. Math. Anal. Appl. 60 211-226.
[C6] — and —. (1979). On the Convergence of Sequences of Convex Sets in Finite Dimensions. SIAM Review 21 18-33.
[C7] Wets, R.J.-B. (1980). Convergence of Convex Functions, Variational Inequalities
and Convex Optimization Problems. Variational Inequalities and Complementarity
Problems. R. Cottle, F. Giannessi and J.-L. Lions (Eds.), Wiley and Sons, New York,
375-403.
[C8] —. (1983). Stochastic Programming: Solution Techniques and Approximation
Schemes. Mathematical Programming: The State-of-the-Art 1982. A Bachem, M.
Grötschel and B. Korte (Eds.), Springer-Verlag, Berlin, 566-603.
[C9] Wijsman, R.A. (1966). Convergence of Sequences of Convex Sets, Cones and Functions. II. Trans. Amer. Math. Soc. 123 32-45.