
LECTURES ON PARAMETRIC OPTIMIZATION: AN INTRODUCTION
GEORG STILL, UNIVERSITY OF TWENTE
Abstract: These are introductory lectures on parametric programming. In the first five sections the local stability of the problems is studied under assumptions such that the Implicit Function Theorem can be applied. In the last section more general results are discussed.

Date: November 6, 2006.
CONTENTS
1. Parametric equations
2. Parametric unconstrained optimization
2.1. Non-parametric minimization
2.2. Parametric minimization
3. Parametric constrained optimization
3.1. Non-parametric programs
3.2. Parametric programs
4. Linear parametric programs
4.1. Non-parametric linear programs
4.2. Parametric linear programs
5. Applications
5.1. Interior point methods
5.2. A parametric location problem
6. Pathfollowing in practice
7. General parametric programming
References
1. PARAMETRIC EQUATIONS
Consider the non-parametric equation: Solve

$F(x) := x^2 - 2x - 1 = 0, \quad \text{or equivalently} \quad (x-1)^2 - 2 = 0,$

with solutions $x_{1,2} = 1 \pm \sqrt{2}$.
The parametric version is: For $t \in \mathbb{R}$ find a solution $x = x(t)$ of

(1) $F(x, t) := x^2 - 2t^2 x - t^4 = 0, \quad \text{or equivalently} \quad (x - t^2)^2 - 2t^4 = 0.$
The solutions are

$x_{1,2}(t) = t^2 \pm \sqrt{2}\, t^2 = t^2 (1 \pm \sqrt{2}).$
Ex. 1. Sketch the solution curve in the $(x, t)$-space.
At $t = 0$ with solution $x = 0$ we find for the gradient of $F$

$\nabla F(x, t) = \begin{pmatrix} 2x - 2t^2 \\ -4tx - 4t^3 \end{pmatrix} \bigg|_{(x,t)=(0,0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$
Rule. The solution set of $F(x, t) = 0$, where $F : \mathbb{R}^2 \to \mathbb{R}$, is "normally" (locally) given by a one-dimensional solution curve $(x(t), t)$. However, at points $(x, t)$ where $\nabla F(x, t) = 0$ holds, the solution set has a singularity (such as a bifurcation or a nonsmoothness).
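The following small Python sketch (not part of the original notes) evaluates $F$ and $\nabla F$ along the two explicit solution branches and confirms that the gradient vanishes exactly at the singular point $(x, t) = (0, 0)$, where the branches meet:

```python
import numpy as np

def F(x, t):
    # Parametric equation (1): F(x, t) = x^2 - 2 t^2 x - t^4
    return x**2 - 2*t**2*x - t**4

def grad_F(x, t):
    # Gradient (F_x, F_t) = (2x - 2t^2, -4tx - 4t^3)
    return np.array([2*x - 2*t**2, -4*t*x - 4*t**3])

# The two explicit solution branches x_{1,2}(t) = t^2 (1 +/- sqrt(2))
for t in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    for sign in (+1, -1):
        x = t**2 * (1 + sign*np.sqrt(2))
        print(f"t={t:+.1f}  x={x:+.4f}  F={F(x, t):+.2e}  grad={grad_F(x, t)}")
# At (x, t) = (0, 0) both components of the gradient vanish: the
# singular point where the two branches intersect.
```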
Two versions of the Implicit Function Theorem. We consider systems of $n$ equations in $n + p$ variables:

$F(x, t) = 0, \quad \text{where} \quad F : \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^n, \ (x, t) \in \mathbb{R}^n \times \mathbb{R}^p.$
The Implicit Function Theorem (IFT) makes a statement on the structure of the solution set in the “normal” situation.
Theorem 1. (IFT for one equation)
Let $F : \mathbb{R}^k \to \mathbb{R}$ be a $C^1$-function. Suppose for $\bar y \in \mathbb{R}^k$ we have $F(\bar y) = 0$ and $\nabla F(\bar y) \ne 0$. Then near $\bar y$ the solution set $S(F) := \{y \in \mathbb{R}^k \mid F(y) = 0\}$ is a $C^1$-manifold of dimension $k - 1$. Moreover at $\bar y$

$\nabla F(\bar y) \perp S(F)$

and the gradient $\nabla F(\bar y)$ points into the region where $F(y) > F(\bar y)$.
Ex. 2. Give a geometrical sketch of the statement.
Theorem 2. (General version of the IFT)
Let $F : \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^n$ be a $C^1$-function $F(x, t)$ with $(x, t) \in \mathbb{R}^n \times \mathbb{R}^p$. Suppose for $(\bar x, \bar t) \in \mathbb{R}^n \times \mathbb{R}^p$ we have $F(\bar x, \bar t) = 0$ and the matrix $\nabla_x F(\bar x, \bar t)$ is nonsingular. Then in a neighborhood $U_t(\bar t)$ of $\bar t$ the solution set $S(F) := \{(x, t) \mid F(x, t) = 0\}$ is described by a $C^1$-function $x : U_t(\bar t) \to \mathbb{R}^n$ such that $x(\bar t) = \bar x$ and

$F(x(t), t) = 0 \quad \text{for } t \in U_t(\bar t).$

(So, locally, $S(F)$ is a $p$-dimensional $C^1$-manifold.) Moreover the gradient $\nabla x(t)$ is given by

$\nabla x(t) = -[\nabla_x F(x(t), t)]^{-1}\, \nabla_t F(x(t), t) \quad \text{for } t \in U_t(\bar t).$
Proof. See e.g. [7]. Note that if $x(t)$ is a $C^1$-function satisfying $F(x(t), t) = 0$ then by differentiation w.r.t. $t$ we find

$\nabla_x F(x(t), t)\, \nabla x(t) + \nabla_t F(x(t), t) = 0.$
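As a quick sanity check (a Python sketch using example (1), so $n = p = 1$), the IFT formula $\nabla x(t) = -[\nabla_x F]^{-1} \nabla_t F$ can be compared with a finite-difference derivative along the known branch $x_1(t) = t^2(1 + \sqrt{2})$:

```python
import numpy as np

def x_branch(t):
    # The branch x_1(t) = t^2 (1 + sqrt(2)) through (x, t) = (1 + sqrt(2), 1)
    return t**2 * (1 + np.sqrt(2))

t0 = 1.0
x0 = x_branch(t0)
Fx = 2*x0 - 2*t0**2          # dF/dx at (x0, t0)
Ft = -4*t0*x0 - 4*t0**3      # dF/dt at (x0, t0)
dx_ift = -Ft / Fx            # IFT formula: x'(t) = -Fx^{-1} Ft

h = 1e-6
dx_fd = (x_branch(t0 + h) - x_branch(t0 - h)) / (2*h)  # central difference
print(f"x'(t) via IFT: {dx_ift:.6f}, via finite differences: {dx_fd:.6f}")
```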
2. PARAMETRIC UNCONSTRAINED OPTIMIZATION
2.1. Non-parametric minimization. We assume that $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^2$-function. The point $\bar x \in \mathbb{R}^n$ is a local minimizer of $f$ if there is some $\varepsilon > 0$ such that

$f(\bar x) \le f(x) \quad \text{for all } x \text{ satisfying } \|x - \bar x\| < \varepsilon.$

It is called a strict local minimizer if $f(\bar x) < f(x)$ for all $x \ne \bar x$, $\|x - \bar x\| < \varepsilon$.
Theorem 3. (Necessary and sufficient optimality conditions) (see e.g. [4])
(a) If $\bar x$ is a local minimizer of $f$ then

$\nabla f(\bar x) = 0 \quad \text{and} \quad \nabla^2 f(\bar x) \ge 0 \ \text{(positive semidefinite).}$

(b) If $\bar x$ satisfies

$\nabla f(\bar x) = 0 \quad \text{and} \quad \nabla^2 f(\bar x) > 0 \ \text{(positive definite)}$

then $\bar x$ is a strict local minimizer of $f$.
2.2. Parametric minimization. Let $f(x, t)$ be a $C^2$-function, $f : \mathbb{R}^n \times T \to \mathbb{R}$, where $T \subset \mathbb{R}^p$ is open. We consider the parametric problem: for $t \in T$ find a (local) solution $x = x(t)$ of

(2) $P(t): \quad \min_{x \in \mathbb{R}^n} f(x, t).$

To solve this problem, for $t \in T$, we have to find solutions $x$ of the critical point equation

(3) $F(x, t) := \nabla_x f(x, t) = 0.$
The next examples show the possible bad behavior.
Ex. 3. For

$P(t): \quad \min f(x, t) := \tfrac{1}{3} x^3 - t^2 x$

the minimizer is given by $x(t) = |t|$ with minimal value $v(t) := f(x(t), t) = -\tfrac{2}{3} |t|^3$.

Ex. 4. For

$P(t): \quad \min f(x, t) := \tfrac{1}{3} x^3 - t^2 x^2 - t^4 x$

the critical points are given by the curves $x_{1,2}(t) = t^2 (1 \pm \sqrt{2})$ and the minimizer by $x_1(t)$.
Rule. The following appears:
• The value function $v(t) = f(x(t), t)$ may behave "smoother" than the minimizer function $x(t)$.
• A singular behavior appears at solution points $(\bar x, \bar t)$ of $\nabla_x f(x, t) = 0$ where the matrix $\nabla_x^2 f(\bar x, \bar t)$ is singular.

Ex. 5. Check the singular behavior for the examples Ex. 3 and Ex. 4.
The next theorem describes the situation near a non-singular solution $(\bar x, \bar t)$ of (3).

Theorem 4. (Local stability result) Let $\bar x$ be a (local) solution of $P(\bar t)$, $\bar t \in T$, such that

$\nabla_x f(\bar x, \bar t) = 0 \quad \text{and} \quad \nabla_x^2 f(\bar x, \bar t) > 0 \ \text{(pos. def.).}$

Then in a neighborhood $U_t(\bar t)$ there is a $C^1$-function $x : U_t(\bar t) \to \mathbb{R}^n$ such that $x(\bar t) = \bar x$ and for any $t \in U_t(\bar t)$, $x(t)$ is a strict local minimizer of $P(t)$. Moreover for $t \in U_t(\bar t)$,

$\nabla x(t) = -[\nabla_x^2 f(x(t), t)]^{-1}\, \nabla_{xt}^2 f(x(t), t),$

and the value function $v(t) := f(x(t), t)$ is a $C^2$-function with

$\nabla v(t) = \nabla_t f(x(t), t) \quad \text{and} \quad \nabla^2 v(t) = \nabla_{tx}^2 f(x(t), t)\, \nabla x(t) + \nabla_t^2 f(x(t), t).$

Proof. Apply the IFT to the equation $\nabla_x f(x, t) = 0$.
Rem. Note that $x(t)$ is $C^1$ but $v(t)$ is $C^2$.
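For Ex. 4 the statement can be verified numerically along the minimizing branch; the sketch below checks $\nabla v(t) = \nabla_t f(x(t), t)$ (the term $\nabla_x f \cdot \nabla x(t)$ vanishes since $x(t)$ is a critical point):

```python
import numpy as np

def f(x, t):
    # Ex. 4: f(x, t) = x^3/3 - t^2 x^2 - t^4 x
    return x**3/3 - t**2*x**2 - t**4*x

def x_min(t):
    # Minimizing branch x_1(t) = t^2 (1 + sqrt(2)) (valid for t != 0)
    return t**2 * (1 + np.sqrt(2))

def v(t):
    return f(x_min(t), t)

t0, h = 1.0, 1e-6
dv_fd = (v(t0 + h) - v(t0 - h)) / (2*h)       # finite-difference v'(t0)
x0 = x_min(t0)
df_dt = -2*t0*x0**2 - 4*t0**3*x0              # partial derivative f_t at (x0, t0)
print(f"v'(t) numerically: {dv_fd:.6f}, f_t(x(t), t): {df_dt:.6f}")
# Theorem 4: v'(t) = f_t(x(t), t); the chain-rule term f_x * x'(t) drops
# out because f_x(x(t), t) = 0 along the critical point curve.
```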
3. PARAMETRIC CONSTRAINED OPTIMIZATION
3.1. Non-parametric programs. We consider nonlinear programs of the form (omitting equality constraints)

(4) $P: \quad \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad x \in \mathcal{F} := \{x \mid g_j(x) \le 0,\ j \in J\}$

with index set $J = \{1, \ldots, m\}$. The set $\mathcal{F}$ is called the feasible set. We again assume $f, g_j \in C^2$.

Def. A point $\bar x \in \mathcal{F}$ is called a local minimizer of $P$ of order $s = 1$ or $s = 2$ if there are constants $c, \varepsilon > 0$ such that

$f(x) - f(\bar x) \ge c \|x - \bar x\|^s \quad \forall x \in \mathcal{F},\ \|x - \bar x\| < \varepsilon.$

It is a global minimizer if $f(x) \ge f(\bar x)$ holds $\forall x \in \mathcal{F}$.
For $\bar x \in \mathcal{F}$ we introduce the active index set

$J_0(\bar x) = \{j \in J \mid g_j(\bar x) = 0\}$
and the Lagrangian function

$L(x, \lambda) = f(x) + \sum_{j \in J_0(\bar x)} \lambda_j\, g_j(x).$

The coefficients $\lambda_j$ are called Lagrangian multipliers. We say that the Linear Independence Constraint Qualification (LICQ) is satisfied at $\bar x \in \mathcal{F}$ if

$\nabla g_j(\bar x),\ j \in J_0(\bar x), \ \text{are linearly independent.}$

The next theorem gives the famous Karush-Kuhn-Tucker (KKT) sufficient optimality conditions.
Theorem 5. (Sufficient optimality conditions) Let $\bar x \in \mathcal{F}$ satisfy LICQ.
(a) (Order one) Let with multipliers $\bar\lambda_j$ the KKT condition

(5) $\nabla_x L(\bar x, \bar\lambda) = \nabla f(\bar x) + \sum_{j \in J_0(\bar x)} \bar\lambda_j \nabla g_j(\bar x) = 0, \qquad \bar\lambda_j \ge 0,\ j \in J_0(\bar x),$

be satisfied such that

$\bar\lambda_j > 0 \ \ \forall j \in J_0(\bar x) \qquad \text{(strict complementarity (SC))}$

and $|J_0(\bar x)| = n$. Then $\bar x$ is a local minimizer of $P$ of order $s = 1$.
(b) (Order two) Let with multipliers $\bar\lambda_j$ the KKT condition (5) be satisfied such that (SC) holds and the second order condition (SOC) is fulfilled:

$\text{SOC:} \quad d^T \nabla_x^2 L(\bar x, \bar\lambda)\, d > 0 \quad \forall d \in T_{\bar x} \setminus \{0\},$

where $T_{\bar x}$ is the tangent space $T_{\bar x} = \{d \mid \nabla g_j(\bar x)\, d = 0,\ j \in J_0(\bar x)\}$. Then $\bar x$ is a local minimizer of $P$ of order $s = 2$.
Ex. 6. Show that the point $\bar x = 0$ is the minimizer of order $s = 1$ of the problem

$\min x_2 \quad \text{s.t.} \quad g_1(x) := e^{-x_1} - x_2 - 1 \le 0, \quad g_2(x) := x_1 - x_2 \le 0.$
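A short computation (sketched in Python below) confirms the claim: both constraints are active at $\bar x = 0$, LICQ holds, and the multipliers solving the KKT system (5) are strictly positive, so Theorem 5(a) applies with $|J_0(\bar x)| = 2 = n$:

```python
import numpy as np

# Ex. 6: min x2  s.t.  g1(x) = exp(-x1) - x2 - 1 <= 0,  g2(x) = x1 - x2 <= 0
x_bar = np.zeros(2)
grad_f = np.array([0.0, 1.0])
grad_g1 = np.array([-np.exp(-x_bar[0]), -1.0])   # gradient of g1 at x_bar
grad_g2 = np.array([1.0, -1.0])                  # gradient of g2 at x_bar

# Both constraints are active at x_bar = 0; solve
#   grad_f + l1*grad_g1 + l2*grad_g2 = 0   for the multipliers (l1, l2).
G = np.column_stack([grad_g1, grad_g2])
lam = np.linalg.solve(G, -grad_f)
print("multipliers:", lam)                        # both 0.5 > 0: SC holds
print("LICQ (rank 2):", np.linalg.matrix_rank(G) == 2)
# |J0(x_bar)| = 2 = n, so by Theorem 5(a) x_bar is a minimizer of order 1.
```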
Rem. The KKT conditions can also be given in the equivalent (global) form:

(6) $\nabla_x L(\bar x, \bar\lambda) := \nabla f(\bar x) + \sum_{j \in J} \bar\lambda_j \nabla g_j(\bar x) = 0,$
$\bar\lambda_j \cdot g_j(\bar x) = 0,\ j \in J,$
$\bar\lambda_j,\ -g_j(\bar x) \ge 0,\ j \in J.$
3.2. Parametric programs. We consider nonlinear parametric programs of the form (omitting equality constraints): Let $T \subset \mathbb{R}^p$ be some open set. For $t \in T$ find local minimizers $x = x(t)$ of

(7) $P(t): \quad \min_{x \in \mathbb{R}^n} f(x, t) \quad \text{s.t.} \quad x \in \mathcal{F}(t) := \{x \mid g_j(x, t) \le 0,\ j \in J\}.$
For $t \in T$ and feasible $x \in \mathcal{F}(t)$ we denote by $J_0(x, t)$ the active index set and by $L(x, t, \lambda)$ the Lagrangian function

$L(x, t, \lambda) = f(x, t) + \sum_{j \in J_0(x, t)} \lambda_j\, g_j(x, t).$

To find (near $(\bar x, \bar t)$) local minimizers $x$ of $P(t)$ we are looking for solutions $(x, t, \lambda)$ of the KKT-equations (with $\lambda_j \ge 0$)

(8) $F(x, t, \lambda) := \begin{pmatrix} \nabla_x f(x, t) + \sum_{j \in J_0(\bar x, \bar t)} \lambda_j \nabla_x g_j(x, t) \\ g_j(x, t),\ j \in J_0(\bar x, \bar t) \end{pmatrix} = 0.$
From the sufficient optimality conditions in Theorem 5 we obtain
Theorem 6. (Local stability result) Let $\bar x \in \mathcal{F}(\bar t)$. Suppose that with multipliers $\bar\lambda_j$ the KKT condition $\nabla_x L(\bar x, \bar t, \bar\lambda) = 0$ is satisfied such that
(1) LICQ holds at $\bar x$ w.r.t. $\mathcal{F}(\bar t)$,
(2) $\bar\lambda_j > 0 \ \forall j \in J_0(\bar x, \bar t)$ (SC),
and either
(3a) (order one) $|J_0(\bar x, \bar t)| = n$
or
(3b) (order two)

$d^T \nabla_x^2 L(\bar x, \bar t, \bar\lambda)\, d > 0 \quad \forall d \in T_{\bar x, \bar t} \setminus \{0\},$

where $T_{\bar x, \bar t}$ is the tangent space $T_{\bar x, \bar t} = \{d \mid \nabla_x g_j(\bar x, \bar t)\, d = 0,\ j \in J_0(\bar x, \bar t)\}$.
(According to Theorem 5, $\bar x$ is a local minimizer of $P(\bar t)$ of order $s = 1$ in case (3a) and of order $s = 2$ in case (3b).)
Then there exist a neighborhood $U_t(\bar t)$ of $\bar t$ and $C^1$-functions $x : U_t(\bar t) \to \mathbb{R}^n$, $\lambda : U_t(\bar t) \to \mathbb{R}^{|J_0(\bar x, \bar t)|}$ such that $x(\bar t) = \bar x$, $\lambda(\bar t) = \bar\lambda$, and for any $t \in U_t(\bar t)$ the point $x(t)$ is a strict local minimizer of $P(t)$ (of order 1 in case (3a) and of order 2 in case (3b)) with corresponding multiplier $\lambda(t)$. Moreover for $t \in U_t(\bar t)$ the derivatives of $x(t), \lambda(t)$ and the value function $v(t) = f(x(t), t)$ are given by

$\begin{pmatrix} \nabla x(t) \\ \nabla \lambda(t) \end{pmatrix} = -[\nabla_{(x, \lambda)} F(x(t), t, \lambda(t))]^{-1}\, \nabla_t F(x(t), t, \lambda(t))$

and

$\nabla v(t) = \nabla_t L(x(t), t, \lambda(t)).$

Proof. Apply the IFT to the equation (8). □
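To see Theorem 6 at work, consider the toy problem $\min (x_1 - t)^2 + (x_2 - t)^2$ s.t. $x_1 + x_2 \le 0$ (invented for illustration, not from the notes). For $t > 0$ the constraint is active with multiplier $\lambda(t) = 2t > 0$ (SC), minimizer $x(t) = (0, 0)$ and $v(t) = 2t^2$; the sketch solves the KKT equations (8) and checks $\nabla v(t) = \nabla_t L = 4t$:

```python
import numpy as np

def kkt_solve(t):
    # KKT equations (8) for the active constraint x1 + x2 = 0:
    #   2(x1 - t) + lam = 0,  2(x2 - t) + lam = 0,  x1 + x2 = 0
    A = np.array([[2.0, 0.0, 1.0],
                  [0.0, 2.0, 1.0],
                  [1.0, 1.0, 0.0]])
    rhs = np.array([2*t, 2*t, 0.0])
    x1, x2, lam = np.linalg.solve(A, rhs)
    return np.array([x1, x2]), lam

t0, h = 1.0, 1e-6
x, lam = kkt_solve(t0)
v = lambda t: sum((kkt_solve(t)[0] - t)**2)
dv_fd = (v(t0 + h) - v(t0 - h)) / (2*h)
# Theorem 6: grad v(t) = grad_t L = -2(x1 - t) - 2(x2 - t) = 4t here.
print(f"lam = {lam:.3f} (> 0: SC), v'(t) fd = {dv_fd:.4f}, 4t = {4*t0:.4f}")
```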
Ex. 7. Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix and $B \in \mathbb{R}^{n \times m}$ ($n \ge m$). Suppose the matrix $B$ has full rank $m$ and the following holds:

$d^T A d \ne 0 \quad \forall d \in \mathbb{R}^n \setminus \{0\} \ \text{such that} \ B^T d = 0.$
Show that then the following matrix is nonsingular:

$\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix}.$
Rem. Many further (often difficult) results deal with generalizations of this stability result under weaker assumptions (i.e., when LICQ or SC does not hold); see e.g. [2], [3].
4. LINEAR PARAMETRIC PROGRAMS
4.1. Non-parametric linear programs. Let there be given a matrix $A \in \mathbb{R}^{m \times n}$ with $m$ rows $a_j^T$, $j \in J := \{1, \ldots, m\}$, and vectors $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$. We consider primal problems of the form

(9) $P: \quad \max c^T x \quad \text{s.t.} \quad x \in \mathcal{F}_P = \{x \mid a_j^T x \le b_j,\ j \in J\}.$
We often write the feasible set in compact form $\mathcal{F}_P = \{x \mid Ax \le b\}$. The problem

(10) $D: \quad \min b^T y \quad \text{s.t.} \quad y \in \mathcal{F}_D = \{y \mid \sum_{j=1}^m y_j a_j = c,\ y \ge 0\}$

is called the dual problem. A vector $x \in \mathcal{F}_P$ is called feasible for $P$ and a vector $y$ satisfying the feasibility conditions $A^T y = c$, $y \ge 0$ is called feasible for $D$. Let $v_P$ resp. $v_D$ denote the maximum value of $P$ resp. the minimum value of $D$.
Lemma 1. (Weak duality) Let $x$ be feasible for $P$ and $y$ be feasible for $D$. Then

$c^T x \le b^T y \quad \text{and thus} \quad v_P \le v_D.$

If $c^T x = b^T y$ holds then $x$ is a maximizer of $P$ and $y$ a minimizer of $D$.
Proof. As an exercise.
For $x \in \mathcal{F}_P$ as usual we define the active index set $J_0(x) = \{j \mid a_j^T x = b_j\}$. For a subset $J_0 \subset J$ we denote by $A_{J_0}$ the submatrix of $A$ with rows $a_j^T$, $j \in J_0$, and for $y \in \mathbb{R}^m$ by $y_{J_0}$ the vector $(y_j,\ j \in J_0)$.

Def. A feasible point $x \in \mathcal{F}_P$ is called a vertex of the polyhedron $\mathcal{F}_P$ if the vectors $a_j$, $j \in J_0(x)$, span $\mathbb{R}^n$, or equivalently if $A_{J_0(x)}$ has rank $n$. (This implies $|J_0(x)| \ge n$.) The vertex $x$ is called nondegenerate if LICQ holds, i.e., $a_j$, $j \in J_0(x)$, are linearly independent (implying $|J_0(x)| = n$, or equivalently that $A_{J_0(x)}$ is nonsingular).
Theorem 7.
(a) (Existence and strong duality) If both $P$ and $D$ are feasible then there exist solutions $\bar x$ of $P$ and $\bar y$ of $D$. Moreover for (any of) these solutions we have

$c^T \bar x = b^T \bar y \quad \text{and thus} \quad v_P = v_D.$
(b) (Optimality conditions) A point $x \in \mathcal{F}_P$ is a solution of $P$ if and only if there is a corresponding $y \in \mathcal{F}_D$ such that the complementarity conditions

$y^T (b - Ax) = 0 \quad \text{or} \quad y_j (b_j - a_j^T x) = 0 \ \forall j \in J$

hold, or equivalently if there exist $y_j$, $j \in J_0(x)$, such that the KKT conditions are satisfied:

$\sum_{j \in J_0(x)} y_j a_j = c, \qquad y_j \ge 0,\ j \in J_0(x).$
It appears that normally the solution of $P$ arises at a vertex of $\mathcal{F}_P$.

Lemma 2. If the polyhedron $\mathcal{F}_P$ has a vertex (at least one) and $v_P < \infty$, then the maximum value $v_P$ of $P$ is also attained at some vertex of $\mathcal{F}_P$.

Rem. The Simplex algorithm for solving $P$ proceeds from vertex to vertex of $\mathcal{F}_P$ until the optimality conditions in Theorem 7(b) are met.
4.2. Parametric linear programs. In a parametric LP we are given a $C^2$ matrix function $A(t) : \mathbb{R}^p \to \mathbb{R}^{m \times n}$ with $m$ rows $a_j^T(t)$, $j \in J := \{1, \ldots, m\}$, $C^2$ vector functions $b(t) : \mathbb{R}^p \to \mathbb{R}^m$, $c(t) : \mathbb{R}^p \to \mathbb{R}^n$, and an open parameter set $T \subset \mathbb{R}^p$. For any $t \in T$ we wish to solve the primal program

(11) $P(t): \quad \max c^T(t)\, x \quad \text{s.t.} \quad x \in \mathcal{F}_P(t) = \{x \mid A(t)\, x \le b(t)\}.$

The corresponding dual reads

(12) $D(t): \quad \min b^T(t)\, y \quad \text{s.t.} \quad y \in \mathcal{F}_D(t) = \{y \mid A^T(t)\, y = c(t),\ y \ge 0\}.$

For $t \in T$ and $x \in \mathcal{F}_P(t)$ the active index set is $J_0(x, t)$.
Linear Production Model and Shadow Prices
We discuss an interesting first parametric result. Assume a factory produces $n$ different products $P_1, \ldots, P_n$. The production relies on material coming from $m$ different resources $R_1, \ldots, R_m$ in such a way that the production of 1 unit of a product $P_j$ requires $a_{ij}$ units of resource $R_i$, for $i = 1, \ldots, m$.
Suppose we can sell our production for the price of $c_j$ per unit of $P_j$ and that $b_i$ units of each resource $R_i$ are available for the total production. How many units $x_j$ of each product $P_j$ should we produce in order to maximize the total revenue from the sales?
An optimal production plan $\bar x = (\bar x_1, \ldots, \bar x_n)^T$ corresponds to an optimal solution of the linear program

(13) $P: \quad \max c^T x \quad \text{s.t.} \quad Ax \le b,\ x \ge 0.$
Here $A = (a_{ij})$ is the matrix with the elements $a_{ij}$. Let $\bar x$ be a solution with corresponding solution $\bar y$ of the dual problem

(14) $D: \quad \min b^T y \quad \text{s.t.} \quad A^T y \ge c,\ y \ge 0,$

and maximum profit $\bar z = c^T \bar x = b^T \bar y$.
Could we possibly increase the profit by spending money on increasing the resource capacity $b$ and adjusting the production plan? If so, how much would we be willing to pay for 1 more unit of resource $R_i$?
If we increase (for fixed $i$) the capacity of $R_i$ from $b_i$ to $b_i' = b_i + t$, then $\bar y$ is still feasible for the new dual. So weak duality (see Lemma 1) allows us to compute the following upper bound on the new optimal profit $z'$:

$z' \le b_1 \bar y_1 + \ldots + b_i' \bar y_i + \ldots + b_m \bar y_m = t \cdot \bar y_i + \sum_{s=1}^m b_s \bar y_s = t \cdot \bar y_i + \bar z.$

Accordingly, we would not want to pay more than $t \cdot \bar y_i$ for $t$ more units of $R_i$. In this sense, the coefficients $\bar y_i$ of the dual optimal solution $\bar y$ can be interpreted as the shadow prices of the resources $R_i$.
Note that for the value function $v(t)$ in dependence on the parameter $t$ we find

$\frac{v(t) - v(0)}{t} = \frac{z' - \bar z}{t} \le \bar y_i.$

REMARK. The notion of shadow prices also furnishes an intuitive interpretation of complementary slackness. If the slack $\bar s_i = b_i - \sum_{j=1}^n a_{ij} \bar x_j$ is strictly positive at the optimal production $\bar x$, we do not use resource $R_i$ to its full capacity. Therefore, we would expect no gain from an increase of $R_i$'s capacity.
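As an illustration, here is a minimal numerical sketch (the production data are invented for this example) using scipy.optimize.linprog; it assumes a SciPy version whose HiGHS-based linprog returns dual information in res.ineqlin.marginals. The shadow prices obtained from the dual match the observed profit gain from one extra unit of a resource:

```python
import numpy as np
from scipy.optimize import linprog

# Toy production data (invented): 2 resources, 3 products.
A = np.array([[2.0, 1.0, 1.0],    # units of R1 per unit of P1, P2, P3
              [1.0, 3.0, 2.0]])   # units of R2
b = np.array([100.0, 90.0])       # resource capacities
c = np.array([3.0, 5.0, 4.0])     # selling prices

def max_profit(b_vec):
    # linprog minimizes, so pass -c to maximize c^T x s.t. Ax <= b, x >= 0
    return linprog(-c, A_ub=A, b_ub=b_vec, bounds=[(0, None)]*3,
                   method="highs")

res = max_profit(b)
y_bar = -res.ineqlin.marginals    # dual solution = shadow prices
print("profit:", -res.fun, "shadow prices:", y_bar)

# Increasing the capacity of R1 by t = 1 raises the profit by about y_bar[0]:
t = 1.0
print("profit gain:", -max_profit(b + np.array([t, 0.0])).fun - (-res.fun))
```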
General case. We come back to the general parametric LP in (11). Suppose for $\bar t \in T$ the point $\bar x \in \mathcal{F}_P(\bar t)$ is a vertex solution of $P(\bar t)$. To find for $t$ near $\bar t$ solutions $x(t)$ of $P(t)$ we have to find feasible solutions $x$ and $y$ of the system of optimality conditions:

(15) $P: \quad A_{J_0(\bar x, \bar t)}(t)\, x = b_{J_0(\bar x, \bar t)}(t)$

and

(16) $D: \quad A_{J_0(\bar x, \bar t)}^T(t)\, y_{J_0(\bar x, \bar t)} = c(t), \qquad y_{J_0(\bar x, \bar t)} \ge 0.$

If $\bar x$ is a nondegenerate vertex, i.e., $A_{J_0(\bar x, \bar t)}(\bar t)$ is nonsingular, this is possible by applying the IFT to these systems. So the next statement represents a special case of Theorem 6, case (3a).
Theorem 8. (Local stability result) Let $\bar x \in \mathcal{F}_P(\bar t)$ be a vertex solution of $P(\bar t)$ with corresponding dual solution $\bar y_{J_0}$ (we abbreviate $J_0 := J_0(\bar x, \bar t)$) such that
(1) $\bar x$ is a nondegenerate vertex, i.e., LICQ holds,
(2) $\bar y_j > 0 \ \forall j \in J_0$ (SC).
(According to Theorem 5, $\bar x$ is a local minimizer of $P(\bar t)$ of order $s = 1$.) Then there exist a neighborhood $U_t(\bar t)$ and $C^1$-functions $x : U_t(\bar t) \to \mathbb{R}^n$, $y_{J_0} : U_t(\bar t) \to \mathbb{R}^{|J_0|}$ such that $x(\bar t) = \bar x$, $y_{J_0}(\bar t) = \bar y_{J_0}$, and for any $t \in U_t(\bar t)$ the point $x(t)$ is a vertex solution of $P(t)$ (of order $s = 1$) with corresponding multiplier $y_{J_0}(t)$. Moreover for $t \in U_t(\bar t)$ the derivatives of $x(t)$ and of the value function $v(t) = c^T(t)\, x(t)$ are given by

$\nabla x(t) = [A_{J_0}(t)]^{-1} \big( \nabla b_{J_0}(t) - \nabla A_{J_0}(t)\, x(t) \big)$

and

$\nabla v(t) = [\nabla c(t)]^T x(t) + [y_{J_0}(t)]^T \big[ \nabla b_{J_0}(t) - \nabla A_{J_0}(t)\, x(t) \big].$

Rem. The production model above is the special case where $A, c$ do not depend on $t \in \mathbb{R}$ (so $p = 1$) and (for fixed $i \in J_0$) $\frac{d}{dt} b_{J_0}(t) = e_i$ ($e_i$ the $i$-th unit vector in $\mathbb{R}^{|J_0|}$), so that

$\frac{d}{dt} v(t) = [y_{J_0}(t)]^T e_i = y_i(t).$
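The formulas of Theorem 8 are easy to test numerically. The following sketch (a made-up $2 \times 2$ instance with $p = 1$, $A$ and $c$ constant, $b(t) = b + t e_1$) solves the systems (15) and (16) at a nondegenerate vertex and compares $\nabla v(t) = y^T e_1$ with a finite-difference quotient:

```python
import numpy as np

# Tiny parametric LP (invented): max c^T x  s.t.  A x <= b(t), where both
# constraints are active at the optimal vertex, so J0 = {1, 2}, A_J0 = A.
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([2.0, 0.0])
c = np.array([1.0, 0.5])

def vertex_and_value(t):
    bt = b + t*np.array([1.0, 0.0])
    x = np.linalg.solve(A, bt)            # (15): A_J0 x = b_J0(t)
    return x, c @ x

y = np.linalg.solve(A.T, c)               # (16): A_J0^T y = c
print("dual vertex y:", y)                # y > 0: SC, the vertex stays optimal

t0, h = 0.0, 1e-6
_, v_plus = vertex_and_value(t0 + h)
_, v_minus = vertex_and_value(t0 - h)
print("v'(0) fd:", (v_plus - v_minus)/(2*h), " formula y^T e1 =", y[0])
```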
5. APPLICATIONS
5.1. Interior point methods. The basic idea of the interior point method for solving a non-parametric program

$P: \quad \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad g_j(x) \le 0,\ j \in J,$

is simply as follows. Consider the perturbed KKT system

(17) $F(x, \lambda; \mu) := \begin{cases} \nabla f(x) + \sum_{j \in J} \lambda_j \nabla g_j(x) = 0 \\ -\lambda_j \cdot g_j(x) = \mu,\ j \in J, \end{cases} \qquad \lambda_j,\ -g_j(x) \ge 0,\ j \in J,$

where $\mu > 0$ is a perturbation parameter. The idea is to find solutions $x(\mu)$ and $\lambda_j(\mu)$ of this system (satisfying $-g_j(x(\mu)), \lambda_j(\mu) > 0$) and to let $\mu \downarrow 0$. We expect that $x(\mu)$ converges to a solution $\bar x$ of $P$. Under the sufficient optimality conditions this procedure is well-defined (at least for small $\mu$).
Theorem 9. Let $\bar x$ be a local minimizer of $P$ such that the sufficient optimality conditions of Theorem 5 are fulfilled with multiplier $\bar\lambda$ so that $\bar\lambda_j > 0$ holds for all $j \in J_0(\bar x)$. Then there exist $C^1$-functions $x : (-\delta, \delta) \to \mathbb{R}^n$, $\lambda_j : (-\delta, \delta) \to \mathbb{R}$, $j \in J_0(\bar x)$ ($\delta > 0$), such that $x(0) = \bar x$, $\lambda_j(0) = \bar\lambda_j$, and $x(\mu), \lambda_j(\mu)$ are locally unique solutions of (17).

Proof. Follows by applying the IFT to the equation (17). By Ex. 7, the Jacobian

$\nabla F(x, \lambda) = \begin{pmatrix} \nabla_x^2 L(x, \lambda) & \nabla^T g(x) \\ -\mathrm{diag}(\lambda)\, \nabla g(x) & -\mathrm{diag}(g(x)) \end{pmatrix}$

is invertible at $(x, \lambda) = (\bar x, \bar\lambda)$. □
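The following sketch traces the central path for a one-dimensional toy problem (chosen for illustration; it is not from the notes): $\min (x+1)^2$ s.t. $-x \le 0$, with solution $\bar x = 0$ and multiplier $\bar\lambda = 2$. For each fixed $\mu$ the perturbed system (17) is solved by Newton's method, and $\mu$ is driven towards $0$:

```python
import numpy as np

# Interior-point sketch for: min (x+1)^2  s.t.  g(x) = -x <= 0.
# Perturbed KKT system (17): 2(x+1) - lam = 0,  -lam*g(x) = lam*x = mu.

def newton_step(z, mu):
    x, lam = z
    F = np.array([2*(x+1) - lam, lam*x - mu])
    J = np.array([[2.0, -1.0],
                  [lam,  x  ]])
    return z - np.linalg.solve(J, F)

z = np.array([1.0, 1.0])                 # strictly feasible start: x, lam > 0
for mu in [1.0, 0.1, 0.01, 0.001]:
    for _ in range(20):                  # Newton iterations at fixed mu
        z = newton_step(z, mu)
    print(f"mu={mu:7.3f}  x(mu)={z[0]:.6f}  lam(mu)={z[1]:.6f}")
# As mu -> 0 the iterates converge to (x_bar, lam_bar) = (0, 2).
```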
5.2. A parametric location problem. We consider a concrete location problem in Germany (see [6] for details). Suppose some good is produced at 5 existing plants at locations $s^j = (s_1^j, s_2^j)$, $j = 1, \ldots, 5$ ($s_1^j$ longitude, $s_2^j$ latitude in Germany), and a sixth new plant has to be built at a location $t = (t_1, t_2)$ (to be determined) to satisfy the demands of $V_i$ units of the good in 99 towns $i$ at locations $\ell^i = (\ell_1^i, \ell_2^i)$, $i = 1, \ldots, 99$. Suppose (for simplicity) that the transportation cost $c_{ij}$ from plant $j$ to town $i$ (per unit of the good) is (proportional to) the Euclidean distance

$c_{ij} = \sqrt{(s_1^j - \ell_1^i)^2 + (s_2^j - \ell_2^i)^2},\ j = 1, \ldots, 5, \qquad c_{i6}(t) = \sqrt{(t_1 - \ell_1^i)^2 + (t_2 - \ell_2^i)^2}, \quad \forall i.$

Suppose further that the total demand $V = \sum_i V_i$ will be produced in the 6 plants with a production of $p_j$ units of the good in plant $j$, where

$p_1 = \tfrac{10}{100} V,\ p_2 = \tfrac{30}{100} V,\ p_3 = \tfrac{10}{100} V,\ p_4 = \tfrac{15}{100} V,\ p_5 = \tfrac{15}{100} V,\ p_6 = \tfrac{20}{100} V.$

The problem now is to find the location $t = (t_1, t_2)$ of the new plant such that the total transportation costs are minimized. For any fixed location $t$ the optimal transportation strategy is given by the solution of the transportation problem (a standard LP),

$P(t): \quad v(t) := \min_{y_{ij}} \sum_{j=1}^{5} \sum_i c_{ij}\, y_{ij} + \sum_i c_{i6}(t)\, y_{i6} \quad \text{s.t.} \quad \sum_i y_{ij} = p_j,\ j = 1, \ldots, 6, \qquad \sum_j y_{ij} = V_i,\ i = 1, \ldots, 99.$
Here $y_{ij}$ is the number of units of the good to be transported from plant $j$ to town $i$. The problem is now to find a (global or local) minimizer of $v(t)$. Most local minimization algorithms are based on the computation of the (negative) gradient of the objective function at the current point $t = t^k$, $d = -\nabla v(t^k)$.
Suppose that $y^k = (y_{ij}^k)$ is the vertex solution of $P(t^k)$. Then by the results of Section 4.2 the gradient can be computed via the formula

$\nabla v(t^k) = \nabla_t L(y^k, t^k, \lambda^k) = \sum_{i=1}^{99} y_{i6}^k\, \frac{1}{\sqrt{(t_1^k - \ell_1^i)^2 + (t_2^k - \ell_2^i)^2}} \begin{pmatrix} t_1^k - \ell_1^i \\ t_2^k - \ell_2^i \end{pmatrix}.$
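A small numerical check of this formula is sketched below (with synthetic data: 2 fixed plants and 4 towns instead of 5 and 99, random locations and demands, all invented). Generically the transportation LP has a nondegenerate vertex solution, and the gradient formula agrees with finite differences of $v$:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_towns, n_fixed = 4, 2                      # small synthetic instance
towns = rng.uniform(0, 10, (n_towns, 2))     # town locations l^i
plants = rng.uniform(0, 10, (n_fixed, 2))    # fixed plant locations s^j
V = rng.uniform(1, 5, n_towns)               # demands
p = np.array([0.4, 0.3, 0.3]) * V.sum()      # production, last plant is new

def value_and_flows(t):
    # Transportation LP P(t): variables y[i, j], i towns, j plants (new last)
    locs = np.vstack([plants, t])
    C = np.linalg.norm(towns[:, None, :] - locs[None, :, :], axis=2)
    n_p = n_fixed + 1
    A_eq, b_eq = [], []
    for j in range(n_p):                     # supply: sum_i y_ij = p_j
        row = np.zeros((n_towns, n_p)); row[:, j] = 1
        A_eq.append(row.ravel()); b_eq.append(p[j])
    for i in range(n_towns):                 # demand: sum_j y_ij = V_i
        row = np.zeros((n_towns, n_p)); row[i, :] = 1
        A_eq.append(row.ravel()); b_eq.append(V[i])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)]*(n_towns*n_p), method="highs")
    return res.fun, res.x.reshape(n_towns, n_p)

t0 = np.array([5.0, 5.0])
v0, y = value_and_flows(t0)
diff = t0[None, :] - towns                   # vectors t - l^i
grad = (y[:, -1] / np.linalg.norm(diff, axis=1)) @ diff   # formula above
h = 1e-5
fd = [(value_and_flows(t0 + h*e)[0] - value_and_flows(t0 - h*e)[0])/(2*h)
      for e in np.eye(2)]
print("gradient formula:", grad, " finite differences:", np.array(fd))
```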
6. PATHFOLLOWING IN PRACTICE
We briefly discuss how a solution curve $(x(t), t)$ of a one-parametric equation

$F(x, t) = 0, \quad F : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n,\ t \in \mathbb{R},$

can be followed numerically.
The basic idea is to use some sort of Newton procedure. Recall that the classical Newton method is the most fundamental approach for solving a system of $n$ equations
in $n$ unknowns:

$F(x) = 0, \quad F : \mathbb{R}^n \to \mathbb{R}^n.$

The famous Newton iteration for computing a solution is to start with some (appropriate) starting point $x_0$ and to iterate according to

$x_{k+1} = x_k - [\nabla F(x_k)]^{-1} F(x_k), \quad k = 0, 1, \ldots$

It is well-known that this iteration converges quadratically to a solution $\bar x$ of $F(x) = 0$ if
• $x_0$ is chosen close enough to $\bar x$, and
• $\nabla F(\bar x)$ is a non-singular matrix.
The simplest way to approximately follow a solution curve $x(t)$ of $F(x, t) = 0$, i.e., $F(x(t), t) = 0$, on an interval $t \in [a, b]$ is to discretize $[a, b]$ by

$t_\ell = a + \ell\, \frac{b-a}{N}, \quad \ell = 0, \ldots, N$

(for some $N \in \mathbb{N}$) and to compute for any $\ell = 0, \ldots, N$ a solution $x_\ell = x(t_\ell)$ of $F(x, t_\ell) = 0$ by a Newton iteration,

$x_\ell^{k+1} = x_\ell^k - [\nabla_x F(x_\ell^k, t_\ell)]^{-1} F(x_\ell^k, t_\ell), \quad k = 0, 1, \ldots,$

starting with $x_\ell^0 = x_{\ell-1} + \frac{b-a}{N}\, x'(t_{\ell-1})$ (for $\ell \ge 1$). The derivative $x'(t_{\ell-1})$ can be computed with the formula in Theorem 2. We refer the reader to the book [1] for details, e.g., on:
• How to perform pathfollowing efficiently?
• How to deal with branching points $(\bar x, \bar t)$ where different solution curves intersect?
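As a minimal sketch of such a procedure, the script below follows the branch $x_1(t) = t^2(1 + \sqrt{2})$ of equation (1) on $[1, 2]$ with the Euler predictor from Theorem 2 and a Newton corrector; the parameters ($N = 20$, 5 corrector steps) are ad hoc choices:

```python
import numpy as np

# Euler predictor / Newton corrector for F(x, t) = x^2 - 2 t^2 x - t^4 = 0,
# following the branch through (x, t) = (1 + sqrt(2), 1) over t in [1, 2].

def F(x, t):   return x**2 - 2*t**2*x - t**4
def Fx(x, t):  return 2*x - 2*t**2
def Ft(x, t):  return -4*t*x - 4*t**3

a, b, N = 1.0, 2.0, 20
h = (b - a) / N
x = 1 + np.sqrt(2)                      # known point on the curve at t = a
for ell in range(1, N + 1):
    t_prev, t = a + (ell-1)*h, a + ell*h
    x = x + h * (-Ft(x, t_prev) / Fx(x, t_prev))   # predictor (Theorem 2)
    for _ in range(5):                             # Newton corrector at t
        x = x - F(x, t) / Fx(x, t)
    exact = t**2 * (1 + np.sqrt(2))
    if ell % 5 == 0:
        print(f"t={t:.2f}  x={x:.8f}  error={abs(x - exact):.1e}")
```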
7. GENERAL PARAMETRIC PROGRAMMING
In the present section we analyze the parametric behavior under weaker assumptions, where the Implicit Function Theorem is no longer applicable. We try to keep the introduction of 'new' concepts to a minimum and to motivate the results by examples.
Consider again the parametric optimization problem

(18) $P(t): \quad \min_{x \in \mathbb{R}^n} f(x, t) \quad \text{s.t.} \quad x \in \mathcal{F}(t) := \{x \in \mathbb{R}^n \mid g_j(x, t) \le 0,\ j \in J\},$

depending on the parameter $t \in T$, where $T \subset \mathbb{R}^p$ is an open parameter set. Again $J := \{1, \ldots, m\}$. All functions $f, g_j$ are assumed to be (at least) continuous everywhere.
Notation: Let $v(t) = \min_{x \in \mathcal{F}(t)} f(x, t)$ denote the minimal value of $P(t)$ ($v(t) = \infty$ if $\mathcal{F}(t) = \emptyset$) and let $S(t)$ denote the set of (global) minimizers. The mappings $\mathcal{F} : T \rightrightarrows \mathbb{R}^n$ and $S : T \rightrightarrows \mathbb{R}^n$ are so-called set-valued mappings.
Problem of parametric optimization: How do the value function $v(t)$ and the mappings $\mathcal{F}(t)$, $S(t)$ change with $t$? (Continuously? Smoothly?)
Definition 1. Let $v : T \to \mathbb{R}_\infty$ be given, $\mathbb{R}_\infty = \mathbb{R} \cup \{-\infty, \infty\}$.
(a) The function $v$ is called upper semicontinuous (usc) at $\bar t \in T$ if for any $\varepsilon > 0$ there exists $\delta > 0$ such that

$v(t) \le v(\bar t) + \varepsilon \quad \text{for all } \|t - \bar t\| < \delta.$

(b) The function $v$ is called lower semicontinuous (lsc) at $\bar t \in T$ if for any $\varepsilon > 0$ there exists $\delta > 0$ such that

$v(t) \ge v(\bar t) - \varepsilon \quad \text{for all } \|t - \bar t\| < \delta.$
We shall see that the lower and upper semicontinuity of the value function $v(t)$ depend on different assumptions. Obviously, to assure the lower semicontinuity of $v$ at $\bar t$ the feasible set $\mathcal{F}(t)$ should not become essentially larger under a small perturbation $t$ of $\bar t$, and to assure the upper semicontinuity of $v$ the set $\mathcal{F}(t)$ should not become essentially smaller after a perturbation. To avoid an 'explosion' of $\mathcal{F}(t)$ we will need some compactness assumptions on $\mathcal{F}(t)$, and to prevent an 'implosion' a Constraint Qualification will be needed.
Definition 2. Let the set-valued mapping $\mathcal{F} : T \rightrightarrows \mathbb{R}^n$ be given.
(a) $\mathcal{F}$ is called closed at $\bar t \in T$ if for any sequences $t_l, x_l$, $l \in \mathbb{N}$, with $t_l \to \bar t$, $x_l \in \mathcal{F}(t_l)$, the condition $x_l \to \bar x$ implies $\bar x \in \mathcal{F}(\bar t)$.
(b) (no explosion of $\mathcal{F}(t)$ after perturbation of $t = \bar t$) $\mathcal{F}$ is called outer semicontinuous (osc) at $\bar t \in T$ if for any sequences $t_l, x_l$, $l \in \mathbb{N}$, with $t_l \to \bar t$, $x_l \in \mathcal{F}(t_l)$, there exist $\bar x_l \in \mathcal{F}(\bar t)$ such that $\|x_l - \bar x_l\| \to 0$ for $l \to \infty$.
(c) (no implosion of $\mathcal{F}(t)$ after perturbation of $t = \bar t$) $\mathcal{F}$ is inner semicontinuous (isc) at $\bar t \in T$ if for any $\bar x \in \mathcal{F}(\bar t)$ and sequence $t_l \to \bar t$ there exists a sequence $x_l$ such that $x_l \in \mathcal{F}(t_l)$ for $l$ large enough, and $\|x_l - \bar x\| \to 0$.
The mapping $\mathcal{F}$ is called continuous at $\bar t$ if it is both osc and isc at $\bar t$.
Ex. 8. Show for $\mathcal{F}(t) = \{x \in \mathbb{R}^n \mid g_j(x, t) \le 0,\ j \in J\}$, $t \in T$ ($g_j$ continuous), that the mapping $\mathcal{F} : T \rightrightarrows \mathbb{R}^n$ is closed on $T$.
Lower semicontinuity of v(t). To assure lower semicontinuity of $v$ at $\bar t$ we (minimally) need compactness of $\mathcal{F}(\bar t)$ (even in the case that $\mathcal{F}(t)$ behaves continuously).
Ex. 9. (Linear problem, $\mathcal{F}(t) \equiv \mathcal{F}$ constant, not bounded, and $v$ is not lsc.) For the problem $\min x_2 - t x_1$ s.t. $x_1 \ge 0$, $x_2 \ge 0$ we find

$v(t) = \begin{cases} 0 & \text{for } t \le 0 \\ -\infty & \text{for } t > 0 \end{cases} \qquad \text{and} \qquad S(t) = \begin{cases} \{(0, 0)\} & \text{for } t < 0 \\ \{(x_1, 0) \mid x_1 \ge 0\} & \text{for } t = 0 \\ \emptyset & \text{for } t > 0. \end{cases}$
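Numerically the jump is easy to observe (a sketch with scipy.optimize.linprog; the HiGHS backend reports status 3 for an unbounded LP):

```python
from scipy.optimize import linprog

# Ex. 9 numerically: min x2 - t*x1  s.t.  x1, x2 >= 0.
for t in [-0.5, 0.0, 1e-6]:
    res = linprog([-t, 1.0], bounds=[(0, None), (0, None)], method="highs")
    status = {0: "optimal", 3: "unbounded"}.get(res.status, res.status)
    print(f"t={t:+.1e}  status={status}  v(t)={res.fun}")
# v jumps from 0 to -infinity for any t > 0: v is not lsc at t = 0.
```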
We even need the following stronger condition.

LC. (local compactness of $\mathcal{F}$ at $\bar t$) There exist $\varepsilon > 0$ and a compact set $C_0$ such that

$\bigcup_{\|t - \bar t\| \le \varepsilon} \mathcal{F}(t) \subset C_0.$
Without this condition LC the lower semicontinuity of v is not assured in general.
Ex. 10. ($\mathcal{F}(\bar t), S(\bar t)$ compact, LC does not hold, $v$ is not lsc and $\mathcal{F}$ is not osc at $\bar t = 0$.) Consider the problem

$\min x_2 - x_1 \quad \text{s.t.} \quad x_2 \le t x_1 - \tfrac12, \quad x_2 \le -t x_1, \quad x_2 \ge \hat x_2(x_1),$

where $\hat x_2$ is piecewise linear with $\hat x_2(0) = -1$, $\hat x_2(x_1) = -\tfrac14$ for $|x_1| \ge 1$ (sketch the problem as an exercise). We find:

$v(t) = \begin{cases} -1 - \frac{1}{6 - 8t} & \text{for } t \le 0 \\ -\frac14 \left(1 + \frac1t\right) & \text{for } 0 < t \le 1/4 \end{cases} \qquad S(t) = \begin{cases} \left\{\left(\frac{4}{6 - 8t},\ \frac{3}{6 - 8t} - 1\right)\right\} & \text{for } t \le 0 \\ \left\{\frac14 \left(\frac1t,\ -1\right)\right\} & \text{for } 0 < t \le 1/4. \end{cases}$
The lower semicontinuity of $v$ depends on the following technical outer semicontinuity and compactness condition, saying that for $t$ near $\bar t$ at least one minimizer $x_t \in S(t)$ can be approached by elements in a compact subset of $\mathcal{F}(\bar t)$.

AL. There exist a neighborhood $U_t(\bar t)$ of $\bar t$ and a compact set $C \subset \mathbb{R}^n$ such that for all $t \in U_t(\bar t)$ with $\mathcal{F}(t) \ne \emptyset$ there are some $x_t \in S(t)$ and some $\bar x_t \in \mathcal{F}(\bar t) \cap C$ satisfying $\|x_t - \bar x_t\| \to 0$ for $t \to \bar t$.
Lemma 3.
(a) Let AL be satisfied at $\bar t$. Then $v$ is lsc at $\bar t$.
(b) Let the local compactness condition LC be fulfilled at $\bar t$. Then AL is satisfied (i.e., $v$ is lsc at $\bar t$).

Proof. (a) Choose any $t \in U_t(\bar t)$. For $\mathcal{F}(t) = \emptyset$ it follows $\infty = v(t) \ge v(\bar t)$. In case $\mathcal{F}(t) \ne \emptyset$ let $x_t \in S(t)$, $\bar x_t \in \mathcal{F}(\bar t)$ satisfy the conditions in AL. Then by the uniform continuity of $f$ on $\mathrm{cl}\, U_t(\bar t) \times C^\varepsilon$ ($C^\varepsilon$ a compact set containing $C$ and the points $x_t$ for $t$ near $\bar t$) we find for $t \to \bar t$:

$v(t) - v(\bar t) \ge f(x_t, t) - f(\bar x_t, \bar t) \to 0.$

(b) Note that by LC, for $\|t - \bar t\| < \varepsilon$ and $\mathcal{F}(t) \ne \emptyset$ the feasible set is contained in a compact set. So $S(t) \ne \emptyset$ and we can choose elements $x_t \in S(t)$. Now assume to the contrary that AL is not fulfilled. Then there exist a sequence $t_l \to \bar t$ and $\varepsilon > 0$ such that for any $x_{t_l} \in S(t_l)$

(19) $\|x_{t_l} - x\| > \varepsilon \quad \text{for all} \quad x \in \mathcal{F}(\bar t).$

Since by LC the vectors $x_{t_l}$ are contained in the compact set $C_0$, by taking a subsequence we can assume $x_{t_l} \to \tilde x$. By closedness of the mapping $\mathcal{F}$ (see Ex. 8) it follows $\tilde x \in \mathcal{F}(\bar t)$, contradicting (19). □
Ex. 11. Let the local compactness condition LC be fulfilled at $\bar t$. Then the mapping $\mathcal{F}$ is osc at $\bar t$.

Proof. Assume that $\mathcal{F}$ is not osc at $\bar t$, i.e., there exist sequences $t_l, x_l$, $t_l \to \bar t$, $x_l \in \mathcal{F}(t_l)$, and $\varepsilon > 0$ such that for all $l$

(20) $\|x_l - x\| \ge \varepsilon \quad \text{for all} \quad x \in \mathcal{F}(\bar t).$

But the assumption LC implies that the $x_l$ lie in the compact set $C_0$, so there exists a convergent subsequence $x_l \to \tilde x$. By closedness of $\mathcal{F}$ it follows $\tilde x \in \mathcal{F}(\bar t)$, contradicting (20). □
The convex case. By Ex. 10, in the general (non-convex) case the condition $\emptyset \ne \mathcal{F}(\bar t)$ compact is not sufficient to assure the outer semicontinuity of $\mathcal{F}$, and the condition $\emptyset \ne S(\bar t)$ compact does not imply the lower semicontinuity of $v$. Under the following convexity assumptions the situation changes.
Recall that a function $f(x)$, $f : \mathbb{R}^n \to \mathbb{R}$, is called convex if for any $x_1, x_2 \in \mathbb{R}^n$ and $\tau \in [0, 1]$ it follows that

$f((1 - \tau) x_1 + \tau x_2) \le (1 - \tau) f(x_1) + \tau f(x_2).$

AC. For each (fixed) $t \in T$ the functions $g_j(x, t)$ are convex in $x$ for all $j \in J$.
AC'. In addition to AC, for each (fixed) $t \in T$ the function $f(x, t)$ is convex in $x$ (i.e., the problems $P(t)$ are convex programs).
Lemma 4. Let the convexity condition AC hold and assume $\emptyset \ne \mathcal{F}(\bar t)$ is bounded (compact). Then the local compactness condition LC is satisfied. In particular $\mathcal{F}$ is osc (cf. Ex. 11) and $v$ is lsc at $\bar t$ (cf. Lemma 3).

Proof. Assume to the contrary that LC does not hold, i.e., there exist $t_l \to \bar t$, $x_l \in \mathcal{F}(t_l)$ with $\|x_l\| \to \infty$. Since $\mathcal{F}(\bar t)$ is bounded we can choose some $\rho > 0$ satisfying

(21) $\|x\| < \rho \quad \text{for all} \quad x \in \mathcal{F}(\bar t).$

Given some $\bar x \in \mathcal{F}(\bar t)$, determine $0 < \tau_l < 1$ such that $\tilde x_l := \bar x + \tau_l (x_l - \bar x)$ satisfies $\|\tilde x_l\| = \rho$. Selecting a subsequence we can assume $\tilde x_l \to \tilde x$ with $\|\tilde x\| = \rho$. By convexity of $g_j$, using $x_l \in \mathcal{F}(t_l)$, we find for all $j \in J$

$g_j(\tilde x_l, t_l) \le (1 - \tau_l)\, g_j(\bar x, t_l) + \tau_l\, g_j(x_l, t_l) \le g_j(\bar x, t_l)$

and then

$g_j(\tilde x, \bar t) = \lim_{l \to \infty} g_j(\tilde x_l, t_l) \le \lim_{l \to \infty} g_j(\bar x, t_l) = g_j(\bar x, \bar t) \le 0.$

So $\tilde x \in \mathcal{F}(\bar t)$ and $\|\tilde x\| = \rho$, in contradiction to (21). □
In the convex case even the boundedness of $S(\bar t)$ is sufficient to assure that $v$ is lsc at $\bar t$.

Theorem 10. Let the convexity assumption AC' hold and let $S(\bar t)$ be nonempty and bounded (compact). Then $v$ is lsc at $\bar t$.
Proof. Consider the lower level set $\tilde{\mathcal{F}}(t) := \{x \in \mathcal{F}(t) \mid f(x, t) \le v(\bar t)\}$. Then the set $\tilde{\mathcal{F}}(\bar t) = S(\bar t)$ is nonempty and bounded, and the constraint set $\tilde{\mathcal{F}}(\bar t)$ satisfies the assumptions of Lemma 4. So the mapping $\tilde{\mathcal{F}}$ has the local boundedness property, i.e., with some compact set $\tilde C_0$ and $\varepsilon_0 > 0$,

$\bigcup_{\|t - \bar t\| \le \varepsilon_0} \tilde{\mathcal{F}}(t) \subset \tilde C_0.$

Consider now any $t$, $\|t - \bar t\| \le \varepsilon_0$. If $\tilde{\mathcal{F}}(t) = \emptyset$ it directly follows that $v(t) \ge v(\bar t)$. In case $\tilde{\mathcal{F}}(t) \ne \emptyset$ the set $\tilde{\mathcal{F}}(t)$ is a (nonempty) compact lower level set for $P(t)$. This means that the problem

$\tilde P(t): \quad \min_x f(x, t) \quad \text{s.t.} \quad x \in \tilde{\mathcal{F}}(t)$

with value $\tilde v(t)$ has a nonempty solution set $\tilde S(t)$ and $S(t) = \tilde S(t)$, implying $v(t) = \tilde v(t)$. Since by Lemma 4 the condition LC holds for $\tilde{\mathcal{F}}$ at $\bar t$, the function $\tilde v(t)$ and thus $v(t)$ is lsc at $\bar t$ (Lemma 3). □
The upper semicontinuity of v(t). The upper semicontinuity of $v$ depends on a different assumption. The local compactness condition LC is not sufficient (and not necessary).

Ex. 12. For the problem $\min x_1$ s.t. $\sqrt{x_1^2 + x_2^2} \le -t$, which obviously satisfies LC, we find

$\mathcal{F}(t) = \begin{cases} \{x \mid \sqrt{x_1^2 + x_2^2} \le |t|\} & \text{for } t < 0 \\ \{(0, 0)\} & \text{for } t = 0 \\ \emptyset & \text{for } t > 0 \end{cases} \qquad \text{and} \qquad v(t) = \begin{cases} t & \text{for } t \le 0 \\ \infty & \text{for } t > 0. \end{cases}$
The problem here is that the mapping $\mathcal{F}$ is not isc at $\bar t = 0$. To assure the upper semicontinuity of $v$ at $\bar t$ we only need an inner semicontinuity condition at one point $\bar x \in S(\bar t)$.

AU. There exist a minimizer $\bar x \in S(\bar t)$ and a neighborhood $U_t(\bar t)$ of $\bar t$ such that for all $t \in U_t(\bar t)$ there is some $x_t \in \mathcal{F}(t)$ satisfying $\|x_t - \bar x\| \to 0$ for $t \to \bar t$.
Lemma 5. Let AU be fulfilled at $\bar t$. Then $v$ is usc at $\bar t$.

Proof. Consider any sequence $t_l \to \bar t$ and the corresponding points $x_l$, $\|x_l - \bar x\| \to 0$, according to AU. By continuity, for any $\varepsilon > 0$ the inequality $|f(x_l, t_l) - f(\bar x, \bar t)| < \varepsilon$ holds for $l$ large enough. So

$v(t_l) \le f(x_l, t_l) \le f(\bar x, \bar t) + \varepsilon = v(\bar t) + \varepsilon,$

proving that $v$ is usc. □
A natural condition to enforce the assumption AU is a so-called Constraint Qualification (CQ).

Definition 3. The Constraint Qualification CQ is said to hold at $(\bar x, \bar t)$ with $\bar x \in \mathcal{F}(\bar t)$ if there is a sequence $x_k \to \bar x$ such that

$g_j(x_k, \bar t) < 0 \quad \text{holds for all } j \in J.$
Corollary 1. Let CQ be satisfied at $(\bar x, \bar t)$ for (at least) one point $\bar x \in S(\bar t)$. Then the condition AU is satisfied, i.e., $v$ is usc at $\bar t$.

Proof. Assume to the contrary that AU is not satisfied for $(\bar x, \bar t)$. This means that there exist a sequence $t_l \to \bar t$ and $\varepsilon > 0$ such that for all $l$ (large)

(22) $\|x_l - \bar x\| > \varepsilon \quad \text{for all} \quad x_l \in \mathcal{F}(t_l).$

By CQ there is a sequence $x_k \to \bar x$ with $g_j(x_k, \bar t) < 0$ for $j \in J$. Choosing $k_0$ large such that $\|x_{k_0} - \bar x\| < \varepsilon$, by continuity of $g_j$ and using $g_j(x_{k_0}, \bar t) < 0$, we find $g_j(x_{k_0}, t_l) < 0$ for $l$ large enough and thus $x_{k_0} \in \mathcal{F}(t_l)$, in contradiction to (22). □
Ex. 13. Let CQ hold at $(\bar x, \bar t)$, $\bar x \in \mathcal{F}(\bar t)$. Then there is a neighborhood $U_t(\bar t)$ of $\bar t$ such that $\mathcal{F}(t)$ is nonempty for all $t \in U_t(\bar t)$. If CQ holds at $(x, \bar t)$ for each point $x \in \mathcal{F}(\bar t)$ then $\mathcal{F}$ is isc at $\bar t$.

Proof. Choosing for $\bar x$ any element $x \in \mathcal{F}(\bar t)$, by Corollary 1 the assumption AU holds at $(x, \bar t)$, and the condition for inner semicontinuity of $\mathcal{F}$ at $\bar t$ is met. □
The behavior of S(t). Let us briefly study the continuity properties of the mapping $S(t)$.
Obviously the inner semicontinuity of $S$ is stronger than the condition AU. So if $S$ is isc then the function $v$ is usc (see Lemma 5). However, the inner semicontinuity of $S(t)$ is a very 'strong' condition. Even if the local compactness condition LC and the Constraint Qualification hold, it need not be satisfied. We give an example.
Ex. 14. (LC and CQ hold but $S$ is not isc.) $\min x_2 - t x_1$ s.t. $|x_1| \le 1$, $|x_2| \le 1$. Then near $\bar t = 0$ we obtain

$S(t) = \begin{cases} \{(-1, -1)\} & \text{for } t < 0 \\ \{(x_1, -1) \mid |x_1| \le 1\} & \text{for } t = 0 \\ \{(1, -1)\} & \text{for } t > 0 \end{cases} \qquad \text{and} \qquad v(t) = -1 - |t|.$
We now discuss the closedness and the outer semicontinuity of S.
Ex. 15. Let CQ be satisfied at $(\bar x, \bar t)$ for (at least) one point $\bar x \in S(\bar t)$. Then $S$ is closed at $\bar t$.

Proof. Assume to the contrary that $S$ is not closed. This means that there are sequences $t_l \to \bar t$, $x_l \to \hat x$ with $x_l \in S(t_l)$ but $\hat x \notin S(\bar t)$. By closedness of $\mathcal{F}$ it follows $\hat x \in \mathcal{F}(\bar t)$, and $\hat x \notin S(\bar t)$ yields

$v(\bar t) < f(\hat x, \bar t) = \lim_{l \to \infty} f(x_l, t_l) = \lim_{l \to \infty} v(t_l).$

So $v$ is not usc at $\bar t$, in contradiction to Corollary 1. □
Ex. 16. Let $S(\bar t)$ be compact, assume $S(t) \ne \emptyset$ for all $t$ in a neighborhood $U_t(\bar t)$ of $\bar t$, and let $S$ be osc at $\bar t$. Then AL holds, implying that $v$ is lsc at $\bar t$ (see Lemma 3).
Lemma 6. Let LC be satisfied at $\bar t$ and let CQ hold at $(\bar x, \bar t)$ for (at least) one $\bar x \in S(\bar t)$. Then there exists a neighborhood $U_t(\bar t)$ of $\bar t$ such that for all $t \in U_t(\bar t)$ the set $S(t)$ is nonempty and compact. Moreover the mapping $S$ is osc at $\bar t$.

Proof. By LC and CQ, in some neighborhood $U_t(\bar t)$ of $\bar t$ the sets $\mathcal{F}(t)$ are nonempty (cf. Ex. 13) and compact. So $S(t) \ne \emptyset$ (and compact) for $t \in U_t(\bar t)$. Assume now that $S$ is not osc at $\bar t$, i.e., there exist $t_l \to \bar t$, $x_l \in S(t_l)$, $\varepsilon > 0$ such that

(23) $\|x_l - x\| > \varepsilon \quad \text{for all} \quad x \in S(\bar t).$

By LC (taking a subsequence if necessary) we can assume $x_l \to \tilde x \in \mathcal{F}(\bar t)$ (cf. Ex. 11). By Ex. 15 it follows $\tilde x \in S(\bar t)$, contradicting (23). □
REFERENCES
[1] Allgower E.L., Georg K., Numerical Continuation Methods, Springer-Verlag, Berlin (1990).
[2] Bank B., Guddat J., Klatte D., Kummer B., Tammer K., Non-linear Parametric Optimization, Birkhäuser Verlag, Basel (1983).
[3] Bonnans J.F., Shapiro A., Perturbation Analysis of Optimization Problems, Springer Series in Operations Research, Springer-Verlag, New York (2000).
[4] Faigle U., Kern W., Still G., Algorithmic Principles of Mathematical Programming, Kluwer, Dordrecht (2002).
[5] Guddat J., Guerra F., Jongen H.Th., Parametric Optimization: Singularities, Pathfollowing and Jumps, Teubner and John Wiley, Chichester (1990).
[6] Hettich R., Parametric Optimization: Applications and Computational Methods, Preprint, University of Trier (1985).
[7] Rudin W., Principles of Mathematical Analysis, third edition, McGraw-Hill (1976).
[8] Stein O., Parametrische Optimierung, FernUniversität Hagen (2004).