Mathematical Programming 74 (1996) 141 - 157
Exact and inexact penalty methods
for the generalized bilevel programming problem
P. Marcotte a,*, D.L. Zhu b
a Département d'informatique et de recherche opérationnelle, Université de Montréal, C.P. 6128,
Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7
b Centre de recherche sur les transports, Université de Montréal, Montréal, Québec, Canada H3C 3J7
Received 5 February 1993; revised manuscript received 12 October 1995
Abstract
We consider a hierarchical system where a leader incorporates into its strategy the reaction of
the follower to its decision. The follower's reaction is quite generally represented as the solution
set to a monotone variational inequality. For the solution of this nonconvex mathematical program
a penalty approach is proposed, based on the formulation of the lower level variational inequality
as a mathematical program. Under natural regularity conditions, we prove the exactness of a
certain penalty function, and give strong necessary optimality conditions for a class of generalized
bilevel programs.
Keywords: Optimization; Variational inequalities; Bilevel programming; Exact penalty; Descent methods
1. Introduction
We consider a two-level hierarchical system in Euclidean space where the upper level
decision maker (hereafter the leader) controls a vector of variables x, and the lower
level (hereafter the follower) controls a vector of variables y. The leader makes its
decision first, taking into account the reaction of the follower to its course of action. The
reaction y(x) of the follower to the leader's decision x is a solution to an equilibrium
problem represented by a variational inequality. We then obtain the generalized bilevel
program, or GBLP:
$$\text{GBLP:}\quad \min_{x \in X,\; y \in Y(x)} f(x, y) \quad \text{subject to} \quad \langle F(x, y),\, y - z \rangle \le 0 \quad \text{for all } z \in Y(x), \tag{1}$$
* Corresponding author.
0025-5610 © 1996 - The Mathematical Programming Society, Inc. All rights reserved
SSDI 0025-5610(95)00057-7
where the leader is implicitly restricted to x-vectors such that the lower level constraint
set Y(x) be nonempty. This mathematical program was first introduced by Marcotte [18]
for formulating an equilibrium-constrained network design problem.
The term "follower" may denote either one, several or infinitely many "players". If
it denotes a single player whose objective g(x, y) is convex then the corresponding
standard bilevel program (or Stackelberg game) can be put in the form GBLP, with
$F = \nabla_y g$. If it denotes more than one player, then F represents the cost mapping
associated with the corresponding equilibrium solution concept: Nash equilibrium,
perfect equilibrium, etc.
This difficult, nonconvex problem has drawn the attention of the mathematical
community, both from the theoretical and computational points of view. (See the special
issues of the journals Computers and Operations Research [5] and Annals of Operations
Research [1] devoted to this topic.) The present paper studies the properties of a class of
penalty functions, both exact and inexact, for the GBLP. The idea is not new. Ishizuka
and Aiyoshi [22] developed a double penalty scheme for the nonlinear bilevel programming
problem, where the lower level problem (an optimization problem in their case) is
transformed, via a quadratic penalty, into an unconstrained optimization problem or,
equivalently, into a system of nonlinear equations. In a second stage, this nonlinear
system is appended, again via a penalty term, to the upper level objective, to yield an
unconstrained global optimization problem. Similarly, Haslinger and Neittaanmäki
proposed (see [13, Chapter 10]) to penalize the state constraints associated with the
lower level of their optimal shape design problem. Other schemes have also been
proposed in the realm of linear bilevel programming (see [3]).
In this paper, we formulate the lower level variational inequality as a parameterized
equation related to the duality gap of the lower level problem. We then use this "gap"
function as a penalty term for the upper level problem. Since the gap function
characterizing the lower level is nonnegative over the feasible domain, the penalty term
assumes a very simple form. In the separable case, our formulation coincides with that
found in a frequently overlooked paper of Fisk [8], who also penalizes the "standard"
gap function. However, the exactness of this penalty scheme was not proved by its
author.
The remainder of the paper is organized as follows: We first formulate the problem;
then we prove convergence results for an inexact penalty scheme as well as the
exactness of penalty schemes based on some linear gap function, both under separable
and nonseparable constraints; finally we show how our results can be used to derive
necessary optimality conditions for the GBLP.
2. Formulation of the problem
Let C denote a continuously differentiable function from $\mathbb{R}^{n_1+n_2}$ into $\mathbb{R}^m$ and denote:
$$S = \{(x, y) \in \mathbb{R}^{n_1+n_2} \mid C(x, y) \le 0\},$$
$$Y(x) = \{y \in \mathbb{R}^{n_2} \mid C(x, y) \le 0\} = \{y \in \mathbb{R}^{n_2} \mid (x, y) \in S\},$$
$$X = \{x \in \mathbb{R}^{n_1} \mid Y(x) \ne \emptyset\}.$$
Let f be a continuously differentiable function from S into $\mathbb{R}$ and F a continuously
differentiable, Lipschitz continuous (with Lipschitz constant L) function from S into
$\mathbb{R}^{n_2}$, monotone on Y(x) with respect to y for every x in X. In this paper we will
consider the generalized bilevel program:
$$\min_{x \in X,\; y \in Y(x)} f(x, y) \quad \text{subject to} \quad \langle F(x, y),\, y - z \rangle \le 0, \quad \forall z \in Y(x). \tag{2}$$
We will assume throughout the paper that the set S is convex and compact and that, for
fixed x in X, the parameterized set Y(x) is convex. Under those assumptions, the
GBLP is well defined, possesses at least one solution, and the solution set to the lower
level variational inequality is, for fixed x in X, a compact, convex set. We will also
make use of the following definitions.
Definition 1. (Dussault and Marcotte [7]) Let F be a continuous, monotone function
from a convex polyhedron $X \subset \mathbb{R}^n$ into $\mathbb{R}^n$ and denote by VIP(F, X) the variational
inequality problem associated with X and F, i.e., find $x^*$ in X such that
$$\text{VIP}(F, X):\quad \langle F(x^*),\, x^* - x \rangle \le 0, \quad \text{for all } x \in X.$$
We say that VIP(F, X) is geometrically stable if, for any solution $x^*$ of the variational
inequality, $\langle F(x^*), x^* - x \rangle = 0$ implies that x lies on the optimal face, i.e. the
minimal face of X containing the (convex) solution set to VIP(F, X).
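Definition 2. Let $\Omega$ be a point-to-set mapping from $X_1$ into $2^{X_2}$. Let $\{x_k\}$ be a
sequence of points in $X_1$. The mapping $\Omega$ is closed at $x \in X_1$ if
$$x_k \in X_1,\quad x_k \to x,\quad y_k \in \Omega(x_k),\quad y_k \to y \quad \Longrightarrow \quad y \in \Omega(x).$$
The mapping $\Omega$ is open at $x \in X_1$ if
$$x_k \to x,\quad y \in \Omega(x) \quad \Longrightarrow \quad \exists \{y_k\} \text{ such that } y_k \in \Omega(x_k),\quad y_k \to y.$$
If the mapping $\Omega$ is both closed and open at $x \in X_1$, we say that it is continuous at x.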
Definition 3. The point-to-set mapping $\Omega$ is uniformly compact at $\bar{x}$ if there exists a
nonempty neighborhood N of $\bar{x}$ such that the closure of $\bigcup_{x \in N} \Omega(x)$ is compact.
We associate with the lower level variational inequality VIP(F(x, ·), Y(x)) the gap
function
$$G_\alpha(x, y) = \max_{z \in Y(x)} \phi_\alpha(x, y, z), \tag{3}$$
where $\phi_\alpha(x, y, z) = \langle F(x, y), y - z \rangle - \tfrac{1}{2}\alpha \|y - z\|^2$ and $\alpha$ is a nonnegative number.
The gap function has been used to construct descent methods for solving variational
inequalities. We refer in particular to [18,19] for the linear gap function ($\alpha = 0$), and to
[10] for the quadratic, differentiable gap function ($\alpha > 0$).
The function $\phi_\alpha$ is concave in z (strongly concave if $\alpha$ is positive). Also, since $G_\alpha$ is
nonnegative over Y(x), and $G_\alpha(x, y) = 0$ if and only if y is a solution to the lower level
variational inequality parameterized by x, VIP(F(x, ·), Y(x)) can be rewritten as the
nonlinear equation
$$G_\alpha(x, y) = \max_{z \in Y(x)} \phi_\alpha(x, y, z) = \phi_\alpha(x, y, p_\alpha(x, y)) = 0, \tag{4}$$
where $p_\alpha(x, y)$ is any solution of (3). Finally this leads to a reformulation of the GBLP
as the standard mathematical program
$$\text{P1:}\quad \min_{x \in X,\; y \in Y(x)} f(x, y) \quad \text{subject to} \quad G_\alpha(x, y) = 0. \tag{5}$$
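As a small illustration of definition (3), the following Python sketch evaluates the gap function for a one-dimensional lower level whose feasible set is an interval. It is only a sketch under stated assumptions: the function names `gap` and `F` are hypothetical, and the mapping F(x, y) = 2y on Y = [-1, 1] is the lower level used in Example 1. Because the inner function is concave in z, the maximizer over an interval can be written in closed form.

```python
def gap(F_val, y, lo, hi, alpha):
    """G_alpha for a one-dimensional lower level with Y = [lo, hi].

    phi(z) = F_val * (y - z) - 0.5 * alpha * (y - z)**2 is concave in z,
    so its maximizer over [lo, hi] is available in closed form.
    """
    if alpha > 0.0:
        # Unconstrained maximizer is z = y - F_val / alpha; clip it to Y.
        z = min(max(y - F_val / alpha, lo), hi)
    else:
        # alpha = 0: phi is linear in z, so an endpoint of Y is optimal.
        z = lo if F_val >= 0.0 else hi
    return F_val * (y - z) - 0.5 * alpha * (y - z) ** 2


def F(x, y):
    # Illustrative lower-level mapping (assumption: the one of Example 1).
    return 2.0 * y


for y in (-0.5, 0.0, 0.3):
    print(y, gap(F(0.0, y), y, -1.0, 1.0, 1.0), gap(F(0.0, y), y, -1.0, 1.0, 0.0))
```

In this instance the gap vanishes only at y = 0, the solution of the lower level variational inequality, which is the property that equation (4) relies on.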
3. An inexact penalty approach
In this section we will approximate P1 by the penalized problem
$$\text{P2:}\quad \min_{(x, y) \in S} q_\alpha(x, y, \mu) = f(x, y) + \mu G_\alpha(x, y), \tag{6}$$
where $\mu$ is a positive number. Problem P2 is nonlinear with convex constraints, and its
objective function is continuously differentiable when $\alpha$ is positive. For each value of
the weight $\mu$ we denote by $(x(\mu), y(\mu))$ a global optimum of P2. A penalty function
algorithm is obtained by specifying a sequence of increasing (unbounded) weights $\{\mu_k\}$
and the associated sequence of iterates $\{(x(\mu_k), y(\mu_k))\}$. The main convergence result
of this section relies on the following lemma.
Lemma 1. Let $\{(x_k = x(\mu_k),\, y_k = y(\mu_k))\}$ be a sequence generated by a penalty
algorithm based on the penalty function P2. Let $(x^*, y^*)$ denote an optimal solution to
the GBLP. Set $f^* = f(x^*, y^*)$. Then we have the relationships
$$q_\alpha(x_k, y_k, \mu_k) \le q_\alpha(x_{k+1}, y_{k+1}, \mu_{k+1}), \tag{7}$$
$$G_\alpha(x_{k+1}, y_{k+1}) \le G_\alpha(x_k, y_k), \tag{8}$$
$$f(x_k, y_k) \le f(x_{k+1}, y_{k+1}), \tag{9}$$
$$q_\alpha(x_k, y_k, \mu_k) \le f^*. \tag{10}$$
Proof. Since $x_k, x_{k+1} \in X$, $y_k \in Y(x_k)$ and $y_{k+1} \in Y(x_{k+1})$, we have
$$f(x_{k+1}, y_{k+1}) + \mu_{k+1} G_\alpha(x_{k+1}, y_{k+1}) \le f(x_k, y_k) + \mu_{k+1} G_\alpha(x_k, y_k)$$
and
$$\begin{aligned}
q_\alpha(x_{k+1}, y_{k+1}, \mu_{k+1}) &= f(x_{k+1}, y_{k+1}) + \mu_{k+1} G_\alpha(x_{k+1}, y_{k+1}) \\
&\ge f(x_{k+1}, y_{k+1}) + \mu_k G_\alpha(x_{k+1}, y_{k+1}) \\
&\ge f(x_k, y_k) + \mu_k G_\alpha(x_k, y_k), \quad \text{since } (x_k, y_k) \text{ is optimal,} \\
&= q_\alpha(x_k, y_k, \mu_k),
\end{aligned}$$
and this proves (7). Also, since $x_k$ and $x_{k+1}$ are optimal for their respective penalized
programs, we can write
$$f(x_k, y_k) + \mu_k G_\alpha(x_k, y_k) \le f(x_{k+1}, y_{k+1}) + \mu_k G_\alpha(x_{k+1}, y_{k+1}).$$
Combining the above inequalities yields
$$(\mu_{k+1} - \mu_k)\, G_\alpha(x_{k+1}, y_{k+1}) \le (\mu_{k+1} - \mu_k)\, G_\alpha(x_k, y_k),$$
which leads to (8), after dividing by the positive number $\mu_{k+1} - \mu_k$. Next the inequality
$$f(x_{k+1}, y_{k+1}) + \mu_k G_\alpha(x_{k+1}, y_{k+1}) \ge f(x_k, y_k) + \mu_k G_\alpha(x_k, y_k),$$
combined with (8), yields (9). Finally, observe that
$$\begin{aligned}
f^* &= f(x^*, y^*) + \mu_k G_\alpha(x^*, y^*) \\
&\ge f(x_k, y_k) + \mu_k G_\alpha(x_k, y_k) \\
&\ge f(x_k, y_k).
\end{aligned}$$
[]
Proposition 1. Let $\{(x_k, y_k)\}$ be a sequence generated by a penalty algorithm based on
the penalty function P2. Any limit point of the sequence $\{(x_k, y_k)\}$ is a solution of the
bilevel program P1.
Proof. The set S being compact, there exists a convergent subsequence $\{(x_k, y_k)\}_{k \in K}$.
Let $(\bar{x}, \bar{y})$ be its limit point. It follows from the continuity of f that
$$\lim_{k \in K} f(x_k, y_k) = f(\bar{x}, \bar{y}). \tag{11}$$
Let $f^*$ be the optimal value of the GBLP. Lemma 1 implies that
$$\bar{q} = \lim_{k \in K} q_\alpha(x_k, y_k, \mu_k) \le f^*. \tag{12}$$
Subtracting (11) from (12), we obtain
$$\lim_{k \in K} \mu_k G_\alpha(x_k, y_k) = \bar{q} - f(\bar{x}, \bar{y}) \le f^* - f(\bar{x}, \bar{y}) \le 0. \tag{13}$$
Since $G_\alpha$ is nonnegative, Eq. (13) implies that
$$\lim_{k \in K} G_\alpha(x_k, y_k) = 0 \tag{14}$$
and, using the continuity of $G_\alpha$, $G_\alpha(\bar{x}, \bar{y}) = 0$, i.e., that $(\bar{x}, \bar{y})$ is feasible for problem
P1. Furthermore, the inequality (10) in Lemma 1 allows us to write
$$f(\bar{x}, \bar{y}) = \lim_{k \in K} f(x_k, y_k) \le f^* \tag{15}$$
and this shows that $(\bar{x}, \bar{y})$ is an optimal solution of P1 or, equivalently, of the GBLP.
[]
Example 1. Consider the bilevel problem
$$\min_{x \in [-1, 1],\; y \in [-1, 1]} x^2 - y \quad \text{subject to} \quad \langle 2y,\, y - z \rangle \le 0, \quad \text{for all } z \in [-1, 1]. \tag{16}$$
Clearly the point (0, 0) is the unique solution of the above GBLP. For $\alpha = 1$, the gap
function associated with the lower-level variational inequality takes the form
$$G_1(x, y) = \max_{-1 \le z \le 1} \langle 2y,\, y - z \rangle - \tfrac{1}{2}\|y - z\|^2 = 2y^2 \quad \text{if } -1 \le y \le 1.$$
The corresponding penalized problem P2 is
$$\min_{-1 \le x \le 1,\; -1 \le y \le 1} x^2 - y + 2\mu y^2 \tag{17}$$
and its optimal solution is $(0, 1/(4\mu))$, which converges to (0, 0) as $\mu$ goes to infinity.
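The penalty algorithm of this section can be traced numerically on Example 1. The sketch below is illustrative only: the helper names (`argmin_convex`, `solve_penalized`) and the particular weight sequence are assumptions, and the ternary search is used simply because the penalized objective (17) is separable and convex in each variable. It prints the iterates next to the closed-form value 1/(4μ), together with the quantities f and G that appear in Lemma 1.

```python
def argmin_convex(h, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) <= h(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def solve_penalized(mu):
    """Global minimizer of x^2 - y + 2*mu*y^2 over [-1, 1]^2, i.e. problem (17)."""
    # The objective is separable, so each variable is minimized on its own.
    x = argmin_convex(lambda t: t * t, -1.0, 1.0)
    y = argmin_convex(lambda t: -t + 2.0 * mu * t * t, -1.0, 1.0)
    return x, y


def f(x, y):
    # Upper level objective of Example 1.
    return x * x - y


def G1(y):
    # Gap function of Example 1 for alpha = 1.
    return 2.0 * y * y


# Assumed increasing, unbounded weight sequence (truncated for illustration).
weights = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
iterates = [solve_penalized(mu) for mu in weights]
for mu, (x, y) in zip(weights, iterates):
    print(f"mu={mu:5.1f}  x={x:+.6f}  y={y:.6f}  1/(4mu)={1 / (4 * mu):.6f}  "
          f"f={f(x, y):+.6f}  G={G1(y):.6f}")
```

On this sequence the gap values decrease, the upper level objective increases, and the penalized value stays below the optimal value f* = 0, in line with relationships (7) to (10); no finite weight yields y = 0 exactly.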
4. An exact penalty function for the GBLP: the separable case
In the example of the previous section, there is no finite value of the penalty
parameter $\mu$ for which the solution to the penalized problem agrees with the optimal
solution of the GBLP. We will show that the exactness property can be achieved with
the "classical" gap function $G_0$, introduced by Hearn [14] in an optimization setting,
and corresponding to the choice $\alpha = 0$ in P1. In this section we focus on the case where
the set S is separable and refer to the corresponding problem as P3:
$$\text{P3:}\quad \min_{x \in X,\; y \in Y} f(x, y) \quad \text{subject to} \quad \langle F(x, y),\, y - z \rangle \le 0, \quad \forall z \in Y. \tag{18}$$
The gap formulation of the GBLP is now
$$\text{P4:}\quad \min_{x \in X,\; y \in Y} f(x, y) \quad \text{subject to} \quad G_0(x, y) = 0 \tag{19}$$
and its penalized equivalent is
$$\text{P5:}\quad \min_{x \in X,\; y \in Y} f(x, y) + \mu G_0(x, y). \tag{20}$$
Note that the global convergence result (Proposition 1) of the previous section holds for
$\alpha = 0$, i.e. P5.
Example 2. Consider the bilevel problem of Example 1. The gap function $G_0$ of the
lower level problem is
$$G_0(x, y) = \max_{-1 \le z \le 1} \langle 2y,\, y - z \rangle = 2y^2 + 2|y| \tag{21}$$
and the corresponding formulation P5 is
$$\min_{-1 \le x \le 1,\; -1 \le y \le 1} x^2 - y + 2\mu (y^2 + |y|). \tag{22}$$
For all values of $\mu$ larger than 1/2, the optimal solution of the above penalized problem
is the optimal solution of the GBLP, namely: $(x^*, y^*) = (0, 0)$.
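The threshold in Example 2 can be checked by brute force. The following Python sketch is an illustration under simple assumptions: the function names are hypothetical, only the y-part of (22) is minimized (the x-part is minimized at x = 0), and a uniform grid that contains y = 0 replaces a proper solver.

```python
def penalized(y, mu):
    """y-part of problem (22): -y + 2*mu*(y^2 + |y|); the x-part x^2 vanishes at x = 0."""
    return -y + 2.0 * mu * (y * y + abs(y))


def argmin_on_grid(mu, n=2001):
    """Brute-force minimizer over a uniform grid on [-1, 1] that contains y = 0."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return min(grid, key=lambda y: penalized(y, mu))


for mu in (0.25, 0.6, 1.0):
    print(mu, argmin_on_grid(mu))
```

For weights above 1/2 the kink of |y| at the origin makes y = 0 the minimizer, whereas for a smaller weight such as 0.25 the minimizer stays away from the lower level solution. This nondifferentiability is what distinguishes the linear gap function from the quadratic one of Example 1.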
Throughout this section, it will be assumed that the set Y is the polyhedron
$Y = \{y \mid By \le b\}$ (where B is an $m \times n_2$ matrix and b a vector in $\mathbb{R}^m$), that the mapping
F is strongly monotone with respect to the variable y and that its Jacobian $\nabla_y F$ is
uniformly Lipschitz continuous with respect to the variable y, i.e., respectively,
$$\langle F(x, y_2) - F(x, y_1),\, y_2 - y_1 \rangle \ge \nu \|y_2 - y_1\|^2, \quad \forall x \in X,\; \forall y_1, y_2 \in Y, \tag{23}$$
and
$$\|\nabla_y F(x, y_2) - \nabla_y F(x, y_1)\| \le l\, \|y_2 - y_1\|, \quad \forall x \in X,\; \forall y_1, y_2 \in Y. \tag{24}$$
The main convergence result of this section will require preliminary results about the
local behavior of the lower level variational inequality.
Lemma 2. Let $x_0 \in X$ and assume that there exists a neighborhood of the point $x_0$ such
that VIP(F(x, ·), Y) is geometrically stable inside that neighborhood. Then there
exists some neighborhood $N_0$ of $x_0$ such that, for all x in $N_0$, the minimal faces containing
the solutions to VIP(F(x, ·), Y) and VIP(F($x_0$, ·), Y) are identical.
Proof. First note that our assumptions imply that the reaction y(x) to a given upper
level decision x defines a continuous function (see [21]). Let us denote by T(x) and
$T(x_0)$ the minimal faces containing the solutions to VIP(F(x, ·), Y) and VIP(F($x_0$, ·),
Y) respectively. We divide the proof into two parts.
(i) $T(x) \subset T(x_0)$ for all x in some neighborhood of $x_0$.
We argue by contradiction. Assume there exists a sequence $\{x_l\}$ converging to $x_0$ and
a sequence of extreme points $u_l \in Y$ such that $u_l \in T(x_l)$ but $u_l \notin T(x_0)$. It follows
from the geometric regularity assumption (see also [7]) that
$$\langle F(x_l, y(x_l)),\, u_l - y(x_l) \rangle = 0. \tag{25}$$
Now we select a subsequence $\{u_l\}_{l \in L}$ such that $\lim_{l \in L} u_l = u_\infty$, with $u_\infty \in Y \setminus T(x_0)$.
(Actually $\{u_l\}$ can be taken as the constant sequence $\{u_\infty\}$.)
Taking the limit as $l \to \infty$ ($l \in L$) in (25), we obtain
$$\langle F(x_0, y(x_0)),\, u_\infty - y(x_0) \rangle = 0, \tag{26}$$
and this in turn implies that $u_\infty \in T(x_0)$, a contradiction.
(ii) $T(x_0) \subset T(x)$ for all x in some neighborhood of $x_0$.
We proceed by contradiction again. Assume the result does not hold. Then there must
exist a sequence $\{x_l\}_{l \in L}$ such that $T(x_l)$ is a strict subface of $T(x_0)$. Without loss of
generality, we assume that the faces $T(x_l)$ are identical for all l in L.
Consider the sequence $\{y(x_l)\}_{l \in L}$ of solutions to the lower level variational inequality.
Then the minimal face containing its limit point $y(x_0)$ is a strict subface of $T(x_0)$, a
contradiction. []
Lemma 3. Under the assumption of Lemma 2, there exists a neighborhood of $x_0$ and a
positive number $\eta$ such that
$$G_0(x, y) \ge \eta\, \|y(x) - y\|, \tag{27}$$
for all x in that neighborhood and all y in Y.
Proof. Following a result of Marcotte and Dussault [19], for each x in $N_0$, there exists a
positive number $\eta(x)$ such that the bound
$$G_0(x, y) \ge \eta(x)\, \|y(x) - y\| \tag{28}$$
holds, for all y in Y. Furthermore it is a direct consequence of the analysis leading to
this result that $\eta(x)$ is a continuous function of the variable x, whenever the optimal
face of the lower level variational inequality remains unchanged. Since this condition is
satisfied in some neighborhood of $x_0$ (see Lemma 2), and $\eta(x_0)$ is positive, the set
$\{\eta(x)\}$ is bounded from below by some positive number $\eta$, for x in the required neighborhood,
and the result is proved. []
Lemma 4. Let $\bar{y}(x, y)$ be the solution of the linearized variational inequality, i.e.,
$$\langle F(x, y) + \nabla_y F(x, y)(\bar{y}(x, y) - y),\, \bar{y}(x, y) - z \rangle \le 0, \quad \forall z \in Y. \tag{29}$$
Then we have
$$G_0'((x, y);\, (0, \bar{y}(x, y) - y)) \le -G_0(x, y). \tag{30}$$
Proof. This lemma is simply a restatement in our context of a result of Marcotte and
Dussault [20] about the descent property of Newton's direction. []
Lemma 5. Under the assumption of Lemma 2, there exists a neighborhood of $(x_0, y(x_0))$
and a positive constant $\gamma$ such that
$$G_0'((x, y);\, (0, y(x) - y)) \le -\gamma\, \|y(x) - y\|, \tag{31}$$
for all x, y in that neighborhood.
Proof. We have that $\|\bar{y}(x, y) - y(x)\| = O(\|y(x) - y\|^2)$ (see [17]). This allows us
to write
$$\begin{aligned}
G_0'((x, y);\, (0, y(x) - y))
&= G_0'((x, y);\, (0, \bar{y}(x, y) - y)) + G_0'((x, y);\, (0, y(x) - \bar{y}(x, y))) \\
&\le G_0'((x, y);\, (0, \bar{y}(x, y) - y)) + O(\|y(x) - y\|^2), \\
&\qquad \text{by the linearity of } G_0' \text{ with respect to its second argument and the continuity} \\
&\qquad \text{and compactness assumptions,} \\
&\le -G_0(x, y) + O(\|y(x) - y\|^2), \quad \text{by Lemma 4,} \\
&\le -\eta\, \|y(x) - y\| + O(\|y(x) - y\|^2), \quad \text{by Lemma 3.}
\end{aligned}$$
Since y(x) is a continuous function of x, there must exist a neighborhood of
$(x_0, y(x_0))$ such that
$$G_0'((x, y);\, (0, y(x) - y)) \le -\gamma\, \|y(x) - y\|,$$
for all (x, y) in that neighborhood, for any positive number $\gamma$ less than $\eta$. []
We are now in position to state and prove the main result of this section.
Theorem 1. Let $(x^*, y^*) \in S$. Assume that there exists a neighborhood of the point $x^*$
such that VIP(F(x, ·), Y) is geometrically stable inside that neighborhood. Then there
exists a finite value of the penalty parameter $\mu$, say $\mu^*$, such that for all $\mu$ larger than
$\mu^*$, $(x^*, y^*)$ is a global solution of P5 if and only if it is also a global solution of P3.
Proof. From Proposition 1, there exists a subsequence $\{\mu_k\}_{k \in K}$ such that the solution
$(x(\mu_k), y(\mu_k))$ of P5 converges to a solution $(x^*, y^*) \in S$ of GBLP. Lemma 5
ensures the existence of a neighborhood $N^*$ of $(x^*, y^*)$ and some positive number $\gamma$
such that
$$G_0'((x, y);\, (0, y(x) - y)) \le -\gamma\, \|y(x) - y\|, \quad \forall (x, y) \in N^*.$$
Therefore there must exist a finite value $\bar{\mu}$ such that $(x(\mu_k), y(\mu_k))$ lies in $N^*$ for all
values of $\mu_k$ larger than $\bar{\mu}$ and all indices $k \in K$.
On the other hand, since f is continuously differentiable on the compact set
$S = X \times Y$, we have
$$\|\nabla_y f(x, y)\| \le \beta, \quad \forall x \in X,\; y \in Y, \tag{32}$$
for some finite number $\beta$. Let $\mu^* = \min\{\mu_k,\, k \in K \mid \mu_k \ge \bar{\mu},\; \mu_k > \beta/\gamma\}$. Assume that
$G_0(x(\mu^*), y(\mu^*))$ is positive, i.e., that $y(x(\mu^*)) \ne y(\mu^*)$. (Otherwise the solution to
the penalized problem is feasible for P3, and consequently optimal for the GBLP.) We
have
$$\begin{aligned}
& f'[(x(\mu^*), y(\mu^*));\, (0, y(x(\mu^*)) - y(\mu^*))] + \mu^* G_0'[(x(\mu^*), y(\mu^*));\, (0, y(x(\mu^*)) - y(\mu^*))] \qquad (33) \\
&\quad \le (\beta - \mu^* \gamma)\, \|y(x(\mu^*)) - y(\mu^*)\| \quad \text{by (32) and Lemma 5} \\
&\quad < 0, \qquad (34)
\end{aligned}$$
and $(0, y(x(\mu^*)) - y(\mu^*))$ is a feasible descent direction for P5, in contradiction with
the (global) optimality of $(x(\mu^*), y(\mu^*))$. We conclude that $G_0(x(\mu^*), y(\mu^*)) = 0$.
From Lemma 1, we have $G_0(x(\mu), y(\mu)) = 0$ for $\mu \ge \mu^*$. Thus $(x(\mu), y(\mu))$ is an
optimal solution to the GBLP.
Conversely, let $(x^*, y^*)$ be an optimal solution of the GBLP and consider
$(x(\mu), y(\mu))$ for $\mu$ larger than $\mu^*$. By the above analysis we have that
$G_0(x(\mu), y(\mu)) = 0$. Therefore,
$$f(x^*, y^*) = f(x^*, y^*) + \mu G_0(x^*, y^*) \le f(x(\mu), y(\mu)) + \mu G_0(x(\mu), y(\mu))$$
and this shows that $(x^*, y^*)$ is optimal for P5, as claimed.
[]
Corollary 1. Let $\mu \ge \mu^*$. Then $(\bar{x}, \bar{y})$ is a local minimum of P5 if and only if it is a
local minimum of P3.
Proof. In the proof of Theorem 1, the inequality (34) holds for local as well as for
global minima and it follows that any local minimum of P5 satisfies $G_0(\bar{x}, \bar{y}) = 0$ and is
feasible for P3, whenever $\mu \ge \mu^*$, i.e., that P5 is entirely equivalent to P3.
Consequently, the set of local minima of P5 coincides with that of P3. []
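The formulation P5 involves a nondifferentiable term. However it is possible to
rewrite $G_0$ as
$$\begin{aligned}
G_0(x, y) &= \max_{z \in Y} \langle F(x, y),\, y - z \rangle \\
&= \max_{e \in E} \langle F(x, y),\, y - u_e \rangle \\
&= \min\; \theta \quad \text{subject to} \quad \theta \ge \langle F(x, y),\, y - u_e \rangle, \quad \forall e \in E,
\end{aligned} \tag{35}$$
where $\{u_e\}_{e \in E}$ denotes the set of extreme points of the polyhedron $Y = \{y \mid By \le b\}$.
This results in the formulation
$$\min_{x \in X,\; y \in Y,\; \theta} f(x, y) + \mu \theta \quad \text{subject to} \quad \theta \ge \langle F(x, y),\, y - u_e \rangle, \quad \forall e \in E, \tag{36}$$
which yields a differentiable formulation.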
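Alternatively, the gap function can be expressed in dual form (see [19]) as
$$G_0(x, y) = \min_{\lambda \ge 0} \langle F(x, y),\, y \rangle + b^t \lambda \quad \text{subject to} \quad F(x, y) + B^t \lambda = 0. \tag{37}$$
Replacing $G_0(x, y)$ by its dual formulation yields
$$\text{P6:}\quad \min_{x \in X,\; y \in Y,\; \lambda \ge 0} f(x, y) + \mu \left( \langle F(x, y),\, y \rangle + b^t \lambda \right) \quad \text{subject to} \quad F(x, y) + B^t \lambda = 0. \tag{38}$$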
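The agreement between the extreme-point form (35) and the dual form (37) of the gap function can be verified on a tiny instance. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the polyhedron is the interval Y = [-1, 1] written as By ≤ b with B = (1, -1)^t and b = (1, 1)^t, and the dual is solved by inspection rather than by a linear programming code.

```python
def gap_primal(F_val, y, extreme_points):
    """G_0 via (35): maximum of <F, y - u_e> over the extreme points u_e of Y."""
    return max(F_val * (y - u) for u in extreme_points)


def gap_dual_box(F_val, y):
    """G_0 via (37) for Y = {y | y <= 1, -y <= 1}, i.e. B = (1, -1)^t, b = (1, 1)^t.

    Dual feasibility reads F + lam1 - lam2 = 0 with lam1, lam2 >= 0, and
    b^t lam = lam1 + lam2 is smallest when one of the multipliers vanishes.
    """
    lam1 = max(-F_val, 0.0)
    lam2 = max(F_val, 0.0)
    assert abs(F_val + lam1 - lam2) < 1e-12  # dual feasibility check
    return F_val * y + lam1 + lam2


for y in (-1.0, -0.5, 0.0, 0.3, 1.0):
    F_val = 2.0 * y  # lower level mapping of Examples 1 and 2
    print(y, gap_primal(F_val, y, (-1.0, 1.0)), gap_dual_box(F_val, y))
```

Both evaluations return 2y² + 2|y| on this instance, the expression given in (21).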
Corollary 2. Under the assumption of Theorem 1, there exists a finite value of the
penalty parameter $\mu$, say $\mu^*$, such that for all $\mu$ larger than $\mu^*$, $(x^*, y^*)$ is a global
solution of P6 if and only if it is also a global solution of P3.
Proof. The result is a direct consequence of the equivalence between P5 and P6.
[]
We close this section with an equivalent of Corollary 1.
Corollary 3. Let $\mu \ge \mu^*$. Then $(\bar{x}, \bar{y})$ is a local minimum of P6 if and only if it is a
local minimum of P5, if and only if it is a local minimum of P3.
5. An exact penalty function for the GBLP: the non-separable case
In this section we extend the analysis to the case where the set S is not separable
with respect to the vectors of variables x and y. More specifically, we set
$$S = \{(x, y) \mid C(x, y) = Ax + By - d \le 0,\; x \in X'\},$$
where A and B are $p \times n_1$ and $p \times n_2$ matrices respectively, and X' is a compact,
convex subset of $\mathbb{R}^{n_1}$. We will still assume that the mapping F is strongly monotone
with respect to the variable y and that its Jacobian with respect to y, $\nabla_y F$, is uniformly
Lipschitz continuous on the set S.
At a point $x_0$ of X, a solution $y_0$ to the lower level variational inequality is
characterized by the complementarity system
$$\begin{aligned}
F(x_0, y_0) + B^t \lambda_0 &= 0, \\
(Ax_0 + By_0 - d)^t \lambda_0 &= 0, \\
\lambda_0 &\ge 0.
\end{aligned} \tag{39}$$
The strong monotonicity of F implies that the lower level solution y(x) is unique, for
all x. Furthermore, if the gradients of the binding constraints are linearly independent
and the strict complementarity condition
$$\lambda_{0i} > 0 \iff C_i(x_0, y_0) = 0, \quad \forall i, \tag{40}$$
is satisfied then, in a neighborhood of $x_0$, the functions y(x) and $\lambda(x)$ are unique and
Lipschitz continuous. (See [9] for example.)
For given x, denote by T(x) the minimal face of Y(x) containing the solution y(x)
to the lower level variational inequality. We have the following results.
Proposition 2. Let $x_0 \in X$. If the linear independence and strict complementarity
conditions hold at $y_0 = y(x_0)$, there exists a neighborhood $N_0$ of $x_0$ such that T(x) and
$T(x_0)$ have the same topological structure for all $x \in N_0$, i.e.,
$$C_i(x_0, y_0) = 0 \iff C_i(x, y(x)) = 0.$$
Proof. Let J denote the index set of binding constraints at $(x_0, y_0)$ and
$$T(x_0) = \{y \mid C_i(x_0, y) = 0,\; i \in J,\; C_i(x_0, y) \le 0,\; i \notin J\}.$$
The strict complementarity condition yields
$$\lambda_i(x_0) > 0 \iff C_i(x_0, y_0) = 0.$$
Since the mapping $\lambda$ is continuous in the variable x, there must exist a neighborhood
$N_1$ of $x_0$ such that $\lambda_i(x) > 0$ whenever $C_i(x, y(x)) = 0$, for all $i \in J$ and $x \in N_1$.
On the other hand, the continuity of the (affine) functions $C_i$ implies the existence of
a neighborhood $N_2$ such that $C_i(x, y(x)) < 0$ for all $i \notin J$ and $x \in N_2$. Taking
$N_0 = N_1 \cap N_2$ brings the conclusion. []
Proposition 3. Under the assumptions of Proposition 2, there exists a neighborhood $N_0$
of $x_0$ such that T is a continuous point-to-set mapping on $N_0$.
Proof. Let $N_0$ be the neighborhood of Proposition 2, J the index set of binding
constraints at $x_0$, and $\bar{x} \in N_0$.
(i) T is closed at $\bar{x}$. Let
$$\begin{aligned}
T(\bar{x}) &= \{y \mid C_i(\bar{x}, y) = 0,\; i \in J,\; C_i(\bar{x}, y) \le 0,\; i \notin J\} \\
&= \{y \mid \bar{A}\bar{x} + \bar{B}y - \bar{d} = 0,\; \hat{A}\bar{x} + \hat{B}y - \hat{d} \le 0\},
\end{aligned}$$
for suitable submatrices $\bar{A}$, $\hat{A}$ of A and $\bar{B}$, $\hat{B}$ of B. Let $x_k \to \bar{x}$, $y_k \to \bar{y}$, $x_k \in N_0$ and
$y_k \in T(x_k)$, for all k. We have
$$\lim_{k \to \infty} C_i(x_k, y_k) = C_i(\bar{x}, \bar{y}) \quad \begin{cases} = 0, & \forall i \in J, \\ \le 0, & \forall i \notin J, \end{cases}$$
which implies that $\bar{y} \in T(\bar{x})$, and T is closed at $\bar{x}$.
(ii) T is open at $\bar{x}$. Let $x_k$ denote a sequence converging to the point $\bar{x}$, and $\bar{y}$ be an
arbitrary point in the face $T(\bar{x})$. Let us introduce the face
$T(x_k) = \{y \mid \bar{A}x_k + \bar{B}y - \bar{d} = 0,\; \hat{A}x_k + \hat{B}y - \hat{d} \le 0\}$, and let
$y_k = \arg\min_{y \in T(x_k)} \|y - \bar{y}\|$, i.e., $y_k$ is the projection of
$\bar{y}$ onto $T(x_k)$ and the distance from $\bar{y}$ to $T(x_k)$ is given by $\|\bar{y} - y_k\|$.
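The application of Hoffman's Lemma [15] to the linear systems $\bar{A}x_k + \bar{B}y - \bar{d} = 0$
and $\hat{A}x_k + \hat{B}y - \hat{d} \le 0$ yields:
$$\|\bar{y} - y_k\| \le \tau \left( \|\bar{A}x_k + \bar{B}\bar{y} - \bar{d}\| + \|(\hat{A}x_k + \hat{B}\bar{y} - \hat{d})_+\| \right)$$
for some positive constant $\tau$, where $(x)_+$ denotes the projection of the vector x onto the
nonnegative orthant. We conclude that $\lim_{k \to \infty} \|\bar{y} - y_k\| = 0$ and therefore $y_k \to \bar{y}$, as
required. []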
Corollary 4. Consider the function $\delta$ defined as
$$\delta(x) = \max_{z \in T(x)} \|y(x) - z\|. \tag{41}$$
If the assumptions of Proposition 2 hold at $x_0$, then there exists a neighborhood of $x_0$
such that $\delta$ is continuous on that neighborhood.
Proof. The sets Y(x) and T(x) are uniformly compact and the point-to-set mapping T is
continuous in a neighborhood of $x_0$. Also the function y(x) is continuous on X. The
conclusion then follows from Theorem 7 in [16]. []
Now, as in the previous sections, we associate with the GBLP its reformulation
$$\text{P7:}\quad \min_{(x, y) \in S} f(x, y) \quad \text{subject to} \quad G_0(x, y) = 0 \tag{42}$$
and the corresponding penalty problem:
$$\text{P8:}\quad \min_{(x, y) \in S} q(x, y, \mu) = f(x, y) + \mu G_0(x, y). \tag{43}$$
Lemma 6. Under the assumptions of Proposition 2, there exist a positive constant $\eta$
and a neighborhood $N'(x_0, y_0)$ of $(x_0, y_0)$ such that
$$G_0(x, y) \ge \eta\, \|y(x) - y\|, \quad \forall (x, y) \in S \cap N'(x_0, y_0). \tag{44}$$
Proof. For fixed $x \in X$, one has (see Proposition 3 of [19])
$$G_0(x, y) \ge \eta(x)\, \|y(x) - y\|, \quad \forall y \in Y(x), \tag{45}$$
where $\eta(x)$ depends continuously on the distance $\delta(x)$ of y(x) to the boundary of
T(x). Now, Corollary 4 ensures that $\eta(x)$ is continuous in a neighborhood of $x_0$. Since
$\eta(x_0)$ is positive, the conclusion follows. []
The next lemma parallels Lemma 5 of the preceding section and its proof is similar.
Lemma 7. Under the assumptions of Proposition 2, there exists a neighborhood of
$(x_0, y_0)$ and a positive constant $\gamma$ such that
$$G_0'((x, y);\, (0, y(x) - y)) \le -\gamma\, \|y(x) - y\|, \tag{46}$$
for all x, y in that neighborhood.
Lemma 7 is the key result that is required to prove the exactness of $G_0$ as a penalty
function. The proofs are similar to the proofs of the corresponding results in Section 4,
using Lemma 6 in place of Lemma 3.
Theorem 2. Let $(x^*, y^*) \in S$ and let the assumptions of Proposition 2 hold at
$(x^*, y^*)$. Then there exists a finite value $\mu^*$ such that, for all $\mu \ge \mu^*$, $(x^*, y^*)$ is a
global optimum of P8 if and only if it is also a global optimum of P7.
Corollary 5. Let $\mu \ge \mu^*$. Then $(\bar{x}, \bar{y})$ is a local minimum of P8 if and only if it is a
local minimum of P7.
We note that similar results hold for the dual formulation
$$\text{P9:}\quad \min_{(x, y) \in S,\; \lambda \ge 0} f(x, y) + \mu \left[ \langle F(x, y),\, y \rangle + (d - Ax)^t \lambda \right] \quad \text{subject to} \quad F(x, y) + B^t \lambda = 0. \tag{47}$$
Example. We illustrate our main result using a two-dimensional example taken from the
paper by Gauvin and Savard [12], where the lower level problem corresponds to an
optimization problem. We have
$$f(x, y) = x^2 + (y - 10)^2,$$
$$S = \{(x, y) \mid x + y \le 20,\; x \le 15,\; x, y \ge 0\},$$
$$F(x, y) = 4(x + 2y - 30).$$
Its optimal value is 20 and is reached at the unique optimal solution $(x^*, y^*) = (2, 14)$.
The gap function of this problem takes the form
$$\begin{aligned}
G_0(x, y) &= \max \{\langle 4(x + 2y - 30),\, y - z \rangle \mid 0 \le z \le 20 - x\} \\
&= 4(x + 2y - 30)\, y - \min\{0,\; 4(x + 2y - 30)(20 - x)\} \\
&= \begin{cases}
4(x + 2y - 30)\, y, & \text{if } x + 2y - 30 \ge 0, \\
4(x + 2y - 30)(x + y - 20), & \text{if } x + 2y - 30 \le 0.
\end{cases}
\end{aligned} \tag{48}$$
Now consider the penalized problem
$$\min_{x, y}\; x^2 + (y - 10)^2 + \mu G_0(x, y) \quad \text{subject to} \quad x + y \le 20,\quad 0 \le x \le 15,\quad y \ge 0. \tag{49}$$
The above problem separates into two cases.
(i) First case: $x + 2y \ge 30$:
$$P(\mu) = \min_{x, y \ge 0} \{x^2 + (y - 10)^2 + 4\mu (x + 2y - 30)\, y \mid x + 2y \ge 30,\; x + y \le 20,\; x \le 15\}. \tag{50}$$
The optimum is reached at (x, y) = (2, 14) and its value is 20.
(ii) Second case: $x + 2y \le 30$:
$$P(\mu) = \min_{x, y \ge 0} \{x^2 + (y - 10)^2 + 4\mu (x + 2y - 30)(x + y - 20) \mid x + 2y \le 30,\; x + y \le 20,\; x \le 15\}. \tag{51}$$
Fig. 1. Convergence of the penalized problem: optimal value of the penalized function (vertical axis, 0 to 20) plotted against μ (horizontal axis, 0 to 0.35).
If $\mu > 1/4$ the solution is (2, 14) as well. If $\mu \le 1/4$,
the optimal solution is
$$x(\mu) = \frac{40(\mu - \mu^2)}{1 + 12\mu - 4\mu^2}, \qquad y(\mu) = \frac{10(1 + 18\mu - 4\mu^2)}{1 + 12\mu - 4\mu^2}.$$
The sequence of optimal values of the penalized function $P(\mu)$ is plotted in Fig. 1.
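The closed-form expressions for the second case can be cross-checked by solving the stationarity conditions of the quadratic in (51) directly. The Python sketch below is an illustrative check, not part of the original analysis: the function names are hypothetical, the 2 × 2 linear system is obtained by differentiating x² + (y - 10)² + 4μ(x + 2y - 30)(x + y - 20) with respect to x and y, and the gap function follows (48).

```python
def stationary_point(mu):
    """Solve grad = 0 for x^2 + (y-10)^2 + 4*mu*(x+2y-30)*(x+y-20) by Cramer's rule."""
    # d/dx: (2 + 8 mu) x + 12 mu y = 200 mu
    # d/dy: 12 mu x + (2 + 16 mu) y = 20 + 280 mu
    a11, a12 = 2.0 + 8.0 * mu, 12.0 * mu
    a21, a22 = 12.0 * mu, 2.0 + 16.0 * mu
    b1, b2 = 200.0 * mu, 20.0 + 280.0 * mu
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def closed_form(mu):
    """Expressions x(mu), y(mu) stated in the text for mu <= 1/4."""
    den = 1.0 + 12.0 * mu - 4.0 * mu * mu
    return 40.0 * (mu - mu * mu) / den, 10.0 * (1.0 + 18.0 * mu - 4.0 * mu * mu) / den


def G0(x, y):
    """Gap function (48) of the example."""
    s = 4.0 * (x + 2.0 * y - 30.0)
    return s * y if s >= 0.0 else s * (x + y - 20.0)


def penalized(x, y, mu):
    """Objective of the penalized problem (49)."""
    return x * x + (y - 10.0) ** 2 + mu * G0(x, y)


for mu in (0.05, 0.1, 0.2, 0.25):
    x, y = closed_form(mu)
    print(f"mu={mu:.2f}  x={x:.4f}  y={y:.4f}  value={penalized(x, y, mu):.4f}")
```

At μ = 1/4 the stationary point coincides with (2, 14), where the gap vanishes and the penalized value equals the optimal value 20 of the GBLP.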
6. Extensions and remarks
In this paper, we presented an exact penalty scheme for an optimization problem with
variational inequality constraints. Our analysis made use of the nondifferentiable gap
function of the lower level variational inequality as a penalty function. In this context,
our regularity assumptions could not be weakened while still retaining the key result
$$G_0'((x, y);\, (0, y(x) - y)) \le -\gamma\, \|y(x) - y\|. \tag{52}$$
In the case where F is constant and f is affine however, the linear independence and
strict complementarity conditions are not required any more: (52) is always valid for
linear programs. Indeed, a direct proof of exactness of the gap function, in the linear
case, can be found in [2]. This result can actually be extended to the situation where f is
nonlinear and F assumes the form F ( x , y) = H ( x ) + h.
It is also possible to extend the analysis to take into account additional upper level
constraints (x, y) ~ S', resulting in the more general formulation:
$$\min_{(x, y) \in S' \cap S} f(x, y) \quad \text{subject to} \quad \langle F(x, y),\, y - z \rangle \le 0, \quad \text{for all } z \in Y(x).$$
A byproduct of our exactness result is the derivation of necessary optimality
conditions for the GBLP. If we assume that the convex set X' can be represented as
$$X' = \{x \in \mathbb{R}^{n_1} \mid l_i(x) \le 0,\; i = 1, \ldots, s\},$$
where the functions $l_i$ are convex, and that its interior is nonempty (Slater's constraint
qualification), then $(x^*, y^*)$ is a (local) minimum of the GBLP only if there exist
vectors $v^*$ and $w^*$ such that
$$\begin{aligned}
& 0 \in \nabla_x f(x^*, y^*) + \mu\, \partial_x G_0(x^*, y^*) + \sum_{i=1}^{s} v_i^* \nabla l_i(x^*) + w^{*t} A, \\
& 0 \in \nabla_y f(x^*, y^*) + \mu\, \partial_y G_0(x^*, y^*) + w^{*t} B, \\
& v_i^*\, l_i(x^*) = 0, \quad i = 1, \ldots, s, \\
& w_j^*\, (A_j x^* + B_j y^* - d_j) = 0, \quad j = 1, \ldots, p, \\
& v_i^* \ge 0, \quad i = 1, \ldots, s, \\
& w_j^* \ge 0, \quad j = 1, \ldots, p,
\end{aligned}$$
provided that the penalty parameter tt is sufficiently large (/z > it* ). In the separable
case (A = 0), the expression of the Clarke generalized gradient of G o takes the simple
form
a, C 0 ( x , y ) = c o z ~ n , .
~.)(r,.F(,,', v ) ) ' ( y - ~ ) ,
0~.G0( .v, y) = c o z e v(.,., y,[ F( x, y) + (F', F ( x , y ) ) ' ( y - .:.)],
where P ( x , y ) = arg min:~ n.,l(F(x, y ) ) ' ( y - ,:). Tile derivation of K a r u s h - K u h n Tucker conditions for the GBLP can also be based on the dual formulation (47). One
must be careful however that the dual feasibility constraints
\[ F(x, y) + B'\lambda = 0 \qquad (53) \]
satisfy some regularity (constraint qualification) condition. This will be the case if F is
affine in (x, y) or under various separability conditions. As of now, we do not know
whether the regularity of the system (53) is satisfied for more general classes of
nonlinear functions F.
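
To make the separable-case gradient formulas for G_0 concrete, the following minimal sketch evaluates the gap function and its gradient on a toy instance. Everything about the instance is an assumption made for illustration and does not come from the paper: F is taken affine, F(x, y) = M y + N x + q, and Y is a box, so that a point attaining the maximum in G_0 is obtained componentwise. The formulas reduce to an ordinary gradient only where that point is unique, that is, where no component of F(x, y) vanishes.

```python
# Toy data, all assumed for illustration: F(x, y) = M y + N x + q, Y = [LO, HI] (a box).
M = [[2.0, 1.0], [0.0, 3.0]]   # Jacobian of F with respect to y
N = [[1.0], [-1.0]]            # Jacobian of F with respect to x
q = [-1.0, 0.5]
LO, HI = [0.0, 0.0], [1.0, 1.0]


def matvec(A, v):
    """Return A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def mat_t_vec(A, v):
    """Return A' v (transpose of A times v)."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]


def F(x, y):
    return [a + b + c for a, b, c in zip(matvec(M, y), matvec(N, x), q)]


def best_response(x, y):
    """A point z in Y maximizing F(x, y)'(y - z), i.e. minimizing F(x, y)'z over the box."""
    return [lo if fi > 0 else hi for fi, lo, hi in zip(F(x, y), LO, HI)]


def gap(x, y):
    """G_0(x, y) = max over z in Y of F(x, y)'(y - z)."""
    z = best_response(x, y)
    return sum(fi * (yi - zi) for fi, yi, zi in zip(F(x, y), y, z))


def gap_gradient(x, y):
    """Gradient of G_0 where the maximizer z is unique (no component of F is zero).

    x-part: (grad_x F)'(y - z);  y-part: F + (grad_y F)'(y - z).
    """
    z = best_response(x, y)
    d = [yi - zi for yi, zi in zip(y, z)]
    grad_x = mat_t_vec(N, d)
    grad_y = [fi + gi for fi, gi in zip(F(x, y), mat_t_vec(M, d))]
    return grad_x, grad_y


if __name__ == "__main__":
    x, y = [0.5], [0.3, 0.6]
    print(gap(x, y), gap_gradient(x, y))
```

At points where some component of F(x, y) is zero, several maximizers exist and the generalized gradient is the convex hull of the corresponding vectors; the sketch returns only one of them.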
Several authors have previously proposed optimality conditions based on the reformulation of a bilevel program as an optimization program constrained by the optimality
conditions, in one form or another, of the lower level program. (See [3,4].) The
drawback of these approaches is that, in general, no constraint qualification holds for the
KKT constraints, leading to optimality conditions that are very seldom realized in
practice. Let us mention that stronger optimality results, based on local sensitivity
analysis results, have been proposed by Gauvin and Savard [12].
From a practical point of view, the nonconvex and nondifferentiable problems P5 and
P8 are difficult to solve. A global optimum can only be achieved by resorting to global
optimization techniques such as implicit enumeration, cutting plane methods, etc.
Recently, in the context of linear bilevel programming, Gendreau et al. [11] proposed an
efficient heuristic for generating a high-quality initial solution, later to be used as a
starting point for local search methods. This procedure was based on a primal-dual,
exact penalty formulation of the linear bilevel program. Work is currently under way to
adapt this technique within a nonlinear framework.
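
As a toy illustration of why such penalized problems are nonconvex and of how a large enough penalty parameter recovers a bilevel solution, the sketch below minimizes f + μ G_0 by brute-force enumeration over a grid. The instance is entirely assumed and is not taken from the paper: scalar x in X = [0, 2], scalar y in Y = [0, 1], lower-level map F(x, y) = y - x (so that the follower's reaction is y(x) = min(x, 1)), and upper-level objective f(x, y) = (x - 1.5)^2 + (y - 0.25)^2. The grid search merely stands in for a global optimization method; it is not the algorithm of the authors.

```python
def F(x, y):
    # Assumed lower-level map; on Y = [0, 1] the VI solution is y(x) = min(max(x, 0), 1).
    return y - x


def gap(x, y):
    # G_0(x, y) = max over z in [0, 1] of F(x, y) * (y - z); linear in z, so check endpoints.
    return max(F(x, y) * (y - 0.0), F(x, y) * (y - 1.0))


def f(x, y):
    # Assumed upper-level objective.
    return (x - 1.5) ** 2 + (y - 0.25) ** 2


def penalized_grid_min(mu, n=20):
    """Minimize f + mu * G_0 over a uniform grid on X x Y = [0, 2] x [0, 1].

    Returns (value, x, y) for the best grid point.
    """
    best = None
    for i in range(2 * n + 1):      # x = i / n runs over [0, 2]
        for j in range(n + 1):      # y = j / n runs over [0, 1]
            x, y = i / n, j / n
            val = f(x, y) + mu * gap(x, y)
            if best is None or val < best[0]:
                best = (val, x, y)
    return best


if __name__ == "__main__":
    for mu in (0.1, 10.0):
        val, x, y = penalized_grid_min(mu)
        print(mu, val, x, y, gap(x, y))
```

On this instance the feasible set y = min(x, 1) yields two local minima of f (near x = 0.875 and at x = 1.5). With μ = 10 the grid minimizer is (x, y) = (1.5, 1.0) with zero gap, while with μ = 0.1 the minimizer has a strictly positive gap and is therefore not feasible for the lower level.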
References
[1] Hierarchical Optimization, Annals of Operations Research 34 (1992).
[2] G. Anandalingam and D.J. White, "A solution method for the linear static Stackelberg problem using
penalty functions," IEEE Transactions on Automatic Control AC-35 (1990) 1170-1173.
[3] J.F. Bard, "An efficient point algorithm for a linear two-stage optimization problem," Operations
Research 31 (1983) 670-684.
[4] Z. Bi, P. Calamai and A. Conn, "An exact penalty method approach for the nonlinear bilevel
programming problem," Report #180-0-170591, Department of Systems Design, University of Waterloo
(1991).
[5] Computers and Operations Research 9 (1982).
[6] J.M. Danskin, "The theory of max-min, with applications," SIAM Journal of Applied Mathematics 14
(1966) 641-664.
[7] J.-P. Dussault and P. Marcotte, "Conditions de régularité géométrique pour les inéquations
variationnelles," RAIRO Recherche Opérationnelle 23 (1988) 1-16.
[8] C.S. Fisk, "A conceptual framework for optimal transportation systems planning with integrated supply
and demand models," Transportation Science 20 (1986) 37-47.
[9] T.L. Friesz, R.T. Tobin, H-J Cho and N.J. Mehta, "Sensitivity analysis based heuristic algorithms for
mathematical programs with variational inequality constraints," Mathematical Programming 48 (1990)
265-284.
[10] M. Fukushima, "Equivalent differentiable optimization problems and descent methods for asymmetric
variational inequality problems," Mathematical Programming 53 (1992) 99-110.
[11] M. Gendreau, P. Marcotte and G. Savard, "A hybrid tabu-ascent algorithm for the linear bilevel
programming problem," forthcoming in Journal of Global Optimization.
[12] J. Gauvin and G. Savard, "The steepest descent method for the nonlinear bilevel programming
problem," Working paper G-9037, GERAD, École Polytechnique de Montréal (1992) (first version).
[13] J. Haslinger and P. Neittaanmäki, Finite Element Approximation for Optimal Shape Design, Theory and
Application (Wiley, New York, 1988).
[14] D.W. Hearn, "The gap function of a convex program," Operations Research Letters 1 (1981) 67-71.
[15] A.J. Hoffman, "On approximate solutions of systems of linear inequalities," Journal of Research of the
National Bureau of Standards 49 (1952) 263-265.
[16] W. Hogan, "Directional derivative for extremal value functions with applications to the completely
convex case," Operations Research 21 (1973) 188-209.
[17] N.H. Josephy, "Newton's method for generalized equations," Technical Report 1966, Mathematical
Research Center, University of Wisconsin, Madison, WI (1979).
[18] P. Marcotte, "A new algorithm for solving variational inequalities with application to the traffic
assignment problem," Mathematical Programming 33 (1985) 339-351.
[19] P. Marcotte and J.-P. Dussault, "A sequential linear programming algorithm for solving monotone
variational inequalities," SIAM Journal of Control and Optimization 27 (1989) 1260-1278.
[20] P. Marcotte and J.-P. Dussault, "A note on a globally convergent method for solving monotone
variational inequalities," Operations Research Letters 6 (1987) 35-42.
[21] Y. Qiu and T. Magnanti, "Sensitivity analysis for variational inequalities," Mathematics of Operations
Research 17 (1992) 61-76.
[22] Y. Ishizuka and E. Aiyoshi, "Double penalty method for bilevel optimization problems," Annals of
Operations Research 34 (1992) 73-88.
