Gould, F. J. and Howe, S. (1971), "A new result on interpreting Lagrange multipliers as dual variables."

A NEW RESULT ON INTERPRETING LAGRANGE MULTIPLIERS AS DUAL VARIABLES†

by

F. J. Gould* and Stephen Howe*

Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 738
January, 1971

†This research was partially supported by a grant from the Office of Naval Research, contract number N00014-67-A-0321-0003 (NR047-095).
*Curriculum in Operations Research and Systems Analysis, University of North Carolina at Chapel Hill, 27514.
A NEW RESULT ON INTERPRETING LAGRANGE MULTIPLIERS AS DUAL VARIABLES†

F. J. Gould* and Stephen Howe*

I. INTRODUCTION
It is well known that in linear programming the knowledge of optimal dual variables can be used to simplify the finding of an optimal solution to the primal problem. In this paper, the analogous notion is considered for nonlinear, and particularly nonconcave, programming problems. A new theoretic result on Lagrange multipliers is presented which enables one to make a statement for nonlinear problems similar to the above assertion for linear programs.
Consider the problem

(P):  maximize f(x), subject to g(x) ≤ 0,

where f: Rⁿ → R and g: Rⁿ → Rᵐ. Let S₀ = {x ∈ Rⁿ: g(x) ≤ 0} represent the constraint set (feasible points) for this problem.
Assume that all functions in (P) are twice continuously differentiable and that (P) has at least a local solution, say x*. That is, there exists a neighborhood of x*, say N(x*), such that f(x*) ≥ f(x) whenever x is in N(x*) ∩ S₀. Assume that a constraint qualification holds at x* [2]. Then the following Kuhn-Tucker optimality conditions are valid:
There exists a nonnegative vector λ* in Rᵐ such that (x*,λ*) satisfy

D_1:  ∇f(x*) = Σ_{i=1}^m λ*_i ∇g_i(x*);

D_2:  g(x*) ≤ 0  (feasibility);

D_3:  λ*_i g_i(x*) = 0,  i = 1,...,m  (complementary slackness).
In this context the following question may be posed. Suppose that the above vector λ* is known, but that x* is unknown. How might one then use the knowledge of λ* to determine either x* or possibly some other local solution to (P)? If (P) is a concave program (f(x) concave, g_j(x) convex, j = 1,...,m) the answers to this question are known.
In particular, in this case, the Kuhn-Tucker conditions are sufficient as well as necessary for optimality (i.e., if the pair (x*,λ*) satisfies D_1, D_2 and D_3, then x* is a solution to (P)). Also, any x* satisfying D_1 also (globally) maximizes the Lagrangian function

L(x,λ*) = f(x) − Σ_{i=1}^m λ*_i g_i(x),

since for concave programs the Lagrangian is a concave function. If the Lagrangian is strictly concave, then x* will be its unique global maximizer. If the Lagrangian is not strictly concave, then there may be a set of vectors, including x*, each member of which maximizes the Lagrangian (thus satisfying D_1). In this case, any feasible member of the set which also satisfies D_2 and D_3 is a solution to (P). Thus, whenever (P) is a concave program, a knowledge of λ* permits the statement of a conceptual procedure which is necessary and sufficient for determining solutions to (P), in the sense that it is guaranteed to be failsafe. This procedure is the following. Find among the set of Lagrangian maximizers one that also satisfies feasibility and complementary slackness. The procedure is failsafe in that x* is such a quantity.
If (P) is not concave, then there are at least two serious difficulties:

(1) There is no convenient way of determining the vectors x̂ which satisfy the system

∇f(x̂) − Σ_{i=1}^m λ*_i ∇g_i(x̂) = 0
g(x̂) ≤ 0
λ*_i g_i(x̂) = 0,  i = 1,...,m.

In particular, in the nonconcave case, the global (local) solutions to (P) will not generally be global (local) maximizers of f(x) − Σ_{i=1}^m λ*_i g_i(x).

(2) The Kuhn-Tucker conditions in the nonconcave case are not generally sufficient. Hence, even if all solutions to the above system can be found, there is not a known way of selecting the solutions to (P) from the set of vectors x̂ which satisfy D_1, D_2 and D_3. (Second order information could be used, but there are no known necessary and sufficient second order conditions. Hence, conceptually, second order information does not solve the problem.)
Thus, in the case of nonconcave programs (P), the usual Lagrangian function appears to be of limited use as a vehicle for answering the question: Can a knowledge of λ* be theoretically related to a procedure for solving (P)? In this paper, we consider the extended Lagrangian function [3]

(1.1)  P(x,k,β) = f(x) − Σ_{i=1}^m β_i[exp(k g_i(x)) − 1],

where x ∈ Rⁿ, k is a nonnegative scalar, and β ∈ Rᵐ. Recall that x* is some unknown local solution to (P), λ* is a known nonnegative dual variable, and the pair (x*,λ*) satisfies the Kuhn-Tucker conditions D_1, D_2 and D_3. We now wish to assume that

(i):  x* satisfies the second order sufficiency conditions [4];

(ii): λ*_i > 0 for every i such that g_i(x*) = 0.
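The definition (1.1) is straightforward to put into code. The following Python sketch evaluates P(x,k,β) for a problem given as (f, g) and checks, by a central finite difference (an implementation choice, not part of the paper), that the Kuhn-Tucker pair of the quadratic example treated later in Section V, x* = 2 with λ* = (4,0), is a stationary point of P(·,k*,λ*/k*):

```python
import math

def P(x, k, beta, f, g):
    """Extended Lagrangian (1.1): f(x) - sum_i beta_i * (exp(k*g_i(x)) - 1)."""
    return f(x) - sum(b * (math.exp(k * gi(x)) - 1.0) for b, gi in zip(beta, g))

# Test data: the quadratic example of Section V.
f = lambda x: x * x
g = [lambda x: x - 2.0, lambda x: -x - 1.0]   # constraints g_i(x) <= 0
x_star, lam_star = 2.0, [4.0, 0.0]            # Kuhn-Tucker pair

k_star = 1.0                                  # a choice of k* > 0
beta_star = [l / k_star for l in lam_star]    # beta*_i = lambda*_i / k*

# Central-difference approximation of dP/dx at x*.
h = 1e-6
dP = (P(x_star + h, k_star, beta_star, f, g)
      - P(x_star - h, k_star, beta_star, f, g)) / (2 * h)
print(abs(dP))   # ~0: x* is a stationary point of P(., k*, beta*)
```

Any k* > 0 yields stationarity here; the choice k* = 1 also exceeds the threshold K = 1/2 computed for this example in Section V.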
Under these conditions, it will be shown that there exists a positive scalar K such that if k* ≥ K, and if β*_i = λ*_i/k*, i = 1,...,m, then x* is a local maximizer of P(x,k*,β*), and any local maximizer of P(x,k*,β*) also satisfying D_2 and D_3 is a local solution to (P). Thus, from a knowledge of λ*, we are able in theory to determine a local solution to (P) by the following failsafe procedure: conduct a search among the local maximizers of P(x,k*,β*) for one which satisfies D_2 and D_3. The procedure is failsafe in that x* is such an element. Thus, in nonconcave programs, although the usual Lagrangian function has restricted theoretic application, it is seen that the Lagrange multipliers, λ*, continue to play a key theoretic role as dual variables. The use of an extended Lagrangian function allows one to extend the theory of concave programs in a parallel way.
In another, but related, context it will be shown that the function P(x,k,β) actually has a local saddle point at (x*, k*, β*), and consequently the results herein have an interpretation in the framework of duality notions for nonlinear programs.

The present work has been motivated by a desire to improve the computational theory for solving nonconcave problems. Although immediate computational implications of the above described results are not clear, it is hoped that this work will lead to a better understanding of nonconcave programs and thereby stimulate other efforts to obtain computational advances.

In Section 2, the main result is stated and proved. Following a discussion in Section 3, the duality interpretation is demonstrated and discussed in Section 4. An example and application are presented in Sections 5 and 6, respectively.
II. RESULTS

Let the problem (P) and x* be defined as above. Suppose f and g are twice continuously differentiable, and that the weak constraint qualification holds at x*. Let I denote the index set of active constraints. That is,

I = {i ∈ {1,2,...,m}: g_i(x*) = 0}.

Let λ* be a nonnegative vector such that (x*,λ*) satisfy D_1, D_2 and D_3 (i.e., λ* is an optimal dual variable, or Kuhn-Tucker multiplier). Now define the extended Lagrangian P(x,k*,β*), as in (1.1), by setting β*_i = λ*_i/k*, each i, where k* > 0.

LEMMA 1. (x*,λ*) satisfy D_1, D_2 and D_3 if and only if (x*,k*,β*) satisfy G_1, G_2 and G_3. That is, (x*,k*,β*) satisfy

G_1:  0 = ∇P(x*,k*,β*) = ∇f(x*) − Σ_{i=1}^m β*_i exp(k* g_i(x*)) k* ∇g_i(x*);

G_2:  g(x*) ≤ 0;

G_3:  β*_i exp(k* g_i(x*)) = β*_i,  i = 1,2,...,m.
PROOF: For i ∉ I, G_3 holds iff β*_i = 0, iff λ*_i = 0, iff D_3 holds. For i ∈ I, g_i(x*) = 0, so that both G_3 and D_3 are trivially satisfied. Also, x* is assumed to be a local solution to (P), and hence G_2 and D_2 (feasibility) are trivially satisfied. Finally, by complementary slackness the system G_1 reduces to

(I)  0 = ∇f(x*) − Σ_{i∈I} k* β*_i ∇g_i(x*),

and D_1 reduces to

(II)  0 = ∇f(x*) − Σ_{i∈I} λ*_i ∇g_i(x*).

It follows from the assumption β*_i = λ*_i/k*, each i, that (I) and (II) are the same. Thus the system G_1, G_2, G_3 is equivalent to the system D_1, D_2, D_3. □

LEMMA 2. If a constraint qualification holds at x*, then for any positive k* there exists a nonnegative vector β* such that (x*,k*,β*) satisfy G_1, G_2 and G_3.

PROOF: By the constraint qualification, D_1, D_2 and D_3 hold. Take β* = λ*/k*. □
Now consider the Hessian of the extended Lagrangian at x*:

(2.1)  ∇²P(x*,k*,β*) = ∇²f(x*) − Σ_{i∈I} k* β*_i ∇²g_i(x*) − Σ_{i∈I} (k*)² β*_i ∇g_i(x*)∇g_i(x*)ᵀ
                     = ∇²L(x*,λ*) − k* Σ_{i∈I} λ*_i ∇g_i(x*)∇g_i(x*)ᵀ,

where L(x,λ*) is the usual Lagrangian function discussed in Section 1. Note that if ∇²L(x*,λ*) is negative definite, then ∇²P(x*,k*,β*) is also, because in (2.1) each term of the sum is a dyadic matrix. Even when ∇²L(x*,λ*) is not negative definite, however, the dyadic terms will under certain conditions cause ∇²P(x*,k*,β*) to be negative definite. Let B = {i∈I: λ*_i > 0}, let S be the subspace spanned by {∇g_i(x*): i∈B}, and denote the orthogonal complement of S by S⊥.
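Identity (2.1) can be spot-checked numerically. In the one-dimensional example of Section V only the first constraint is active, with ∇g₁ = 1 and ∇²g_i = 0, so (2.1) reduces to ∇²P(x*,k*,β*) = 2 − 4k*. The Python sketch below compares this closed form with a central-difference second derivative (the step size is an arbitrary implementation choice):

```python
import math

# Quadratic example of Section V: max x^2, x - 2 <= 0, -x - 1 <= 0.
f = lambda x: x * x
g = [lambda x: x - 2.0, lambda x: -x - 1.0]
x_star, lam_star, k_star = 2.0, [4.0, 0.0], 1.0
beta_star = [l / k_star for l in lam_star]

def P(x):
    return f(x) - sum(b * (math.exp(k_star * gi(x)) - 1.0)
                      for b, gi in zip(beta_star, g))

# Finite-difference second derivative of P at x*.
h = 1e-4
d2P = (P(x_star + h) - 2.0 * P(x_star) + P(x_star - h)) / (h * h)

# Closed form from (2.1): Hess L = 2, only g_1 active, grad g_1 = 1,
# so Hess P = 2 - k* * lam*_1 = 2 - 4 k*.
d2P_closed = 2.0 - 4.0 * k_star
print(d2P, d2P_closed)   # both near -2 for k* = 1
```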
THEOREM 1. Suppose that, in addition to D_1, D_2 and D_3, (x*,λ*) also satisfy

D_4:  for each nonzero z ∈ S⊥,  zᵀ∇²L(x*,λ*)z < 0.

Then there is a positive scalar K such that for any k* > K and β*_i = λ*_i/k*, i = 1,2,...,m, the Hessian ∇²P(x*,k*,β*) is negative definite. Consequently, x* is an isolated local maximizer of P(x,k*,β*).

REMARK: If the nondegeneracy assumption λ*_i > 0, i∈I, is made, then assuming that (x*,λ*) satisfy D_1, D_2, D_3 and D_4 is equivalent to assuming that (x*,λ*) satisfy the second order sufficiency conditions discussed by Fiacco and McCormick in [4].
PROOF OF THEOREM 1. (i) If S = {0}, then S⊥ = Rⁿ, in which case D_4 implies that ∇²L(x*,λ*), and hence ∇²P(x*,k*,β*) for any positive k*, is negative definite. In this case, the result holds for any K > 0.

If S ≠ {0}, consider the continuous function h: S → R defined by

h(y) = yᵀ[Σ_{i∈B} λ*_i ∇g_i(x*)∇g_i(x*)ᵀ]y.

For any y ∈ S such that y ≠ 0, h(y) > 0; hence h takes a minimum m₁ > 0 on the compact set U = {y∈S: ‖y‖ = 1}. Now denote M = ‖∇²L(x*,λ*)‖, the sup norm of the matrix ∇²L(x*,λ*).

(ii) If S = Rⁿ, then for every nonzero v ∈ S,

vᵀ∇²P(x*,k*,β*)v = vᵀ∇²L(x*,λ*)v − k*h(v) ≤ M‖v‖² − k*m₁‖v‖².

In this case, the result holds for K = M/m₁.

(iii) Finally, suppose S ≠ {0}, S ≠ Rⁿ. Then S⊥ ≠ {0}. For any z ∈ S⊥, z ≠ 0, zᵀ∇²L(x*,λ*)z < 0; hence zᵀ∇²L(x*,λ*)z has a maximum −m₂ < 0 on the compact set V = {z∈S⊥: ‖z‖ = 1}. For any v ∈ Rⁿ, v ≠ 0, ∃ y∈S, z∈S⊥ ∋ v = y + z. Then

vᵀ∇²P(x*,k*,β*)v = (y+z)ᵀ∇²L(x*,λ*)(y+z) − k*(y+z)ᵀ[Σ_{i∈B} λ*_i ∇g_i(x*)∇g_i(x*)ᵀ](y+z)
  = zᵀ∇²L(x*,λ*)z + 2yᵀ∇²L(x*,λ*)z + yᵀ∇²L(x*,λ*)y − k*h(y)
  ≤ −m₂‖z‖² + 2M‖y‖‖z‖ + M‖y‖² − k*m₁‖y‖²
  = −k*m₁‖y‖² − m₂(‖z‖ − (M/m₂)‖y‖)² + (M²/m₂)‖y‖² + M‖y‖²
  = ‖y‖²(−k*m₁ + M²/m₂ + M) − m₂(‖z‖ − (M/m₂)‖y‖)².

Hence, if k* > K = (M/m₁)(1 + M/m₂), then vᵀ∇²P(x*,k*,β*)v < 0 for any nonzero v. □
III. DISCUSSION

For the program (P) as defined, with local maximum at x* and nonnegative vector λ* such that (x*,λ*) satisfy the Kuhn-Tucker conditions, suppose λ* is known. In the concave case, with no further assumptions, this information can be used to find a global solution to (P). With the above results, we can now make a similar, but somewhat weaker, statement in the nonconcave case.
Under the additional assumption that D_4 holds, there is a value K̄ > 0 such that if we set k* > K̄ and β*_i = λ*_i/k* for i = 1,...,m, then (x*,k*,β*) will satisfy G_1, G_2 and G_3, and the Hessian of the extended Lagrangian P(x,k*,β*) will be negative definite at x*. This means that P(x,k*,β*) will have a local unconstrained maximum at x*. Furthermore, we can show that any local (global) maximizer of P(x,k*,β*) which satisfies feasibility and complementary slackness will be a local (global) solution to (P). For example, suppose x̂ ∈ S₀ is a global maximizer of P(x,k*,β*) and suppose (x̂,λ*) satisfy the complementary slackness relations D_3. Then

f(x̂) − Σ_{i=1}^m β*_i[exp(k* g_i(x̂)) − 1] ≥ f(x) − Σ_{i=1}^m β*_i[exp(k* g_i(x)) − 1]  ∀x ∈ Rⁿ,

which implies (since, by D_3, each term β*_i[exp(k* g_i(x̂)) − 1] vanishes)

f(x̂) ≥ f(x) + Σ_{i=1}^m β*_i[1 − exp(k* g_i(x))]  ∀x ∈ Rⁿ,

which implies (since the sum is nonnegative whenever g(x) ≤ 0)

f(x̂) ≥ f(x)  ∀x ∈ S₀.

Hence x̂ is a global solution to (P). The argument for local solutions is analogous.
Thus, given λ* under the stated conditions, we formulate the extended Lagrangian in the form (1.1) and then search among its local maximizers for points x̂ which also satisfy feasibility and complementary slackness. There will be at least one such point x̂, namely x*, and each such x̂ will be a local solution to (P). Note that this procedure will not directly yield a global solution to (P), unlike the concave case. If the points x̂ include a global solution to (P) (for example, if x* is a global solution), it will be recognizable only to the extent that if x̂ is a global solution to (P) and if x⁰ is any other local maximizer of P(x,k*,β*) satisfying feasibility and complementary slackness, then P(x̂,k*,β*) ≥ P(x⁰,k*,β*).
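The search just described can be sketched as a small routine: locate local maximizers of P(·,k*,β*), then keep those that are feasible and complementarily slack. The Python version below is one-dimensional and uses the Section V example as stand-in data; the scan interval, grid size, and tolerances are arbitrary implementation choices, and a grid scan is only a crude surrogate for a proper local-maximization method:

```python
import math

# Section V example: max x^2 s.t. x - 2 <= 0, -x - 1 <= 0; lambda* = (4, 0).
f = lambda x: x * x
g = [lambda x: x - 2.0, lambda x: -x - 1.0]
lam_star, k_star = [4.0, 0.0], 1.0
beta_star = [l / k_star for l in lam_star]

def P(x):
    return f(x) - sum(b * (math.exp(k_star * gi(x)) - 1.0)
                      for b, gi in zip(beta_star, g))

def local_maxima(lo, hi, n=3000):
    """Crude grid scan for interior local maximizers of P on [lo, hi]."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return [xs[i] for i in range(1, n)
            if P(xs[i]) > P(xs[i - 1]) and P(xs[i]) > P(xs[i + 1])]

def feasible(x, tol=1e-2):
    return all(gi(x) <= tol for gi in g)

def comp_slack(x, tol=5e-2):
    return all(l * abs(gi(x)) <= tol for l, gi in zip(lam_star, g))

# Failsafe filter: every surviving candidate is a local solution of (P),
# and x* = 2 is guaranteed to be among them.
candidates = [x for x in local_maxima(-1.5, 3.0)
              if feasible(x) and comp_slack(x)]
print(candidates)
```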
IV. DUALITY

Under the assumptions in Section 2, we have

P(x,k*,β*) ≤ P(x*,k*,β*)  ∀x ∈ N(x*),

some neighborhood of x*, since x* is a local unconstrained maximizer of P(x,k*,β*). Also, because (x*,λ*) satisfies complementary slackness, x* is feasible, and β ≥ 0,

P(x*,k*,β*) = f(x*) − Σ_{i=1}^m β*_i[exp(k* g_i(x*)) − 1] = f(x*) ≤ P(x*,k,β)  ∀k, β ≥ 0.

Thus,

P(x,k*,β*) ≤ P(x*,k*,β*) ≤ P(x*,k,β)  ∀x ∈ N(x*), ∀k, β ≥ 0.

This shows that for k* sufficiently large, the function P(x,k,β) has a local saddle point at (x*,k*,β*). Fix k* sufficiently large and define a dual function, φ(β), as

φ(β) = max_{x∈N(x*)} P(x,k*,β).

Then we have

f(x*) = min_{β≥0} φ(β) = φ(β*).
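The saddle-point relation can be illustrated numerically on the Section V example. Taking N(x*) = [1, 3] (an arbitrary choice of neighborhood) and k* = 1, the Python sketch below samples the dual function φ(β) over a grid of β₁ ≥ 0 (the second component is held at β₂ = 0, since λ*₂ = 0) and finds the minimum at β*₁ = λ*₁/k* = 4 with value f(x*) = 4:

```python
import math

# Section V example; lambda*_1 = 4, k* = 1, so beta*_1 = 4.
f = lambda x: x * x
g1 = lambda x: x - 2.0
k_star = 1.0

def P(x, b1):
    return f(x) - b1 * (math.exp(k_star * g1(x)) - 1.0)

def phi(b1, lo=1.0, hi=3.0, n=4000):
    """Dual function: max of P(., k*, beta) over the neighborhood [lo, hi]."""
    return max(P(lo + (hi - lo) * i / n, b1) for i in range(n + 1))

betas = [0.5 * j for j in range(17)]       # beta_1 in {0, 0.5, ..., 8}
values = [phi(b) for b in betas]
b_min = betas[values.index(min(values))]
print(b_min, min(values))   # minimum at beta*_1 = 4, with phi = f(x*) = 4
```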
We note here that the dual problem is precisely the problem of determining an exact penalty function for the primal. In other words, to solve the dual, it is necessary to find a β* such that x* is a local solution to max P(x,k*,β*), provided k* is sufficiently large. Fletcher [5] has obtained at least a theoretic and possibly a computational advance in penalty function techniques by presenting exact penalty functions for equality-constrained problems. However, there has to date appeared no computationally useful type of differentiable exact penalty function for nonconcave inequality-constrained problems. The present function P(x,k*,β) is no exception, since there are no acceptable known procedures for determining β* in nonconcave cases. Most known applicable procedures would involve general cutting plane techniques in the β space, such as the Dantzig-Wolfe decomposition [6] or the procedures of Nemhauser and Widhelm [7]. These procedures, at the present state of theory, could firstly be employed only to find a β* corresponding to a global solution to (P), and secondly they would require an infinite sequence of global optimizations of nonconcave functions, which would appear to be unacceptable.
V. EXAMPLE

One point in the development thus far should perhaps be further emphasized. We have shown that if x* is a local solution to (P), and if (x*,λ*) satisfy the conditions D_1 thru D_4, then x* is a local unconstrained maximizer of P(x,k*,λ*/k*) for all k* sufficiently large. However, the theory admits the possibility of nonuniqueness. That is, if x̂ is any other local unconstrained maximizer of P(x,k*,λ*/k*) which satisfies g(x̂) ≤ 0 and λ*_i g_i(x̂) = 0, i = 1,...,m, then x̂ is a local solution to (P). However, note that if such an x̂ exists, then λ*_i g_i(x̂) = 0 implies g_i(x̂) = 0, i ∈ B; it follows that the optimal dual variables λ* corresponding to x* are also optimal for x̂. For nonlinear problems, though this is clearly possible, it intuitively seems "unlikely". We have not been able to impose weak conditions which rule out this possibility.
As an illustration of Theorem 1, consider the following simple quadratic example:

max x²,  s.t.  x − 2 ≤ 0,  −x − 1 ≤ 0.

In this case, there are two local solutions, namely −1 and +2. The global solution is x* = 2, with corresponding optimal dual variables λ*₁ = 4, λ*₂ = 0. That is, (x*,λ*) satisfy the Kuhn-Tucker conditions. The function P(x,k*,λ*/k*) is given by

P(x,k*,λ*/k*) = x² − (4/k*)[exp(k*(x−2)) − 1].

Note that here S = R, S⊥ = {0}, D_4 is trivially satisfied, and Theorem 1 holds for k* > K = 1/2. It is not difficult to verify that P(x,k*,λ*/k*) has the shape illustrated by Figure 1.

[Figure 1: the graph of P(x,k*,λ*/k*) over roughly −2 ≤ x ≤ 3, with its single local unconstrained maximum at x = 2.]

Although the function P is not concave in x (for any k), it is to be noted that in this case there is only one local unconstrained maximum and this occurs at x* = +2. Note also that the function P has no global maximum.
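These claims are easy to verify in code: P'(2) = 4 − 4exp(0) = 0 for every k*, while P''(2) = 2 − 4k* is negative exactly when k* > 1/2. A minimal Python check:

```python
import math

def P(x, k):
    """P(x, k, lambda*/k) for the example: x^2 - (4/k)[exp(k(x-2)) - 1]."""
    return x * x - (4.0 / k) * (math.exp(k * (x - 2.0)) - 1.0)

def dP(x, k):   # P' = 2x - 4 exp(k(x-2))
    return 2.0 * x - 4.0 * math.exp(k * (x - 2.0))

def d2P(x, k):  # P'' = 2 - 4k exp(k(x-2))
    return 2.0 - 4.0 * k * math.exp(k * (x - 2.0))

for k in (0.25, 0.5, 1.0, 2.0):
    print(k, dP(2.0, k), d2P(2.0, k))
# dP(2, k) = 0 for every k; d2P(2, k) = 2 - 4k < 0 exactly when k > 1/2
```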
VI. APPLICATION

It is generally thought to be the case that if the optimal response function is differentiable, then its partial derivatives are given by the optimal dual variables. However, for nonconcave programs, a proof of this result, under reasonable assumptions, is not known. In this section, we offer such a proof under the following assumptions.

A_1:  x* is a unique global solution to (P).

A_2:  the assumptions of Theorem 1 hold.

A_3:  the following stability conditions hold (see [1]):
      (i)  {x∈Rⁿ: g(x) < 0} is nonempty and the closure of this set is equal to S₀, the constraint set in (P);
      (ii) ∃ δ > 0 ∋ S_{δe} is compact, where e is the m-vector of ones.

A_4:  the optimal response function is differentiable at b = 0 (it is not known whether this assumption is redundant to A_1, A_2 and A_3).

The optimal response function can be precisely defined as follows. Let S_b = {x∈Rⁿ: g(x) ≤ b}, where b ∈ Rᵐ, and let B denote the set of b for which S_b is nonempty. The optimal response function f_sup: B → R ∪ {+∞} is defined as

f_sup(b) = sup{f(x): x ∈ S_b}.

THEOREM 2. Under the assumptions A_1 thru A_4,

∂f_sup/∂b_j |_{b=0} = λ*_j,  j = 1,...,m.
PROOF: By Theorem 1, x* is an isolated local maximizer of P(x,k*,β*), where β* = λ*/k* and k* is sufficiently large. Hence ∃ ε > 0 such that

P(x*,k*,β*) > P(x,k*,β*)  ∀x ∋ 0 < ‖x − x*‖ ≤ ε.

Let D = {x∈Rⁿ: ‖x − x*‖ ≤ ε}, and consider the problem

(P_b):  max f(x),  s.t.  g(x) ≤ b,  x ∈ D.

Let f̂_sup(b) = sup{f(x): g(x) ≤ b, x ∈ D}. We first observe that x*, k*, β* satisfy the following three conditions:

(i)   P(x*,k*,β*) ≥ P(x,k*,β*)  ∀x ∈ D;
(ii)  g(x*) ≤ 0;
(iii) β*_i[exp(k* g_i(x*)) − 1] = 0, i = 1,...,m. (By G_3.)

That is, x*, k*, β* solve the extended constrained Lagrangian problem discussed in [2]. It follows from results in the latter paper that the function

z(b,k*,λ*) = Σ_{j=1}^m (λ*_j/k*)[exp(k* b_j) − 1] + f̂_sup(0)

is a support to f̂_sup(·) at (0, f̂_sup(0)), in the sense that z(b,k*,λ*) ≥ f̂_sup(b) ∀b ∈ B. If we can now show that f̂_sup(b) = f_sup(b) for all b sufficiently near zero, it will follow that, for j ∈ {1,...,m},

∂f_sup/∂b_j |_{b=0} = ∂z/∂b_j |_{(0,k*,λ*)} = λ*_j,

and the proof will be complete.

To show the required result, let N_{1/n}(0) denote a family of balls of radius 1/n about the origin in Rᵐ, and suppose that in each of these neighborhoods there is a point b_n such that f̂_sup(b_n) ≠ f_sup(b_n). We note that since b_n → 0, the sets S_{b_n} are eventually compact and nonempty and S_{b_n} → S₀ in the Hausdorff metric. It was shown in [1] that these results are implied by the stability conditions A_3. By the eventual compactness of the sets S_{b_n}, for all n sufficiently large there is an x_n ∈ S_{b_n} such that f(x_n) = f_sup(b_n). Since f̂_sup(b_n) ≠ f_sup(b_n), it follows that none of the elements x_n are in D. Hence a subsequence of x_n converges to some x̃ ∈ S₀ such that ‖x̃ − x*‖ ≥ ε. But since f_sup(b_n) → f_sup(0) = f(x*), we have f(x̃) = f(x*), which contradicts the uniqueness of x*. Thus we have f̂_sup(b) = f_sup(b) for all b in some neighborhood of zero. □
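Theorem 2 can be illustrated on the example of Section V. Perturbing the right-hand sides to x − 2 ≤ b₁ and −x − 1 ≤ b₂ gives, for b near zero, f_sup(b) = max{(2+b₁)², (1+b₂)²} = (2+b₁)², so ∂f_sup/∂b₁ = 4 = λ*₁ and ∂f_sup/∂b₂ = 0 = λ*₂. A finite-difference check in Python (the closed-form f_sup below is specific to this example):

```python
def f_sup(b1, b2):
    """Optimal response for the example: max of x^2 over [-1-b2, 2+b1]."""
    lo, hi = -1.0 - b2, 2.0 + b1
    return max(lo * lo, hi * hi)

# Central differences of f_sup at b = 0.
h = 1e-6
d_b1 = (f_sup(h, 0.0) - f_sup(-h, 0.0)) / (2 * h)
d_b2 = (f_sup(0.0, h) - f_sup(0.0, -h)) / (2 * h)
print(d_b1, d_b2)   # ~4 and ~0: the optimal dual variables (4, 0)
```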
REFERENCES

[1] J. P. Evans and F. J. Gould, "Stability in nonlinear programming," Opns. Res., 18 (1970), pp. 107-118.

[2] F. J. Gould and Jon W. Tolle, "A necessary and sufficient constraint qualification for constrained optimization," SIAM J. Appl. Math., to appear.

[3] F. J. Gould, "Extensions of Lagrange multipliers in nonlinear programming," SIAM J. Appl. Math., 17 (1969), pp. 1280-1297.

[4] A. V. Fiacco and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, John Wiley and Sons, New York, 1968.

[5] R. Fletcher, "A class of methods for non-linear programming with termination and convergence properties," in Integer and Nonlinear Programming, J. Abadie, ed., North-Holland Publishing Company, Amsterdam, 1970.

[6] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, New Jersey, 1963.

[7] G. L. Nemhauser and W. B. Widhelm, "A modified linear program for columnar methods in mathematical programming," Opns. Res., to appear.