Computational Optimization and Applications, 10, 51–77 (1998)
© 1998 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

On the Existence and Convergence of the Central Path for Convex Programming and Some Duality Results

RENATO D.C. MONTEIRO    [email protected]
FANGJUN ZHOU
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332

Received December 18, 1995; Revised January 25, 1997; Accepted February 5, 1997
Abstract. This paper gives several equivalent conditions which guarantee the existence of the weighted central
paths for a given convex programming problem satisfying some mild conditions. When the objective and constraint
functions of the problem are analytic, we also characterize the limiting behavior of these paths as they approach
the set of optimal solutions. A duality relationship between a certain pair of logarithmic barrier problems is also
discussed.
Keywords: convex program, central path, duality, Wolfe dual, Lagrangean dual
1. Introduction
The purpose of this work is to provide several conditions which guarantee the existence of a
family of weighted central paths associated with a given convex programming problem and
to study the limiting behavior of these paths as they approach the set of optimal solutions
of the problem.
There are several papers in the literature which study these issues in the context of linear
and convex quadratic programs. These include Adler and Monteiro [1], Güler [4], Kojima,
Mizuno and Noma [7], Megiddo [9], Megiddo and Shub [10], Monteiro [11], Monteiro
and Tsuchiya [13], Witzgall, Boggs and Domich [15]. For linear and convex quadratic
programs, the properties of the weighted central paths are quite well understood under very
mild assumptions, namely (a) existence of an interior feasible solution and, (b) boundedness
of the set of optimal solutions. The paper by Güler et al. [5] gives a simplified and elegant
treatment in the context of linear programs of several results treated in the aforementioned
papers.
Conditions which guarantee the existence of the weighted central paths have been given
by Kojima, Mizuno and Noma [7] and McLinden [8] for special classes of convex programs.
Namely, [7] deals with the monotone nonlinear complementarity problem which is known
to include certain types of convex programs as special cases and [8] deals with the pair of
dual convex programs
inf{h(x) | x ≥ 0},    inf{h^*(ξ) | ξ ≥ 0},    (1)

where h(·) is an extended proper convex function and h^*(·) is the conjugate function of
h(·). We develop corresponding results for the general convex program (P) described
below. Major differences between problem (P) and the one considered by McLinden are:
(i) the feasible region of problem (1) is contained in the nonnegative orthant while problem
(P) is not required to satisfy this condition; (ii) the objective and constraint functions of
problem (P) assume only real values while the objective function h(·) of problem (1) can
assume the value +∞. It might be possible to extend some of our results to the more general
setting of extended convex functions but we made no attempt in this direction since such
extension would needlessly complicate our notation and development.
McLinden in his remarkable paper [8] gives some special results about the limiting behavior of the weighted central paths with respect to the pair of convex programs (1). Specifically,
he analyzes the convergence behavior of these paths assuming, in addition to conditions (a)
and (b) above, the existence of a pair of primal and dual optimal solutions satisfying strict
complementarity.
In this paper (Section 5), we provide convergence results for the weighted central paths
assuming that the objective function and the constraint functions are analytic. As opposed
to McLinden [8], we do not assume the existence of any pair of primal and dual optimal
solutions satisfying strict complementarity.
Throughout this paper R^l denotes the l-dimensional Euclidean space; also, R^l_+ and R^l_{++} denote the nonnegative and the positive orthants of R^l, respectively. Given convex functions f : R^n → R and g_j : R^n → R, j = 1, . . . , p, throughout this paper we consider the following convex program

(P)    inf f(x)
       s.t. x ∈ P ≡ {x ∈ R^n | Ax = b, g(x) ≤ 0},

where A ∈ R^{m×n}, b ∈ R^m and g : R^n → R^p denotes the function defined for every x ∈ R^n by g(x) = (g_1(x), . . . , g_p(x))^T.
Given a fixed weight vector w ∈ R^p_{++}, the w-central path of (P) arises by considering the following parametrized family of logarithmic barrier problems

(P(t))    inf{ f(x) − t Σ_{j=1}^p w_j log|g_j(x)| | x ∈ P^0 },    (2)

where t ∈ R_{++} is the parameter of the family and

P^0 ≡ {x ∈ R^n | Ax = b, g(x) < 0}

is the set of interior feasible solutions of (P). If each problem (P(t)) has exactly one solution x(t) then the path t > 0 ↦ x(t) ∈ P^0 is called the w-central path associated with (P). Conditions for the existence of this path are therefore conditions which guarantee that (P(t)) has exactly one solution for every t > 0.
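To make the construction concrete, the following small sketch (with toy data chosen purely for illustration, not taken from the paper) traces x(t) numerically for a two-variable instance of (P) by solving the barrier problem (2) for a decreasing sequence of values of t; scipy is used here only as a convenient off-the-shelf smooth solver.

```python
# A minimal numerical sketch (illustrative data, not from the paper): trace the
# w-central path of the instance
#     min x1 + 2*x2   s.t.  x1 + x2 = 1,  g(x) = -x <= 0,
# by minimizing the log-barrier objective of (P(t)) for decreasing t.
import numpy as np
from scipy.optimize import minimize

w = np.array([1.0, 1.0])                        # weight vector w > 0
A, b = np.array([[1.0, 1.0]]), np.array([1.0])

def f(x):
    return x[0] + 2.0 * x[1]

def g(x):                                       # constraint function, g(x) <= 0 on P
    return -x

def barrier(x, t):                              # f(x) - t * sum_j w_j * log|g_j(x)|
    return f(x) - t * np.sum(w * np.log(np.abs(g(x))))

eq = {"type": "eq", "fun": lambda x: A @ x - b}
bounds = [(1e-9, None)] * 2                     # keep iterates in the interior g(x) < 0
x0 = np.array([0.5, 0.5])                       # interior feasible starting point

for t in [1.0, 0.1, 0.01, 0.001]:
    x0 = minimize(barrier, x0, args=(t,), constraints=[eq],
                  bounds=bounds, method="SLSQP").x
    print(f"t = {t:6.3f}   x(t) = {x0}")
# As t decreases, x(t) approaches (1, 0), the optimal solution of this instance.
```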
Our paper is organized as follows. In Section 2 we introduce some notation and terminology that are used throughout the paper. We also introduce some mild assumptions that
are frequently used in our results and discuss several situations in which these assumptions
are satisfied. It is hoped that this discussion will convince the reader of the mildness of our
assumptions.
In Section 3, several equivalent conditions which guarantee the existence of at least one
solution of problem (P(t)) are presented (see Theorem 1 and Theorem 2). A duality theory
between a certain pair of logarithmic barrier problems (that is, problems (4) and (5)) is also
developed in this section. A similar duality theory has been developed by Megiddo [9] for
the same pair of logarithmic barrier problems in the context of linear programs.
In Section 4, we show that the several conditions developed in Section 3 are equivalent to
conditions requiring boundedness of the optimal solution set of problem (P) and/or its dual
problem (see Theorem 3). Both the Wolfe dual and the Lagrangean dual are considered in
our analysis.
Results that guarantee the uniqueness of the solution of problem (P(t)) are stated in Section 5 and require analyticity of the functions f(·) and g_j(·), j = 1, . . . , p. We observe that while most of the results of Section 3 and Section 4 do not require any analyticity condition, Section 5 deals exclusively with convex programs whose objective and constraint functions are analytic. In Section 5 the analyticity assumption plays a major role in the study of the limiting behavior of the w-central path t ↦ x(t) as t → 0. Convergence of certain dual weighted central paths is also proved in Section 5 under the assumption that all constraint functions are affine.
The following notation is used throughout the paper. If x ∈ R^l and B ⊂ {1, . . . , l} then x_B denotes the subvector (x_i)_{i∈B}; if h(·) is a function taking values in R^l then h_B(y) denotes the subvector [h(y)]_B for every y in the domain of h(·). The vector (1, · · · , 1)^T, regardless of its dimension, is denoted by 1. For any vector x, the notation |x| is used for the vector whose i-th component is |x_i|. Throughout, we use terminology and facts from finite-dimensional convex analysis as presented by Rockafellar [14]. In particular, the symbols ^*, ∂ and 0^+ applied to a function signify the conjugate function, the subdifferential mapping, and the recession function, respectively. Also, the symbol 0^+ applied to a convex set signifies the recession cone of the set. We denote the number of elements of a finite set J by |J|.
2. Notation and Assumptions
In this section we introduce some notation and terminology that are used throughout the
paper. We also introduce some assumptions that are frequently used in our results and also
discuss several situations in which these assumptions are satisfied; it is our hope that this
discussion will show that these assumptions are mild ones.
The problem we consider in this paper is the convex program (P) stated in Section 1.
The Lagrangean function L : R^n × R^m × R^p → R associated with (P) is defined for every (x, y, s) ∈ R^n × R^m × R^p by

L(x, y, s) ≡ f(x) + (b − Ax)^T y + s^T g(x),

and the Lagrangean dual problem associated with (P) is

(D_L)    sup L(y, s)
         s.t. (y, s) ∈ D_L ≡ {(y, s) | s ≥ 0, L(y, s) > −∞},

where L : R^m × R^p → [−∞, ∞) is the dual function defined by

L(y, s) ≡ inf{L(x, y, s) | x ∈ R^n},    ∀(y, s) ∈ R^m × R^p.    (3)
It is well known that L is a concave function since it is the pointwise infimum of the affine (and hence concave) functions L(x, ·, ·), x ∈ R^n. Another dual problem associated with (P) is the Wolfe dual given by

(D_W)    sup L(y, s)
         s.t. (y, s) ∈ D_W ≡ {(y, s) ∈ D_L | L(y, s) = L(x, y, s) for some x ∈ R^n}.

Hence D_W is the set of all points (y, s) in D_L for which the infimum in (3) is achieved. It is well known that the set of points (y, s) for which the infimum in (3) is achieved is given by {(y, s) | 0 ∈ ∂L(x, y, s) for some x ∈ R^n}.
We denote the set of optimal solutions of a program (·) as opt(·) and its value as val(·).
So, for instance, opt(P) denotes the set of optimal solutions of (P) and val(P) denotes its
value. By definition, the value of a program is the infimum (or the supremum) of the set of
all values that the objective function can assume over the set of all feasible solutions of the
problem. By convention, we assume that the infimum (supremum) of the empty set is equal
to +∞ (−∞) and that the infimum (supremum) of a set unbounded from below (above) is
equal to −∞ (+∞). Let
D_L^0 ≡ {(y, s) ∈ D_L | s > 0},    D_W^0 ≡ {(y, s) ∈ D_W | s > 0}.

We refer to D_L^0 and D_W^0 as the sets of interior feasible solutions of problems (D_L) and (D_W), respectively. The following logarithmic barrier problems play a fundamental role in our presentation. Given w ∈ R^p_{++}, consider the problems
(P^w)     inf{ p_w(x) ≡ f(x) − Σ_{j=1}^p w_j log|g_j(x)| | x ∈ P^0 },    (4)

(D_L^w)   sup{ d_w(y, s) ≡ L(y, s) + Σ_{j=1}^p w_j log s_j | (y, s) ∈ D_L^0 },    (5)

(D_W^w)   sup{ d_w(y, s) | (y, s) ∈ D_W^0 }.

An equivalent way to formulate problem (D_W^w) is as the problem

sup{ L(x, y, s) + Σ_{j=1}^p w_j log s_j | s > 0, 0 ∈ ∂L(x, y, s) },

which, for differentiable functions f(·) and g(·), takes the form

sup{ L(x, y, s) + Σ_{j=1}^p w_j log s_j | s > 0, ∇f(x) − A^T y + ∇g(x)s = 0 }.    (6)
Here we are adopting the convention that the gradient of a scalar function is a column vector and ∇g(x) = [∇g_1(x) · · · ∇g_p(x)]. We also let

PD^0 = {(x, y, s) : x ∈ P^0, s > 0, L(x, y, s) = L(y, s)},

and, for w ∈ R^p_{++}, we define the set

S_w ≡ {(x, y, s) ∈ PD^0 | −g(x) ◦ s = w},
where the notation ◦ denotes the Hadamard product of two vectors, that is, if u and v are two vectors then u ◦ v denotes the vector whose i-th component is equal to u_i v_i for every i. When f(·) and g(·) are differentiable, the points in S_w satisfy the following “centering” conditions:

Ax = b, g(x) < 0,
∇f(x) + ∇g(x)s − A^T y = 0, s > 0,
−g(x) ◦ s = w.
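As a numerical illustration (again with assumed toy data rather than anything from the paper), one can solve (P^w) for a small linear instance and check that the computed point, together with the multipliers s_j = w_j/|g_j(x)| and a least-squares choice of y, satisfies the centering conditions above up to solver accuracy.

```python
# A small self-contained check (illustrative data, not the paper's): solve (P^w)
# for  min x1 + 2*x2,  x1 + x2 = 1,  g(x) = -x <= 0,  then verify the centering
# conditions at the computed point.
import numpy as np
from scipy.optimize import minimize

c = np.array([1.0, 2.0])                      # f(x) = c^T x, so grad f(x) = c
A, b = np.array([[1.0, 1.0]]), np.array([1.0])
w = np.array([2.0, 1.0])                      # fixed weight vector w > 0
grad_g = -np.eye(2)                           # g(x) = -x, columns are grad g_j(x)

def p_w(x):                                   # p_w(x) = f(x) - sum_j w_j log|g_j(x)|
    return c @ x - np.sum(w * np.log(x))

eq = {"type": "eq", "fun": lambda x: A @ x - b}
x = minimize(p_w, np.array([0.5, 0.5]), constraints=[eq],
             bounds=[(1e-9, None)] * 2, method="SLSQP").x

s = w / x                                     # s_j = w_j/|g_j(x)| = w_j/x_j > 0
y = np.linalg.lstsq(A.T, c + grad_g @ s, rcond=None)[0]

print("Ax - b                   :", A @ x - b)
print("grad f + grad g s - A^T y:", c + grad_g @ s - A.T @ y)
print("-g(x) o s - w            :", x * s - w)   # all residuals should be ~ 0
```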
We next introduce some assumptions which are frequently used in our presentation and
subsequently we make some comments about the assumptions. Consider the following two
assumptions:
Assumptions:
(A) rank(A) = m;
(B) For any α > 0 and β ∈ R, the set {x ∈ P | g(x) ≥ −α1, f(x) ≤ β} is bounded.
Assumption (A) is quite standard and considerably simplifies our development. We next
discuss Assumption (B). First note that Assumption (B) obviously holds when P = ∅, that
is, when problem (P) is infeasible. Assumption (B) also holds when opt(P) is nonempty
and bounded. Indeed, it is easy to see that opt(P) is nonempty and bounded if and only if
there exists a constant β such that the set {x ∈ P | f (x) ≤ β} is bounded. It follows from
Lemma 1 below that a necessary and sufficient condition for opt(P) to be nonempty and
bounded is that the set {x ∈ P | f(x) ≤ β} be bounded for any β ∈ R. Since the set defined
in Assumption (B) is a subset of {x ∈ P | f (x) ≤ β}, we conclude that Assumption (B)
holds when opt(P) is nonempty and bounded.
A proof of the following well known result can be found for example in Fiacco and
McCormick [2], page 93.
Lemma 1 Assume that h_j : R^n → R, j = 1, . . . , l, are convex functions and that, for some scalars α_1, . . . , α_l, the set {x ∈ R^n | h_j(x) ≤ α_j, j = 1, . . . , l} is nonempty and bounded. Then, for any given scalars β_1, . . . , β_l, the set {x ∈ R^n | h_j(x) ≤ β_j, j = 1, . . . , l} is bounded (and possibly empty).
We should note that Assumption (B) does not imply that opt(P) is nonempty or that
opt(P) is bounded when it is nonempty. Indeed, the problem inf{x | x ≤ 0} satisfies
Assumption (B) but it has no optimal solution. Moreover, the problem inf{−x1 | x1 ≤
0, x2 ≤ 0} satisfies Assumption (B) and has a nonempty optimal solution set which is
unbounded.
An example of a problem which does not satisfy Assumption (B) is
inf{f(x) ≡ 0 | e^x − 1 ≤ 0}.
The next result gives a sufficient condition for Assumption (B) to hold when each constraint function g_j(·), j = 1, . . . , p, is affine.

Lemma 2 Assume that g(x) = Cx − h, where C is a p × n matrix and h ∈ R^p, and that P = {x ∈ R^n | Ax = b, Cx ≤ h} is nonempty. Then a sufficient condition for Assumption (B) to hold is that P be a pointed polyhedron (that is, P has a vertex). In addition, if f(·) is an affine function then this condition is also necessary.
Proof. Assume that P is pointed and let α > 0 and β ∈ R be given. Note that P is pointed if and only if the lineality space of P is equal to {0}, that is, {d ∈ R^n | Ad = 0, Cd = 0} = {0}. Hence the set

{x ∈ P | g(x) = Cx − h ≥ −α1} = {x | Ax = b, h − α1 ≤ Cx ≤ h}    (7)

is bounded since its recession cone is equal to {d ∈ R^n | Ad = 0, Cd = 0} = {0}. Thus the set defined in Assumption (B) is also bounded since it is a subset of the set in (7). We have thus shown that Assumption (B) holds.
To show the second part of the lemma, assume that f(·) is an affine function, say, f(x) = c^T x + γ where c ∈ R^n and γ ∈ R. The set of Assumption (B) is bounded if and only if its recession cone {d | Ad = 0, Cd = 0, c^T d ≤ 0} is equal to {0}. However, it is easy to see that {d | Ad = 0, Cd = 0, c^T d ≤ 0} = {0} if and only if {d | Ad = 0, Cd = 0} = {0}, that is, if and only if P is pointed.
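Computationally, the pointedness test in Lemma 2 is a rank condition: the lineality space {d | Ad = 0, Cd = 0} reduces to {0} exactly when the stacked matrix [A; C] has full column rank n. A minimal sketch with made-up data:

```python
# A quick illustration (made-up data) of the pointedness test behind Lemma 2:
# P = {x | Ax = b, Cx <= h} is pointed exactly when {d | Ad = 0, Cd = 0} = {0},
# i.e. when the stacked matrix [A; C] has rank n.
import numpy as np

def is_pointed(A, C):
    n = A.shape[1]
    return np.linalg.matrix_rank(np.vstack([A, C])) == n

A = np.array([[1.0, 1.0, 0.0]])                      # one equality in R^3
C = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])     # inequalities that ignore x3
print(is_pointed(A, C))    # False: d = (0, 0, 1) is a nonzero lineality direction

C2 = np.vstack([C, [[0.0, 0.0, 1.0]]])               # add a constraint involving x3
print(is_pointed(A, C2))   # True: [A; C2] has full column rank 3
```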
3. Conditions for the Existence of the Central Path
In this section we give several equivalent conditions which ensure that problem (P(t)), t > 0, defined in (2) has at least one solution. With this goal in mind, we will consider the more general question of existence of a solution of the convex program (P^w) for an arbitrary w ∈ R^p_{++}. We also discuss the duality relationship that exists between the pair of problems (P^w) and (D_L^w) (or (D_W^w)). (See Megiddo [9] for a discussion of this duality relationship in the context of linear programs.)
We start by stating one of the main results of this section. Theorem 2, which is the other
main result of this section, complements Theorem 1.
Theorem 1 Suppose that both Assumption (A) and Assumption (B) hold. Then the following statements are all equivalent:

(a) P^0 ≠ ∅ and D_W^0 ≠ ∅;
(b) P^0 ≠ ∅ and D_L^0 ≠ ∅;
(c) opt(P^w) ≠ ∅;
(d) S_w ≠ ∅;
(e) PD^0 ≠ ∅;
(f) opt(D_L^w) ≠ ∅.

Moreover, any of the above statements implies:

(1) opt(D_L^w) = opt(D_W^w) and the set {s | (y, s) ∈ opt(D_L^w)} is a singleton; if in addition Assumption (A) holds and the functions f(·) and g(·) are differentiable then opt(D_L^w) is also a singleton;
(2) S_w = {(x, y, s) | x ∈ opt(P^w), (y, s) ∈ opt(D_L^w)};
(3) for any fixed (ȳ, s̄) ∈ opt(D_L^w),
    opt(P^w) = {x | x ∈ P, |g(x)| ◦ s̄ = w} ∩ Argmin{L(x, ȳ, s̄) | x ∈ R^n};
(4) val(P^w) − val(D_L^w) = Σ_{j=1}^p w_j(1 − log w_j) and val(D_L^w) = val(D_W^w).
The proof of Theorem 1 will be given only after we prove several preliminary results,
some of which are interesting in their own right. The first two lemmas are well known and
are stated without proofs.
Lemma 3 Assume that X is a metric space and that h : X → R ∪ {∞} is a proper lower semi-continuous function. Let E be a nonempty subset of X. If there exists a point x_0 ∈ E such that h(x_0) < ∞ and the set {x ∈ E | h(x) ≤ h(x_0)} is compact then the set of minimizers of the problem inf{h(x) | x ∈ E} is nonempty and compact (and hence bounded).

Lemma 4 Let α and β be given positive scalars and consider the function h : (0, ∞) → R defined by h(t) = αt − β log t for all t ∈ (0, ∞). Then,

(a) h is strictly convex;
(b) h(t) ≥ β[1 − log(β/α)] for all t > 0 with equality holding if and only if t = β/α;
(c) h(t) → ∞ as t → 0 or t → ∞.
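For the reader's convenience, part (b) can be verified directly: h is differentiable on (0, ∞) with h′(t) = α − β/t, so its unique stationary point is t = β/α, where h(β/α) = β − β log(β/α) = β[1 − log(β/α)]; together with (a) and (c) this identifies the stated value as the global minimum of h, attained only at t = β/α.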
The following simple lemma is invoked more than once in our development.
Lemma 5 Suppose that D_L^0 ≠ ∅. Then there exist constants τ_0 ∈ R and τ_1 > 0 such that

f(x) ≥ τ_0 + τ_1 Σ_{j=1}^p |g_j(x)|,    ∀x ∈ P.    (8)

Proof. Let (y^0, s^0) be a fixed point in D_L^0. By the definition of D_L^0, there exists τ_0 ∈ R such that

L(x, y^0, s^0) ≥ τ_0,    ∀x ∈ R^n.    (9)

Let τ_1 ≡ min{s_j^0 | j = 1, . . . , p} > 0. Rearranging (9), we obtain that for every x ∈ P,

f(x) ≥ τ_0 − (s^0)^T g(x) − (y^0)^T(b − Ax) = τ_0 + (s^0)^T|g(x)| ≥ τ_0 + τ_1 Σ_{j=1}^p |g_j(x)|.
The next result shows that the existence of interior feasible solutions for both (P) and (D_L) implies that (P^w) has an optimal solution.

Proposition 1 Suppose that Assumption (B) holds and let w ∈ R^p_{++} be given. If P^0 ≠ ∅ and D_L^0 ≠ ∅ then opt(P^w) ≠ ∅.
Proof. Take a point x^0 ∈ P^0. In view of Lemma 3, the result follows once we show that the set Ω_0 ≡ {x ∈ P^0 | p_w(x) ≤ p_w(x^0)} is compact, where p_w(·) is defined in (4). Indeed, since D_L^0 ≠ ∅, Lemma 5 implies that there exist constants τ_0 ∈ R and τ_1 > 0 such that (8) holds. Hence, we have

Σ_{j=1}^p (τ_1|g_j(x)| − w_j log|g_j(x)|) ≤ f(x) − τ_0 − Σ_{j=1}^p w_j log|g_j(x)| = p_w(x) − τ_0 ≤ p_w(x^0) − τ_0,    ∀x ∈ Ω_0.

Using Lemma 4 and the fact that τ_1 > 0 and w > 0, it is easy to verify that the above relation implies the existence of a constant ε > 0 such that

ε ≤ |g_j(x)| ≤ ε^{−1},    ∀x ∈ Ω_0 and ∀j = 1, . . . , p.    (10)

Relation (10) and the fact that p_w(x) is bounded above on Ω_0 imply that f(x) is also bounded above on Ω_0. In view of Assumption (B), it follows that Ω_0 is bounded. Relation (10) also implies that

Ω_0 ≡ {x ∈ P^ε | p_w(x) ≤ p_w(x^0)},

where P^ε ≡ {x ∈ R^n | Ax = b, g(x) ≤ −ε1}. Since P^ε is a closed set, it follows that Ω_0 is also a closed set. We have thus proved that Ω_0 is a compact set.
The set opt(P^w) is not necessarily a singleton. However, we have the following result whose proof is left to the reader.

Lemma 6 If opt(P^w) ≠ ∅ then the set {(f(x), g(x)) | x ∈ opt(P^w)} is a singleton, that is, there exist f̄ ∈ R and ḡ ∈ R^p such that f(x) = f̄ and g(x) = ḡ for every x ∈ opt(P^w).

We now turn our efforts to obtaining conditions which ensure that opt(D_L^w) ≠ ∅. We start with the following preliminary result.
Lemma 7 Suppose that Assumption (A) holds and that P^0 ≠ ∅. Then the following statements hold:

(a) there exist constants γ_0 ∈ R and γ_1 > 0 such that

L(y, s) ≤ γ_0 − γ_1 Σ_{j=1}^p s_j,    ∀(y, s) ∈ R^m × R^p_+;    (11)

(b) for any constant γ ∈ R, the set Ω_γ ≡ {(y, s) ∈ R^m × R^p | L(y, s) ≥ γ, s ≥ 0} is compact (possibly empty).

Proof. We first show (a). Since P^0 ≠ ∅, let x^0 be a fixed point in P^0. Defining γ_0 ≡ f(x^0) and γ_1 ≡ min_{j=1,...,p}{|g_j(x^0)|} and using the definition of L and the fact that Ax^0 = b and g(x^0) < 0, we obtain

L(y, s) ≤ L(x^0, y, s) = f(x^0) + y^T(b − Ax^0) + s^T g(x^0) = f(x^0) − s^T|g(x^0)| ≤ γ_0 − γ_1 Σ_{j=1}^p s_j,    ∀(y, s) ∈ R^m × R^p_+.

We now show (b). The result is trivial when Ω_γ = ∅, and hence we assume from now on that Ω_γ ≠ ∅ and that (ȳ, s̄) is a fixed point in Ω_γ. Since the function (y, s) ∈ R^m × R^p ↦ −L(y, s) ∈ (−∞, +∞] is lower semi-continuous and convex, it follows that {(y, s) | L(y, s) ≥ γ} is a closed convex set. Hence, Ω_γ is also a closed convex set. To show that Ω_γ is bounded, it is sufficient to prove that 0^+Ω_γ = {0}. Indeed, let (∆y, ∆s) be an arbitrary vector in 0^+Ω_γ. We will show that (∆y, ∆s) = 0. By the definition of 0^+Ω_γ, we have that (ȳ, s̄) + λ(∆y, ∆s) ∈ Ω_γ for all λ ≥ 0. More specifically, we have:

s̄ + λ∆s ≥ 0, ∀λ ≥ 0,

and

γ ≤ L(ȳ + λ∆y, s̄ + λ∆s) ≤ γ_0 − γ_1 Σ_{j=1}^p (s̄_j + λ∆s_j), ∀λ ≥ 0,    (12)

where the last inequality is due to (11). Since ∆s ≥ 0, relation (12) holds only if ∆s = 0. Next, assume for contradiction that ∆y ≠ 0. Then, using the fact that rank(A) = m, it is easy to show the existence of a point x̄ ∈ R^n such that ∆y^T(b − Ax̄) < 0. Using the definition of L, the fact that ∆s = 0 and relation (12), we obtain

γ ≤ L(ȳ + λ∆y, s̄ + λ∆s) ≤ L(x̄, ȳ + λ∆y, s̄ + λ∆s) = L(x̄, ȳ, s̄) + λ(∆y)^T(b − Ax̄),    ∀λ ≥ 0.

But this relation holds only if ∆y^T(b − Ax̄) ≥ 0, contradicting the fact that ∆y^T(b − Ax̄) < 0. Hence, we must have ∆y = 0 and the result follows.
The following well known result is used in the proof of the next proposition. Its proof
can be found for example in Rockafellar [14], theorem 21.2.
Lemma 8 A necessary and sufficient condition for P^0 to be empty is that there exists a point (ỹ, s̃) ∈ R^m × R^p such that 0 ≠ s̃ ≥ 0 and

ỹ^T(b − Ax) + s̃^T g(x) ≥ 0,    ∀x ∈ R^n.    (13)
Proposition 2 Suppose that Assumption (A) holds and that D_L^0 ≠ ∅. Let w ∈ R^p_{++} be given. Then the following implications hold:

(a) if P^0 = ∅ then (D_L^w) is unbounded;
(b) if P^0 ≠ ∅ then opt(D_L^w) ≠ ∅.

Proof. We first show implication (a). Assume that P^0 = ∅ and let (ȳ, s̄) be a fixed point in D_L^0. Lemma 8 and the fact that P^0 = ∅ imply the existence of a point (ỹ, s̃) ∈ R^m × R^p satisfying relation (13) and 0 ≠ s̃ ≥ 0. We next show that (ȳ(λ), s̄(λ)) ≡ (ȳ, s̄) + λ(ỹ, s̃) ∈ D_L^0 for all λ ≥ 0. Indeed, the fact that s̄ > 0 and s̃ ≥ 0 implies that s̄(λ) > 0 for all λ ≥ 0. Moreover, using relation (13) and the definition of L, we obtain

L(ȳ(λ), s̄(λ)) = inf_x { f(x) + ȳ(λ)^T(b − Ax) + s̄(λ)^T g(x) }
             ≥ inf_x { f(x) + ȳ^T(b − Ax) + s̄^T g(x) } + λ inf_x { ỹ^T(b − Ax) + s̃^T g(x) }
             = L(ȳ, s̄) + λ inf_x { ỹ^T(b − Ax) + s̃^T g(x) }
             ≥ L(ȳ, s̄),    ∀λ ≥ 0.    (14)

Hence, (ȳ(λ), s̄(λ)) ∈ D_L^0 for all λ ≥ 0. Moreover, using relation (14) and the fact that 0 ≠ s̃ ≥ 0, we can easily verify that d_w(ȳ(λ), s̄(λ)) ≡ L(ȳ(λ), s̄(λ)) + Σ_{j=1}^p w_j log s̄_j(λ) → ∞ as λ → ∞. Hence, problem (D_L^w) is unbounded and implication (a) follows.
We next show implication (b). Assume that P^0 ≠ ∅. First observe that opt(D_L^w) is exactly the set of minimizers of the problem inf{−d_w(y, s) | (y, s) ∈ R^m × R^p_{++}}, where d_w(·, ·) is defined in (5). In view of Lemma 3, it is sufficient to show that the set

Γ ≡ {(y, s) | −d_w(y, s) ≤ τ, s > 0}

is compact, where τ ≡ −d_w(ȳ, s̄) < ∞ and (ȳ, s̄) is the point considered in the proof of (a). Indeed, since P^0 ≠ ∅ and Assumption (A) holds, it follows that the assumptions of Lemma 7 are satisfied. Hence, by Lemma 7(a), there exist constants γ_0 ∈ R and γ_1 > 0 such that (11) holds. Thus, if (y, s) ∈ Γ we have

−d_w(ȳ, s̄) + γ_0 ≥ −d_w(y, s) + γ_0 = −L(y, s) + γ_0 − Σ_{j=1}^p w_j log s_j ≥ γ_1 Σ_{j=1}^p s_j − Σ_{j=1}^p w_j log s_j = Σ_{j=1}^p {γ_1 s_j − w_j log s_j}.

Using Lemma 4 and the above relation, it is easy to show the existence of a constant δ > 0 such that

δ1 ≤ s ≤ δ^{−1}1,    ∀(y, s) ∈ Γ.    (15)

Relation (15) and the definition of Γ then imply

Γ ⊆ {(y, s) | s ≥ 0, L(y, s) ≥ −τ − Σ_{j=1}^p w_j log δ^{−1}},

which, in view of Lemma 7(b), yields the conclusion that Γ is bounded. Also, (15) implies that

Γ = {(y, s) | −d_w(y, s) ≤ τ, s ≥ δ1},

which in turn implies that Γ is a closed set. Hence, Γ is a compact set and implication (b) follows.
As an immediate consequence of the previous result, we obtain the following corollary.
Corollary 1 Suppose that Assumption (A) holds. Then opt(D_L^w) ≠ ∅ if and only if P^0 ≠ ∅ and D_L^0 ≠ ∅.
Note that Proposition 2 is a result about the Lagrangean dual. It is natural to ask if a similar result holds with respect to the Wolfe dual, that is, whether the two implications:

(a′) if P^0 = ∅ then (D_W^w) is unbounded;
(b′) if P^0 ≠ ∅ then opt(D_W^w) ≠ ∅,

hold under Assumption (A) and the assumption that D_W^0 ≠ ∅. It turns out that neither of the two implications holds, as the following example illustrates.
Example: Consider the convex set C = {(x_1, x_2) ∈ R^2 | x_2 ≥ 1/x_1, x_1 > 0} and the functions f, g : R^2 → R defined by f(x) = −2x_2 + dist(x, C) and g(x) = x_2^2 + 2x_2 + 1 − δ for all x ∈ R^2, where δ is a nonnegative constant. Clearly, both f(·) and g(·) are convex functions. It is easy to verify that the dual function L restricted to R_+ is given by

L(s) ≡ inf_x L(x, s) =
    −∞                       if s = 0,
    −δs + 2 − s^{−1}         if 0 < s < 1,
    −δs + s                  if 1 ≤ s ≤ 3/2,
    −δs + 3 − (9/4)s^{−1}    if 3/2 ≤ s,

and that the infimum is achieved only when 0 < s < 1. Hence, we have D_L = (0, ∞) and D_W = (0, 1). If δ = 0, it is easy to verify that P^0 = ∅ and that problem (D_W^w) is bounded above with opt(D_W^w) = ∅. This case shows that implication (a′) does not hold. Now, if δ > 0 then P^0 ≠ ∅ and, in addition, if δ ≤ 1 then problem (D_W^w) is also bounded above with opt(D_W^w) = ∅. This last case shows that implication (b′) does not hold.
We can also ask whether the equivalent version of Corollary 1 in terms of the Wolfe dual, that is the one obtained by replacing the subscript L by W in its statement, holds. Clearly, Example 3 with δ > 0 is a counterexample for the “if” part of this modified version of Corollary 1. The following example provides a counterexample for the “only if” part.
Example: Consider the convex set C as in Example 3 and the functions f, g : R^2 → R defined by f(x) = x_1 − x_2 and g(x) = |x_2| + dist(x, C) for all x ∈ R^2. It is easy to verify that the dual function L(s) for every s ∈ R_+ is given by

L(s) ≡ inf_x L(x, s) =
    −∞    if 0 ≤ s < 1,
    0     if 1 ≤ s,

and that the infimum is achieved and finite only when s = 1. Hence, we have D_L = [1, ∞) and D_W = {1}. Moreover, we can easily verify that P = ∅. Clearly, we have opt(D_W^w) = {1} ≠ ∅ but P^0 = ∅, and hence the “only if” part of Corollary 1 does not hold in the context of the Wolfe dual. Note that this example satisfies Assumption (A) since m = 0 and Assumption (B) since P = ∅.
Note that Theorem 1 guarantees that implication (b′) holds if, in addition to assuming D_W^0 ≠ ∅ and Assumption (A), we further impose Assumption (B). Note also that Example 3 with δ > 0 does not satisfy Assumption (B).
We next describe the duality relationship that exists between problems (P^w) and (D_L^w).
Lemma 9 If x ∈ P^0 and (y, s) ∈ D_L^0 then

p_w(x) − d_w(y, s) ≥ Σ_{j=1}^p w_j(1 − log w_j),    (16)

where p_w(·) and d_w(·, ·) are defined in (4) and (5). Moreover, equality holds in (16) if and only if (x, y, s) ∈ S_w.

Proof. Let x ∈ P^0 and (y, s) ∈ D_L^0 be given. Using the fact that Ax = b, g(x) < 0 and L(x, y, s) ≥ L(y, s), we obtain

p_w(x) − d_w(y, s) = f(x) − Σ_{j=1}^p w_j log|g_j(x)| − L(y, s) − Σ_{j=1}^p w_j log s_j
                  ≥ f(x) − L(x, y, s) − Σ_{j=1}^p w_j log(s_j|g_j(x)|)
                  = y^T(Ax − b) − s^T g(x) − Σ_{j=1}^p w_j log(s_j|g_j(x)|)
                  = Σ_{j=1}^p (s_j|g_j(x)| − w_j log s_j|g_j(x)|)
                  ≥ Σ_{j=1}^p w_j(1 − log w_j),    (17)

where the last inequality is due to Lemma 4(b). This shows (16). Using Lemma 4(b) again and expression (17), it is easy to see that equality holds in (16) if and only if

L(x, y, s) = L(y, s),    −s ◦ g(x) = w.    (18)

By definition of S_w, we immediately conclude that (18) holds if and only if (x, y, s) ∈ S_w.
Lemma 10 Let w ∈ R^p_{++} be given and assume that S_w ≠ ∅. Then, (x̄, ȳ, s̄) ∈ S_w if and only if x̄ ∈ opt(P^w) and (ȳ, s̄) ∈ opt(D_L^w), in which case (ȳ, s̄) ∈ opt(D_W^w).

Proof. We first show the “only if” part of the equivalence and the fact that (x̄, ȳ, s̄) ∈ S_w implies (ȳ, s̄) ∈ opt(D_W^w). Indeed, assume that (x̄, ȳ, s̄) ∈ S_w. By the definition of S_w, we have x̄ ∈ P^0 and (ȳ, s̄) ∈ D_W^0 ⊆ D_L^0, and by the “if and only if” statement of Lemma 9, we conclude that

p_w(x̄) − d_w(ȳ, s̄) = Σ_{j=1}^p w_j(1 − log w_j).

This relation together with relation (16) of Lemma 9 implies

p_w(x) − d_w(y, s) ≥ p_w(x̄) − d_w(ȳ, s̄),    ∀x ∈ P^0, ∀(y, s) ∈ D_L^0.    (19)

Fixing (y, s) = (ȳ, s̄) in (19), we obtain the conclusion that x̄ ∈ opt(P^w). Similarly, fixing x = x̄ in (19), we obtain that (ȳ, s̄) ∈ opt(D_L^w). Since (ȳ, s̄) ∈ D_W^0, this also implies that (ȳ, s̄) ∈ opt(D_W^w).
We next show the “if” part of the equivalence. Assume that x̄ ∈ opt(P^w) and (ȳ, s̄) ∈ opt(D_L^w). Then it is easy to see that inequality (19) holds. By assumption, S_w ≠ ∅ and so let (x^0, y^0, s^0) be a fixed point in S_w. As in the proof of the “only if” part, we have

p_w(x^0) − d_w(y^0, s^0) = Σ_{j=1}^p w_j(1 − log w_j).    (20)

Combining inequality (19) with x = x^0 and (y, s) = (y^0, s^0) and relation (20), we obtain

Σ_{j=1}^p w_j(1 − log w_j) ≥ p_w(x̄) − d_w(ȳ, s̄).    (21)

By inequality (16), we conclude that (21) must hold as equality. Hence, by the “if and only if” statement of Lemma 9, it follows that (x̄, ȳ, s̄) ∈ S_w.
As an immediate consequence of the above two lemmas, we obtain the following result.
Proposition 3 Let w ∈ R^p_{++} be given and assume that S_w ≠ ∅. Then the following statements hold:

(a) opt(D_L^w) = opt(D_W^w) and the set {s | (y, s) ∈ opt(D_L^w)} is a singleton; if in addition Assumption (A) holds and the functions f(·) and g(·) are differentiable then opt(D_L^w) is a singleton;
(b) S_w = {(x, y, s) | x ∈ opt(P^w), (y, s) ∈ opt(D_L^w)};
(c) for any fixed (ȳ, s̄) ∈ opt(D_L^w),
    opt(P^w) = {x | x ∈ P, |g(x)| ◦ s̄ = w} ∩ Argmin{L(x, ȳ, s̄) | x ∈ R^n};
(d) val(P^w) − val(D_L^w) = Σ_{j=1}^p w_j(1 − log w_j) and val(D_L^w) = val(D_W^w).
Proof. The assertion that opt(D_L^w) = opt(D_W^w) follows from Lemma 10. Using the fact that the objective function d_w(y, s) of problem (D_L^w) is strictly concave with respect to s, we can easily see that the set {s | (y, s) ∈ opt(D_L^w)} is a singleton. Assume now that Assumption (A) holds and the functions f(·) and g(·) are differentiable. Fix a point x̄ ∈ opt(P^w). (This point exists since S_w ≠ ∅.) By Lemma 10, we know that (y, s) ∈ opt(D_L^w) if and only if ∇f(x̄) − A^T y + ∇g(x̄)s = 0 and |g(x̄)| ◦ s = w. Using the fact that rank(A) = m, we can easily see that there exists a unique point (y, s) satisfying these last two equations. We have thus shown that opt(D_L^w) is a singleton. Statements (b), (c) and (d) follow immediately from (a), Lemma 9 and Lemma 10.
It is worth mentioning that Lemmas 9, 10 and Proposition 3 hold even if we do not assume
that the functions f (·) and gj (·), j = 1, . . . , p, are convex.
We are now in a position to give the proof of Theorem 1. The proof has already been
given in the several results stated above and all we have to do is to put the pieces together.
Proof of Theorem 1: We will show that the implications (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (e) ⇒ (a), the implication (d) ⇒ [(1), (2), (3) and (4)] and the equivalence (b) ⇔ (f) hold, from which the result follows.
[(a) ⇒ (b)] This is obvious since D_W^0 ⊆ D_L^0.
[(b) ⇒ (c)] This follows from Proposition 1.
[(c) ⇒ (d)] Let x̄ ∈ opt(P^w). The KKT conditions for (P^w) imply the existence of y ∈ R^m such that 0 ∈ ∂f(x̄) + Σ_{j=1}^p (w_j/|g_j(x̄)|) ∂g_j(x̄) − A^T y. Setting s_j = w_j/|g_j(x̄)| for j = 1, . . . , p, we have (x̄, y, s) ∈ S_w.
[(d) ⇒ (e)] This is obvious since S_w ⊆ PD^0.
[(e) ⇒ (a)] This is obvious since PD^0 ⊆ P^0 × D_W^0.
[(d) ⇒ (1), (2), (3) and (4)] This follows from Proposition 3.
[(b) ⇔ (f)] This follows from Corollary 1.
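Statement (4) of Theorem 1 is easy to check numerically. For a small linear programming instance (the data below are illustrative assumptions, not taken from the paper) the dual function is available in closed form, so val(P^w) and val(D_L^w) can both be computed and compared with the constant Σ_{j=1}^p w_j(1 − log w_j):

```python
# Numerical check of Theorem 1(4) on a toy LP (illustrative data only):
#   min c^T x  s.t. Ax = b, -x <= 0,  with c = (1, 2), A = [1 1], b = 1.
# Here L(y, s) = b^T y when s = c - A^T y and -infinity otherwise, so (D_L^w)
# reduces to maximizing b*y + sum_j w_j log(c_j - y) over y < min_j c_j.
import numpy as np
from scipy.optimize import minimize, minimize_scalar

c, A, b = np.array([1.0, 2.0]), np.array([[1.0, 1.0]]), np.array([1.0])
w = np.array([0.5, 1.5])

primal = minimize(lambda x: c @ x - np.sum(w * np.log(x)),   # p_w(x), |g_j(x)| = x_j
                  np.array([0.5, 0.5]),
                  constraints=[{"type": "eq", "fun": lambda x: A @ x - b}],
                  bounds=[(1e-9, None)] * 2, method="SLSQP")
val_P_w = primal.fun

dual = minimize_scalar(lambda y: -(b[0] * y + np.sum(w * np.log(c - y))),  # -d_w
                       bounds=(-10.0, 1.0 - 1e-9), method="bounded")
val_D_w = -dual.fun

print("val(P^w) - val(D_L^w)  :", val_P_w - val_D_w)
print("sum_j w_j (1 - log w_j):", np.sum(w * (1.0 - np.log(w))))
```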
Theorem 1 is not completely symmetrical in the sense that the condition opt(D_W^w) ≠ ∅ is not equivalent to conditions (a), (b), (c), (d), (e) and (f) (see Example 3). However, if (i) the objective function and the constraint functions are analytic, or (ii) the constraint functions g_j(·), j = 1, . . . , p, are affine functions, then the next result shows that opt(D_W^w) ≠ ∅ is equivalent to conditions (a), (b), (c), (d), (e) and (f) of Theorem 1.

Theorem 2 Suppose that Assumption (A) and Assumption (B) hold and assume that either one of the following conditions holds:

(a) the constraint functions g_j(·), j = 1, . . . , p, are affine, or;
(b) the functions f(·) and g_j(·), j = 1, . . . , p, are analytic.

Then, the condition opt(D_W^w) ≠ ∅ is equivalent to any one of the conditions (a), (b), (c), (d), (e) and (f) of Theorem 1.
The proof of Theorem 2 will be given at the end of this section after we state and prove
some preliminary results. The main property that we use about an analytic convex function
is that it satisfies the following flatness condition.
Definition 1. (Flatness condition) A function h : R^n → R is said to be flat if given any points x, y ∈ R^n such that x ≠ y the following implication holds: if h is constant on the segment [x, y] ≡ {λx + (1 − λ)y | λ ∈ [0, 1]} then h is constant on the entire line containing [x, y].
Lemma 11 Assume that h : R^n → R is a flat convex function such that inf{h(x) | x ∈ R^n} is finite but is not achieved. Then, there exists a direction d ∈ R^n such that the function λ ∈ R ↦ h(x + λd) ∈ R is (strictly) decreasing for every x ∈ R^n.

Proof. Since inf{h(x) | x ∈ R^n} is finite, we have h0^+(d) ≥ 0 for every d ∈ R^n. Moreover, since this infimum is not achieved, it follows from Theorem 27.1(d) of Rockafellar [14] that the set R = {d ∈ R^n | h0^+(d) = 0} is nonempty (R is the set of all directions of recession of h). We will show that some d ∈ R satisfies the conclusion of the lemma. Indeed, assume for contradiction that for every d ∈ R, there exists x_d ∈ R^n such that λ ↦ h(x_d + λd) is not a strictly decreasing function. For the remainder of the proof, let d be an arbitrary direction in R. By Theorem 8.6 of Rockafellar [14], it follows that λ ↦ h(x_d + λd) is a non-increasing function. Hence, there exists a closed interval [λ^−, λ^+] of positive length such that λ ↦ h(x_d + λd) is constant on [λ^−, λ^+]. Since h is a flat function, it follows that h(x_d + λd) is constant for every λ ∈ R. By Corollary 8.6.1 of Rockafellar [14], it follows that λ ↦ h(x + λd) is a constant function for every x ∈ R^n. Since d ∈ R is arbitrary, we have thus shown that every direction of recession of h(·) is a direction in which h(·) is constant. By Theorem 27.1(b) of Rockafellar [14], it follows that inf{h(x) | x ∈ R^n} is achieved, contradicting the assumptions of the lemma. Hence, the conclusion of the lemma follows.
As a consequence of the previous lemma, we obtain the following result.
Lemma 12 Assume that h, k : R^n → R are analytic convex functions satisfying the following properties:

(a) inf{h(x) | x ∈ R^n} is finite and achieved;
(b) inf{k(x) | x ∈ R^n} is finite, and;
(c) inf{h(x) + k(x) | x ∈ R^n} is not achieved.

Then the function h − θk is not convex for any θ > 0.
Proof. Since inf{h(x) | x ∈ R^n} and inf{k(x) | x ∈ R^n} are finite, we have h0^+(d) ≥ 0 and k0^+(d) ≥ 0 for every d ∈ R^n. By Theorem 9.3 of Rockafellar [14], we have (h + k)0^+(d) = h0^+(d) + k0^+(d) for every d ∈ R^n. Hence, we have

Γ ≡ {d ∈ R^n | (h + k)0^+(d) = 0} = {d ∈ R^n | h0^+(d) = 0, k0^+(d) = 0}.

By Lemma 11, there exists d̄ ∈ Γ such that λ ↦ (h + k)(x + λd̄) is a decreasing function for every x ∈ R^n. By (a), there exists x̄ ∈ R^n such that h(x̄) ≤ h(x) for all x ∈ R^n. Since h0^+(d̄) = 0, it follows from Theorem 8.6 of Rockafellar [14] that λ ↦ h(x̄ + λd̄) is a non-increasing function. Hence, it follows that h(x̄ + λd̄) = h(x̄) for every λ ≥ 0. Therefore, using the fact that λ ↦ (h + k)(x̄ + λd̄) is a decreasing function, we conclude that λ ∈ [0, ∞) ↦ k(x̄ + λd̄) is a decreasing function, and hence, together with (b), a strictly convex function. This implies that λ ∈ [0, ∞) ↦ (h − θk)(x̄ + λd̄) is a strictly concave function. We have thus shown that the function h − θk is not convex for any θ > 0.
Theorem 2 is an immediate consequence of the following proposition.
Proposition 4 Assume that D_W^0 ≠ ∅ and that either one of the following conditions holds:

(a) the constraint functions g_j(·), j = 1, . . . , p, are affine, or;
(b) the functions f(·) and g_j(·), j = 1, . . . , p, are analytic.

Then, P^0 = ∅ implies that problem (D_W^w) is unbounded.
Proof. Let (ȳ, s̄) be a fixed point in D_W^0. By Lemma 8 and the fact that P^0 = ∅, there exists (ỹ, s̃) ∈ R^m × R^p such that 0 ≠ s̃ ≥ 0 and relation (13) is satisfied. As in the proof of Proposition 2, it follows that (ȳ(λ), s̄(λ)) ≡ (ȳ, s̄) + λ(ỹ, s̃) ∈ D_L^0 for all λ ≥ 0 and that d_w(ȳ(λ), s̄(λ)) ≡ L(ȳ(λ), s̄(λ)) + Σ_{j=1}^p w_j log s̄_j(λ) → ∞ as λ → ∞. We will next show that (ȳ(λ), s̄(λ)) ∈ D_W^0 for all λ ≥ 0, which together with the above observations imply that (D_W^w) is unbounded.
We first assume that condition (a) holds. In this case, it follows that the left hand side of (13) is an affine function which is nonnegative. Then x ∈ R^n ↦ ỹ^T(b − Ax) + s̃^T g(x) is a constant function and hence, for every λ > 0, the function x ↦ L(x, ȳ(λ), s̄(λ)) differs from the function x ↦ L(x, ȳ, s̄) by a constant. Since (ȳ, s̄) ∈ D_W^0, this implies that, for every λ > 0, inf{L(x, ȳ(λ), s̄(λ)) | x ∈ R^n} is achieved, or equivalently, that (ȳ(λ), s̄(λ)) ∈ D_W^0.
Assume now that condition (b) holds. Assume for contradiction that (ȳ(λ̄), s̄(λ̄)) ∉ D_W^0 for some λ̄ > 0. Let h, k : R^n → R denote the functions defined by h(x) = L(x, ȳ, s̄) and k(x) = λ̄[ỹ^T(b − Ax) + s̃^T g(x)] for every x ∈ R^n. It is easy to see that h and k satisfy all the assumptions of Lemma 12. Hence, from the conclusion of this lemma it follows that the function h − θk is not convex for any θ > 0. But since h − θk = L(·, ȳ − θλ̄ỹ, s̄ − θλ̄s̃) and s̄ − θλ̄s̃ ≥ 0 for any θ > 0 sufficiently small, we obtain that h − θk is convex for any θ > 0 sufficiently small, contradicting the above conclusion. Hence, it follows that (ȳ(λ), s̄(λ)) ∈ D_W^0 for all λ ≥ 0.
4. Other Existence Conditions for the Central Path
In this section, we derive other conditions which are equivalent to the conditions of Theorem
1 and/or Theorem 2. The conditions discussed in this section impose boundedness on the
optimal solution set of (P) and/or its dual (Lagrangean or Wolfe) problem. The main result
of this section is Theorem 3.
The first result essentially says that boundedness of the set opt(P) is equivalent to the
existence of an interior feasible solution for the (Lagrangean or Wolfe) dual problem.
Proposition 5 Suppose that Assumption (B) holds. Then, the following statements are equivalent:

(a) opt(P) is nonempty and bounded;
(b) P ≠ ∅ and D_W^0 ≠ ∅;
(c) P ≠ ∅ and D_L^0 ≠ ∅.
Proof. We first show the implication (a) ⇒ (b). Assume that opt(P) is nonempty and bounded. Fix a point x^0 ∈ P ≠ ∅ and let ε > 0 be such that f(x^0) < ε^{−1}. Consider the following convex program

    inf f(x) − Σ_{j=1}^p w_j log(ε − g_j(x))
    s.t. f(x) ≤ ε^{−1},
         Ax = b,                           (22)
         g(x) ≤ (ε/2)1.

Observe that x^0 is a feasible point of (22) and that all the inequality constraints of (22) are strictly satisfied by x^0. Moreover, using the fact that opt(P) is nonempty and bounded and Lemma 1, we can easily see that the feasible region of (22) is compact. Hence, the problem has a minimizer x̄ which satisfies the KKT conditions:

0 ∈ ∂f(x̄) + Σ_{j=1}^p (w_j/(ε − g_j(x̄))) ∂g_j(x̄) + λ̄∂f(x̄) − A^T ȳ + ∂g(x̄)s̄,    (23)

λ̄ ≥ 0,  s̄ ≥ 0,  λ̄[ε^{−1} − f(x̄)] = 0,  s̄^T[(ε/2)1 − g(x̄)] = 0.    (24)

Rearranging (23), we obtain the condition that 0 ∈ ∂L(x̄, ŷ, ŝ), where

ŷ ≡ ȳ/(1 + λ̄),    ŝ_j ≡ (1 + λ̄)^{−1}(w_j/(ε − g_j(x̄)) + s̄_j) > 0,    ∀j = 1, . . . , p.

Hence, (ŷ, ŝ) ∈ D_W^0 and therefore D_W^0 ≠ ∅.
The implication (b) ⇒ (c) is straightforward since D_W^0 ⊆ D_L^0.
We now prove the implication (c) ⇒ (a). Assume that P ≠ ∅ and D_L^0 ≠ ∅. Take a point x^0 ∈ P. In view of Lemma 3, the conclusion that opt(P) is nonempty and bounded will follow if we show that the set Ω ≡ {x ∈ P | f(x) ≤ f(x^0)} is compact. This set is clearly closed since P is closed and f(x) is continuous. It remains to show that Ω is bounded. Indeed, since D_L^0 ≠ ∅, it follows from Lemma 5 that there exist constants τ_0 ∈ R and τ_1 > 0 such that relation (8) holds. This implies that

Ω = {x ∈ P | f(x) ≤ f(x^0), g_j(x) ≥ −(f(x^0) − τ_0)/τ_1, ∀j = 1, . . . , p}.

By Assumption (B), the set in the right hand side of the above expression is bounded. Hence, Ω is bounded and the result follows.
The proof of the implication (a) ⇒ (b) is based on ideas used in Lemma 13 of Monteiro
and Pang [12]. Note that the implications (a) ⇒ (b) and (a) ⇒ (c) hold regardless of the
validity of Assumption (B). On the other hand, Assumption (B) is needed to guarantee the
reverse implications. Indeed, consider Example 3 with δ > 0. Clearly, it does not satisfy
Assumption (B), P^0 ≠ ∅, D_L^0 ≠ ∅ and D_W^0 ≠ ∅. But it is easy to see that opt(P) = ∅ if δ ≤ 1 and that opt(P) is nonempty and unbounded if δ > 1.
We next turn our efforts to show that, under certain mild assumptions, the existence of
an interior feasible solution for problem (P) is essentially equivalent to boundedness of
the set of optimal solutions of the (Lagrangean or Wolfe) dual problem. With this goal in
mind, it is useful to recall the notion of a Kuhn-Tucker vector as defined in Rockafellar [14], pages 274-275. A point (y, s) ∈ R^m × R^p_+ is called a Kuhn-Tucker vector for (P) if L(y, s) = val(P) ∈ R. We denote the set of all Kuhn-Tucker vectors for (P) by KT. By Theorem 28.1 and Theorem 28.3 of Rockafellar [14], we have that a necessary and sufficient condition for x ∈ opt(P) and (y, s) ∈ KT is that the following relations hold:

x ∈ P, s ≥ 0, s^T g(x) = 0 and L(x, y, s) = L(y, s).    (25)

(The last relation in (25) is also equivalent to 0 ∈ ∂f(x) − A^T y + s_1 ∂g_1(x) + . . . + s_p ∂g_p(x).) Hence, when opt(P) ≠ ∅, we have KT ⊂ D_W.
The next two results give the relationship between the set KT and the sets opt(D_L) and opt(D_W).

Lemma 13 KT ≠ ∅ if and only if opt(D_L) ≠ ∅ and val(P) = val(D_L), in which case KT = opt(D_L).
Proof. The proof of this lemma follows straightforwardly from the definition of KT and
from the weak duality result, namely: L(y, s) ≤ f(x) for every x ∈ P and (y, s) ∈ R^m × R^p_+.
Lemma 14 Assume that opt(P) ≠ ∅. Then, KT ≠ ∅ if and only if opt(D_W) ≠ ∅ and val(P) = val(D_W), in which case KT = opt(D_W).

Proof. Assume that opt(P) ≠ ∅. First we observe that KT ⊆ opt(D_W). This inclusion follows from the definition of KT, the weak duality result and the fact that KT ⊆ D_W which holds under the assumption that opt(P) ≠ ∅ (see the observation preceding Lemma 13). Under the assumption that opt(D_W) ≠ ∅ and val(P) = val(D_W), the reverse inclusion KT ⊇ opt(D_W) is immediate.
Lemma 15 Suppose that Assumption (A) holds. Then, P^0 ≠ ∅ and val(P) > −∞ imply that KT is nonempty and bounded.

Proof. Assume that P^0 ≠ ∅ and val(P) > −∞. Then Theorem 28.2 of Rockafellar [14] implies that KT ≠ ∅. The boundedness of KT follows from Lemma 7 and the fact that KT = {(y, s) ∈ R^m × R^p_+ | L(y, s) ≥ val(P)}.
Lemma 16 If opt(D_L) is nonempty and bounded then P^0 ≠ ∅.

Proof. Assume that opt(D_L) is nonempty and bounded and let (ȳ, s̄) be a fixed point in opt(D_L). Assume for contradiction that P^0 = ∅. Lemma 8 then implies the existence of a point (ỹ, s̃) ∈ R^m × R^p satisfying relation (13) and 0 ≠ s̃ ≥ 0. We next show that (ȳ(λ), s̄(λ)) ≡ (ȳ, s̄) + λ(ỹ, s̃) ∈ opt(D_L) for all λ ≥ 0, a fact that contradicts the boundedness of opt(D_L). Indeed, the fact that s̄ ≥ 0 and s̃ ≥ 0 implies that s̄(λ) ≥ 0 for all λ ≥ 0. Moreover, using (13) and (14) we obtain

L(ȳ(λ), s̄(λ)) ≥ L(ȳ, s̄),    ∀λ ≥ 0.    (26)

Since (ȳ, s̄) ∈ opt(D_L), relation (26) clearly implies that (ȳ(λ), s̄(λ)) ∈ opt(D_L) for all λ ≥ 0.
As a consequence of the lemmas stated above we have the following result.
Proposition 6 Assume that Assumption (A) holds. If val(P) > −∞ then the following statements are equivalent:

(a) P^0 ≠ ∅;
(b) KT is nonempty and bounded;
(c) opt(D_L) is nonempty and bounded,

in which case KT = opt(D_L) and val(P) = val(D_L). If instead the stronger condition that opt(P) ≠ ∅ is assumed then (a), (b) and (c) above are also equivalent to the following statement:

(d) opt(D_W) is nonempty and bounded and val(P) = val(D_W),

in which case KT = opt(D_W).

Proof. Assume that val(P) > −∞. The implication (a) ⇒ (b) follows from Lemma 15. By Lemma 13, we conclude that (b) implies (c) and the fact that KT = opt(D_L) and val(P) = val(D_L). Lemma 16 yields the implication (c) ⇒ (a). This shows the first part of the proposition. Assume now that opt(P) ≠ ∅. In this case, Lemma 14 yields the equivalence (b) ⇔ (d) and that, in this case, KT = opt(D_W).
A natural question to ask is whether the condition that val(P) = val(D_W) can be omitted from statement (d) of Proposition 6. The following example shows that this condition cannot be omitted.

Example: Consider the functions f, g : R^2 → R defined for every x = (x_1, x_2) ∈ R^2 by

f(x) = { −1 if x_1 ≤ −1;  x_1 if x_1 ≥ −1 }

and g(x) = ‖x‖ − x_2, where ‖ · ‖ denotes the two-norm. It is easy to verify that the dual function L(s) = inf_x f(x) + s g(x) is given by L(s) = −1 for every s ∈ R_+ and that the infimum is achieved only for s = 0. Hence, we have D_L = [0, ∞) and D_W = {0}. Moreover, we can easily verify that P^0 = ∅, opt(P) = {(0, x_2) | x_2 ≥ 0} and opt(D_W) = {0}. Note that val(P) = 0 and val(D_W) = −1. Note also that this example satisfies Assumption (A) since m = 0.
Before stating the next result, we note that the equivalence of statements (a) and (b) of Proposition 6 is well known under the assumption opt(P) ≠ ∅ (see for example Hiriart–Urruty and Lemaréchal [6], theorem 2.3.2, chapter VII).
Under the stronger assumption that opt(P) is nonempty and bounded, the next result shows that the condition val(P) = val(D_W) is not needed in statement (d) of Proposition 6.
Proposition 7 Assume that Assumption (A) holds and that opt(P) is nonempty and bounded. Then any of the statements (a), (b) and (c) of Proposition 6 is equivalent to the condition that opt(D_W) is nonempty and bounded. In this case, we have KT = opt(D_L) = opt(D_W) and val(P) = val(D_L) = val(D_W).
The proof of Proposition 7 will be given below after we state a preliminary result. Consider the following perturbed problem

(P(d))    v(d) ≡ inf{f(x) | Ax = b, g(x) ≤ d},

where d ∈ R^p is a given perturbation vector. It is well known that the function v(·) is convex; v(·) is usually referred to as the perturbation function associated with problem (P). The following result due to Geoffrion (see [3], Theorem 8) is needed in the proof of Proposition 7.

Lemma 17 (Geoffrion) Assume that opt(P) is nonempty and bounded. Then v(·) is a lower semi-continuous function at d = 0.

Note that the dual function L_d : R^m × R^p → [−∞, ∞) associated with problem (P(d)) is given by

L_d(y, s) = L(y, s) − d^T s,    ∀(y, s) ∈ R^m × R^p.    (27)

Hence, it follows that, for every d ∈ R^p, D_L and D_W are the sets of feasible solutions of
the Lagrangean dual and the Wolfe dual associated with problem (P(d)), respectively. We
are now in a position to give the proof of Proposition 7. The arguments used in the proof
are based on the proof of Theorem 7 of Geoffrion [3].
Proof of Proposition 7: Assume that Assumption (A) holds and opt(P) is nonempty and bounded. In view of Proposition 6, it remains to show that val(P) = val(D_W) holds when opt(D_W) is nonempty and bounded. Indeed, let {d^k} be a sequence of strictly positive vectors converging to 0. Clearly, the set of interior feasible solutions of (P(d^k)) is nonempty. Moreover, using the fact that opt(P) is nonempty and bounded, we can show that opt(P(d^k)) is also nonempty and bounded by Lemma 1. Hence, in view of the equivalence of statements (a) and (d) of Proposition 6, there exists a sequence {(y^k, s^k)} ⊆ D_W such that

v(d^k) = L_{d^k}(y^k, s^k),    ∀k.

Using this relation, relation (27), the weak duality result and the fact that d^k, s^k ≥ 0, we obtain

v(0) ≥ L(y^k, s^k) = L_{d^k}(y^k, s^k) + (d^k)^T s^k ≥ v(d^k),    ∀k.    (28)

Since v(·) is lower semi-continuous at d = 0 and v(0) ≥ v(d^k) for all k, it follows that lim_{k→∞} v(d^k) = v(0). Hence, relation (28) implies that lim_{k→∞} L(y^k, s^k) = v(0). Clearly, this shows that val(P) = val(D_W).
We end this section by giving other conditions which are equivalent to conditions (a), (b),
(c), (d), (e) and (f) of Theorem 1 and the condition of Theorem 2. The main result given
below is a consequence of the results already stated in this section. Consider the conditions:
(1) P^0 ≠ ∅;
(2) opt(D_L) is nonempty and bounded;
(3) opt(D_W) is nonempty and bounded,

and the conditions:

(a) D_L^0 ≠ ∅;
(b) D_W^0 ≠ ∅;
(c) opt(P) is nonempty and bounded.
By combining one condition from conditions (1), (2) and (3) with one condition from
conditions (a), (b) and (c), we obtain a total of nine conditions which we refer to as (1a),
(1b), (1c), (2a), (2b), (2c), (3a), (3b) and (3c). The following result gives the relationship
between these nine conditions.
Theorem 3 Suppose that both Assumption (A) and Assumption (B) hold. Then, conditions (1a), (1b), (1c), (2a), (2b), (2c) and (3c) are all equivalent. Moreover, any of
these conditions implies (3b), which in turn implies (3a). In addition, if all the constraint functions g_j(·), j = 1, . . . , p, are affine then the nine conditions are equivalent.
Proof. The equivalence of the conditions (1a), (1b) and (1c) follows from Proposition
5. Note that any of the conditions (a), (b) or (c) implies that val(P) > −∞. Hence, it
follows from Proposition 6 that (1) and (2) are equivalent under any of the conditions (a),
(b) or (c). Moreover, it follows from Proposition 7 that (3) is also equivalent to both (1)
and (2) when condition (c) holds. We have thus shown the equivalences (1a) ⇔ (2a),
(1b) ⇔ (2b) and (1c) ⇔ (2c) ⇔ (3c). By Proposition 5 and the fact that D_W ⊂ D_L,
we know that (c) ⇒ (b) ⇒ (a). These two implications obviously yield the implications
(3c) ⇒ (3b) ⇒ (3a). We have thus shown the first part of the proposition. The second
part of the result follows trivially from the lemma stated below.
Lemma 18 Assume that each constraint function g_j : R^n → R (j = 1, . . . , p) is an affine function. If opt(D_W) is nonempty and bounded then P^0 ≠ ∅.
Proof. The proof of this result uses arguments similar to the ones used in the proofs of
Lemma 16 and Proposition 4(a). We leave the details to the reader.
In general the implications (3a) ⇒ (3b) and (3b) ⇒ (3c) do not hold. Indeed, the
problem stated in Example 3 satisfies (3b) but not (3c) since it has no feasible solution.
This shows that (3b) ⇒ (3c) does not hold. The following simple example shows that
(3a) ⇒ (3b) does not hold either.
Example: Consider the functions f, g : R → R defined by f(x) = 0 and g(x) = e^x for every x ∈ R. It is easy to verify that L(s) ≡ inf_x f(x) + s g(x) = 0 for every s ≥ 0 and that the infimum is achieved only for s = 0. Hence, we have D_L = [0, ∞) and D_W = {0}. Thus (3a) is satisfied but not (3b). Note that P = ∅ and m = 0 and so both Assumptions (A) and (B) are satisfied.
5. Limiting Behavior of the Weighted Central Path
In this section we analyze the limiting behavior of the path of solutions of the parametrized family of logarithmic barrier problems (2). This path is referred to in this paper as the w-central path (or the weighted central path when the reference to w is not relevant). When w = 1, the w-central path is usually referred to as the central path. As opposed to McLinden
[8], we do not assume the existence of a pair of primal and dual optimal solutions satisfying
strict complementarity. However, we assume throughout this section that the functions f (·)
and gj (·), j = 1, . . . , p, are analytic. Two main results are proved. The first one (Theorem
4) states that the weighted central path converges to a well characterized optimal solution
of (P). The second result (Theorem 5) shows that a certain dual weighted central path
also converges to a well characterized dual optimal solution under the assumption that all
constraint functions are affine.
We begin by explicitly stating the assumptions used throughout this section. In addition
to Assumptions (A) and (B) of Section 3, we impose throughout this section the following
two assumptions:
Assumption (C): PD^0 ≠ ∅.
Assumption (D): The functions f(x) and g_j(x), j = 1, . . . , p, are analytic.
In view of the equivalence of statements (c) and (e) of Theorem 1, we know that problem (P^w) has at least one optimal solution. This problem can however have more than one optimal solution. The following result shows that under the assumptions above, this possibility cannot occur.

Lemma 19 For any fixed w ∈ R^p_{++}, problem (P^w) has exactly one optimal solution.
Proof. Existence of at least one optimal solution has already been established. To prove
that there is at most one optimal solution, assume for contradiction that x̂ and x̄ are two
distinct optimal solutions of problem (P^w). Hence, all points in the segment [x̄, x̂] ≡ {λx̄ + (1 − λ)x̂ | λ ∈ [0, 1]} are also optimal solutions of (P^w). It then follows from Lemma 6 that the functions f(·) and g_j(·), j = 1, . . . , p, are constant over [x̄, x̂]. Since
these functions are analytic in view of Assumption (D), we conclude that they are constant
over the whole straight line L containing [x̄, x̂]. Since any point x in the line L satisfies
Ax = b, the set {x | f (x) = f (x̄), Ax = b, g(x) = g(x̄)} contains L, and hence it is
unbounded. However, one can easily see that this set must be bounded due to Assumption
(B). We have thus obtained a contradiction and the result follows.
It follows from Lemma 19 that problem (P(t)) has a unique optimal solution which we
denote by x(t). In what follows we are interested in analyzing the limiting behavior of the
w-central path t ↦ x(t), as t > 0 tends to 0. We show in Theorem 4 below that this path
converges to a specific optimal solution of (P), namely the w-center of opt(P), which we
define next.
If opt(P) consists of a single point x^* then the w-center of opt(P) is defined to be x^*. Consider now the case in which opt(P) consists of more than one point and define

B ≡ {j | g_j(x) < 0 for some x ∈ opt(P)}.

It can be shown using arguments similar to the ones used in the proof of Lemma 19 that B ≠ ∅ when opt(P) has more than one point. The w-center of opt(P) is then defined to be the unique optimal solution of the following convex program:

(C)    max Σ_{j∈B} w_j log|g_j(x)|
       s.t. x ∈ opt(P), g_B(x) < 0.    (29)
It remains to verify that the above definition is meaningful, that is, that problem (C) has
a unique optimal solution. We start by showing that the set of feasible solutions of (C) is
nonempty.
Lemma 20 The set O_B ≡ {x | x ∈ opt(P), g_B(x) < 0} is nonempty.

Proof. It follows from the definition of B that, for every j ∈ B, there exists x^j ∈ opt(P) such that g_j(x^j) < 0. Define x̄ = (1/|B|) Σ_{j∈B} x^j. Clearly, x̄ ∈ opt(P) since opt(P) is a convex set. Moreover, the convexity of g_j(·) implies that

g_j(x̄) ≤ (1/|B|) Σ_{i∈B} g_j(x^i) < 0,    ∀j ∈ B.

Hence the set {x | x ∈ opt(P), g_B(x) < 0} is nonempty.
Lemma 21 Problem (C) has a unique optimal solution.
Proof. Fix a point x̄ ∈ OB . In view of Lemma 3, the existence of an optimal solution of (C)
follows once
P we show that the set ΓB ≡ {x ∈ OB | φB (x) ≥ φB (x̄)} is compact, where
φB (x) ≡ j∈B wj log |gj (x)| for every x ∈ OB . Indeed, first observe that Assumption
(C) and Proposition 5 imply that opt(P) is a compact set. This implies that the sets
gj (opt(P)), j = 1, . . . , p, are bounded. Using this observation, we can easily show the
existence of a constant δ > 0 such that gB (x) ≤ −δ1 for all x ∈ ΓB . Hence,
ΓB = {x ∈ opt(P) | gB (x) ≤ −δ1, φB (x) ≥ φB (x̄)},
from which it follows that the set ΓB is both bounded and closed, and hence compact.
We now show that problem (C) has at most one optimal solution. Assume by contradiction
that x1 and x2 are two distinct optimal solutions of problem (C). Then every point in the
segment [x1 , x2 ] is also an optimal solution. Now, it is easy to see that gB (x) is constant
over the set of optimal solutions of problem (C). Moreover, we also know that f (x) and
gj(x) with j ∉ B are constant over opt(P). Therefore, we conclude that f(x) and g(x) are
constant over the segment [x1 , x2 ], and hence, in view of Assumption (D), over the whole
straight line containing [x1 , x2 ]. But one can easily verify that this conclusion contradicts
Assumption (B).
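To illustrate the above definition on a concrete (and purely illustrative) instance, take n = 2, no equality constraints, f(x) = (x1 + x2 − 1)^2, g(x) = (−x1, x1 − 1, −x2, x2 − 1) and unit weights w = (1, 1, 1, 1). Then opt(P) = {x | x1 + x2 = 1, 0 ≤ x1 ≤ 1}, B = {1, 2, 3, 4}, and problem (C) becomes
$$\max\; \log x_1 + \log(1 - x_1) + \log x_2 + \log(1 - x_2) \quad \text{s.t.}\;\; x_1 + x_2 = 1,\ 0 < x_1 < 1,$$
whose unique optimal solution, and hence the w-center of opt(P), is x∗ = (1/2, 1/2).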
We now state and prove one of the main results of this section.
Theorem 4 Suppose that Assumptions (A), (B), (C) and (D) hold and let w ∈ ℝ^p_{++} be given. Then, the w-central path t ↦ x(t) converges to the w-center of opt(P) as t tends to 0.
Proof. Let x∗ denote the w-center of opt(P), that is, the optimal solution of problem (C), and let x̄ denote an arbitrary accumulation point of x(t) as t tends to 0, that is, x̄ = limk→∞ x(tk), where {tk} is a sequence of positive scalars converging to 0. The theorem follows once we show that x∗ = x̄. Assume for contradiction that x∗ ≠ x̄ and let ∆x = x∗ − x̄.
Consider the sequence of points {xk } defined by xk ≡ x(tk ) + ∆x for every k. Clearly,
we have limk→∞ xk = x∗ . We next show that xk ∈ P 0 for every k sufficiently large.
Using the definition of xk and the fact that A∆x = 0, we obtain that Axk = b for every
k. Since g(·) is a continuous function, gB (x∗ ) < 0 and limk→∞ xk = x∗ , we conclude
that gB (xk ) < 0 for every k sufficiently large. Now, it is easy to see that x̄ ∈ opt(P).
Hence, due to the convexity of opt(P), we have [x̄, x∗ ] ⊆ opt(P). This implies that
gj(x) = 0 with j ∉ B for every x ∈ [x̄, x∗]. Since gj(·) with j ∉ B is analytic, it follows that gj(x) = 0 with j ∉ B over the whole straight line containing [x̄, x∗], that is, gj(x̄ + λ∆x) = 0 with j ∉ B for every λ ∈ ℝ. By Corollary 8.6.1 of Rockafellar [14], it follows that λ ↦ gj(x(tk) + λ∆x) with j ∉ B is a constant function for every k. In particular, it follows that gj(xk) = gj(x(tk)) < 0 with j ∉ B for every k. We have thus
shown that xk ∈ P 0 for every k sufficiently large. Since x(t) is by definition the optimal
solution of problem (2), we conclude that for every k sufficiently large,
$$f(x^k) - t_k \sum_{j=1}^{p} w_j \log |g_j(x^k)| \;\ge\; f(x(t_k)) - t_k \sum_{j=1}^{p} w_j \log |g_j(x(t_k))|. \qquad (30)$$
The same arguments used to prove that gj(xk) = gj(x(tk)) with j ∉ B can also be used to show that f(xk) = f(x(tk)). Using these two equalities in relation (30), we obtain
$$\sum_{j \in B} w_j \log |g_j(x^k)| \;\le\; \sum_{j \in B} w_j \log |g_j(x(t_k))|,$$
for all k sufficiently large. Letting k go to ∞ in the last relation, we obtain
$$\sum_{j \in B} w_j \log |g_j(x^*)| \;\le\; \sum_{j \in B} w_j \log |g_j(\bar{x})| \qquad (31)$$
if gB (x̄) < 0, or that
$$\sum_{j \in B} w_j \log |g_j(x^*)| \;\le\; -\infty \qquad (32)$$
if gj (x̄) = 0 for some j ∈ B. Relation (31) is not possible since x∗ is the only optimal
solution of (29). Obviously, (32) is not possible either.
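As a purely numerical illustration of Theorem 4 (and not part of the analysis), the following minimal Python sketch traces an approximate w-central path for the instance introduced after Lemma 21; the instance, the identifiers and the use of the routine scipy.optimize.minimize are our own choices, and each x(t) is only computed approximately.

import numpy as np
from scipy.optimize import minimize

# Illustrative instance: f(x) = (x1 + x2 - 1)^2, g(x) = (-x1, x1 - 1, -x2, x2 - 1),
# unit weights; the w-center of opt(P) is (1/2, 1/2).
w = np.ones(4)

def barrier(x, t):
    # f(x) - t * sum_j w_j log|g_j(x)|, where |g_j(x)| = -g_j(x) on P^0
    slack = np.array([x[0], 1.0 - x[0], x[1], 1.0 - x[1]])
    return (x[0] + x[1] - 1.0) ** 2 - t * np.dot(w, np.log(slack))

x = np.array([0.2, 0.3])                       # a point of P^0, used as warm start
for t in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    res = minimize(barrier, x, args=(t,), method="L-BFGS-B",
                   bounds=[(1e-9, 1.0 - 1e-9)] * 2)
    x = res.x                                  # approximates x(t)
    print("t = %.0e,  x(t) ~ (%.4f, %.4f)" % (t, x[0], x[1]))
# The printed points approach the w-center (0.5, 0.5) as t decreases.

The bounds serve only to keep the iterates strictly inside P^0 so that the logarithms remain finite.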
Associated with problem (P), we can also define a dual w-central path as the path of solutions of the following parametrized family of dual logarithmic barrier problems
$$(\mathrm{D}(t)) \qquad \max\Big\{\, L(y, s) + t \sum_{j=1}^{p} w_j \log s_j \;\Big|\; (y, s) \in D_L^0 \,\Big\}, \qquad (33)$$
where again t > 0 represents the parameter of the family. By Theorem 1, we know that
Assumptions (A), (C) and (D) imply that, for each t > 0, problem (D(t)) has a unique
optimal solution which we denote by (y(t), s(t)). The path t > 0 ↦ (y(t), s(t)) is then called the dual w-central path associated with (P). In what follows we characterize the limit of the path t ↦ (y(t), s(t)) as t > 0 tends to 0 for the case in which the constraint functions gj(·), j = 1, . . . , p, are affine. The corresponding characterization for the more
general case in which the functions gj (·), j = 1, . . . , p, are allowed to be nonlinear remains
open.
Before stating the above characterization, we first define the w-center of opt(DL ) =
opt(DW ). Let
N ≡ {j | sj > 0 for some (y, s) ∈ opt(DL )}.
The w-center of opt(DL ) is defined to be the unique optimal solution of the following
convex program:
$$(\mathrm{DC}) \qquad \max\; \sum_{j \in N} w_j \log s_j \quad \text{s.t.}\;\; (y, s) \in \mathrm{opt}(D_L),\ s_N > 0.$$
It can be easily verified that the above problem has a unique optimal solution, and hence
that the above definition is meaningful.
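As an illustration (the instance below is ours, and we use the Lagrangean convention L(y, s) = inf_x {f(x) + y^T(b − Ax) + s^T g(x)}, which with no equality constraints reduces to L(s) = inf_x {f(x) + s^T g(x)}), consider the one-dimensional linear program with f(x) = x, g(x) = (−x, −x, x − 1) and w = (1, 1, 1); the constraint −x ≤ 0 is deliberately repeated so that opt(DL) is not a singleton. Here
$$L(s) = \inf_{x} \{\, x (1 - s_1 - s_2 + s_3) - s_3 \,\} = \begin{cases} -s_3 & \text{if } s_1 + s_2 = 1 + s_3, \\ -\infty & \text{otherwise,} \end{cases}$$
so that opt(DL) = {s ≥ 0 | s1 + s2 = 1, s3 = 0}, N = {1, 2}, and (DC) reduces to maximizing log s1 + log s2 over this set, giving the w-center s∗ = (1/2, 1/2, 0). Solving (D(t)) amounts to maximizing −s3 + t(log s1 + log s2 + log s3) subject to s1 + s2 = 1 + s3 and s > 0, and, in agreement with Theorem 5 below, its solution s(t) tends to s∗ as t tends to 0.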
Theorem 5 Assume that the functions gj(·), j = 1, . . . , p, are affine and let w ∈ ℝ^p_{++} be given. Then, the dual w-central path t ↦ (y(t), s(t)) converges to the w-center of opt(DL) as t tends to 0.
Proof. This proof closely follows the proof of Theorem 4. Assume that g(x) = Cx − h, ∀x ∈ ℝ^n, where C is a p × n matrix and h ∈ ℝ^p. Let (y∗, s∗) denote the w-center
of opt(DL ). Let (ȳ, s̄) be an arbitrary accumulation point of (y(t), s(t)) as t tends to
0, that is, (ȳ, s̄) = limk→∞ (y(tk ), s(tk )), where {tk } is a sequence of positive scalars
converging to 0. The result follows once we show that (ȳ, s̄) = (y ∗ , s∗ ). Assume for
contradiction that (ȳ, s̄) ≠ (y∗, s∗) and define (∆y, ∆s) ≡ (y∗ − ȳ, s∗ − s̄). It is easy
to verify that (ȳ, s̄) ∈ opt(DL ). Consider the sequence of points {(y k , sk )} defined by
(y k , sk ) ≡ (y(tk ), s(tk )) + (∆y, ∆s) for all k. We claim that there exists k0 such that
$$(y^k, s^k) \in D_L^0, \qquad \forall k \ge k_0, \qquad\qquad (34)$$
$$L(y^k, s^k) = L(y(t_k), s(t_k)), \qquad \forall k \ge 0, \qquad\qquad (35)$$
$$s^k_j = s_j(t_k), \qquad \forall j \notin N,\ \forall k \ge 0. \qquad\qquad (36)$$
We now prove the theorem assuming for the moment that the above claim is true. Indeed,
using (34) and the fact that (y(tk ), s(tk )) is the optimal solution of (D(tk )), we obtain
$$L(y(t_k), s(t_k)) + t_k \sum_{j=1}^{p} w_j \log s_j(t_k) \;\ge\; L(y^k, s^k) + t_k \sum_{j=1}^{p} w_j \log s^k_j, \qquad \forall k \ge k_0.$$
Combining (35) and (36) with the above relation yields
$$\sum_{j \in N} w_j \log s_j(t_k) \;\ge\; \sum_{j \in N} w_j \log s^k_j, \qquad \forall k \ge k_0.$$
Making k go to ∞ in the above relation and using the fact that (y ∗ , s∗ ) is the unique optimal
solution of (DC), we can easily obtain a contradiction.
It remains to show that the claim holds. By the definition of N, we have s∗j = s̄j = 0 for every j ∉ N. Hence, ∆sj = 0 for every j ∉ N, and this implies (36). Clearly, we
have limk→∞ (y k , sk ) = (y ∗ , s∗ ), and since s∗N > 0, we conclude that there exists k0 such
that skN > 0 for all k ≥ k0. This observation together with relation (36) implies that sk > 0
for every k ≥ k0 . It is now immediate that (34) holds once we show the validity of (35).
Observe that by the definition of the function L(·, ·), (35) follows immediately once we
show that
$$(\Delta y)^T (b - Ax) + (\Delta s)^T (Cx - h) = 0, \qquad \forall x \in \mathbb{R}^n. \qquad (37)$$
To show this last relation, fix a point x̄ ∈ opt(P). Using the fact that (ȳ, s̄) and (y ∗ , s∗ )
are in opt(DL ) = KT and the observation preceding (25), we conclude that
$$\nabla f(\bar{x}) - A^T y^* + C^T s^* = 0, \qquad (s^*)^T (C\bar{x} - h) = 0,$$
$$\nabla f(\bar{x}) - A^T \bar{y} + C^T \bar{s} = 0, \qquad (\bar{s})^T (C\bar{x} - h) = 0,$$
which in turn imply that
$$-A^T \Delta y + C^T \Delta s = 0, \qquad (\Delta s)^T (C\bar{x} - h) = 0.$$
These two relations and the fact that Ax̄ = b then imply
$$(\Delta y)^T (b - Ax) + (\Delta s)^T (Cx - h) = b^T \Delta y - h^T \Delta s = \bar{x}^T A^T \Delta y - \bar{x}^T C^T \Delta s = 0,$$
for every x ∈ ℝ^n.
It is worth noting that in the proof of Theorem 5 we used only the fact that f (·) is
differentiable, and hence it is not necessary to assume that f (·) is analytic.
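For completeness, the dual path of the linear program used to illustrate (DC) can also be traced numerically; the sketch below (ours, and again only an approximation based on scipy.optimize.minimize) maximizes the barrier in (D(t)), written with a minus sign as a minimization, over {s > 0 | s1 + s2 = 1 + s3}.

import numpy as np
from scipy.optimize import minimize

w = np.ones(3)

def neg_dual_barrier(s, t):
    # -(L(s) + t * sum_j w_j log s_j), with L(s) = -s3 on {s1 + s2 = 1 + s3}
    return s[2] - t * np.dot(w, np.log(s))

cons = [{"type": "eq", "fun": lambda s: s[0] + s[1] - s[2] - 1.0}]
s = np.array([0.8, 0.4, 0.2])                  # strictly positive and feasible
for t in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    res = minimize(neg_dual_barrier, s, args=(t,), method="SLSQP",
                   bounds=[(1e-12, None)] * 3, constraints=cons)
    s = res.x                                  # approximates s(t)
    print("t = %.0e,  s(t) ~ (%.4f, %.4f, %.6f)" % (t, s[0], s[1], s[2]))
# s(t) approaches the w-center (0.5, 0.5, 0) of opt(D_L) as t decreases.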
Acknowledgments
The first author wishes to thank Alex Shapiro for many useful discussions which have
been invaluable toward the development of this work. This work was based on research
supported by the National Science Foundation under grant DMI-9496178 and the Office of
Naval Research under grants N00014-93-1-0234 and N00014-94-1-0340.
References
1. I. Adler and R. D. C. Monteiro, “Limiting behavior of the affine scaling continuous trajectories for linear
programming problems,” Mathematical Programming, vol. 50, pp. 29–51, 1991.
2. A. V. Fiacco and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, John Wiley & Sons: New York, 1968. Reprint: Volume 4 of “SIAM Classics in Applied Mathematics,”
SIAM Publications, Philadelphia, PA 19104–2688, USA, 1990.
3. A. M. Geoffrion, “Duality in nonlinear programming: a simplified applications-oriented development,”
SIAM Review, vol. 13, pp. 1–37, 1971.
4. O. Güler, “Limiting behavior of the weighted central paths in linear programming,” Mathematical Programming, vol. 65, pp. 347–363, 1994.
5. O. Güler, C. Roos, T. Terlaky and J. P. Vial, “Interior point approach to the theory of linear programming,” Cahiers de Recherche 1992.3, Faculté des Sciences Économiques et Sociales, Université de Genève, Genève, Switzerland, 1992.
6. J.-B. Hiriart–Urruty and C. Lemaréchal, “Convex Analysis and Minimization Algorithms I,” volume 305 of
Comprehensive Study in Mathematics. Springer-Verlag: New York, 1993.
7. M. Kojima, S. Mizuno and T. Noma, “Limiting behavior of trajectories by a continuation method for
monotone complementarity problems,” Mathematics of Operations Research, vol. 15, pp. 662–675, 1990.
8. L. McLinden, “An analogue of Moreau’s proximation theorem, with application to the nonlinear complementarity problem,” Pacific Journal of Mathematics, vol. 88, pp. 101–161, 1980.
9. N. Megiddo, “Pathways to the optimal set in linear programming,” In N. Megiddo, editor, Progress in
Mathematical Programming: Interior Point and Related Methods, pp. 131–158. Springer-Verlag: New York,
1989. Identical version in Proceedings of the 6th Mathematical Programming Symposium of Japan, Nagoya,
Japan, 1986, pp. 1–35.
10. N. Megiddo and M. Shub, “Boundary behavior of interior point algorithms in linear programming,” Mathematics of Operations Research, vol. 14, pp. 97–114, 1989.
11. R. D. C. Monteiro, “Convergence and boundary behavior of the projective scaling trajectories for linear
programming,” Mathematics of Operations Research, vol. 16, pp. 842–858, 1991.
12. R. D. C. Monteiro and J.-S. Pang, “Properties of an interior-point mapping for mixed complementarity
problems,” Mathematics of Operations Research, vol. 21, pp. 629–654, 1996.
13. R. D. C. Monteiro and T. Tsuchiya, “Limiting behavior of the derivatives of certain trajectories associated
with a monotone horizontal linear complementarity problem,” Mathematics of Operations Research, vol.
21, pp. 793–814, 1996.
14. R. T. Rockafellar, Convex Analysis, Princeton University Press: Princeton, NJ, 1970.
15. C. Witzgall, P. T. Boggs and P. D. Domich, “On the convergence behavior of trajectories for linear programming,” In J. C. Lagarias and M. J. Todd, editors, Mathematical Developments Arising from Linear
Programming: Proceedings of a Joint Summer Research Conference held at Bowdoin College, Brunswick,
Maine, USA, June/July 1988, volume 114 of Contemporary Mathematics, pp. 161–187, American Mathematical Society: Providence, Rhode Island, USA, 1990.