The trust region subproblem with non

Math. Program., Ser. A
DOI 10.1007/s10107-014-0749-1
FULL LENGTH PAPER
The trust region subproblem with non-intersecting
linear constraints
Samuel Burer · Boshi Yang
Received: 22 February 2013 / Accepted: 21 January 2014
© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2014
Abstract This paper studies an extended trust region subproblem (eTRS) in which
the trust region intersects the unit ball with m linear inequality constraints. When
m = 0, m = 1, or m = 2 and the linear constraints are parallel, it is known that the
eTRS optimal value equals the optimal value of a particular convex relaxation, which
is solvable in polynomial time. However, it is also known that, when m ≥ 2 and at
least two of the linear constraints intersect within the ball, i.e., some feasible point of
the eTRS satisfies both linear constraints at equality, then the same convex relaxation
may admit a gap with eTRS. This paper shows that the convex relaxation has no gap
for arbitrary m as long as the linear constraints are non-intersecting.
Keywords Trust-region subproblem · Second-order cone programming ·
Semidefinite programming · Nonconvex quadratic programming
Mathematics Subject Classification
90C20 · 90C22 · 90C25 · 90C26 · 90C30
1 Introduction
The classical trust region subproblem minimizes a nonconvex quadratic objective over
the unit ball:
S. Burer (B)
Department of Management Sciences, University of Iowa, Iowa City, IA 52242-1994, USA
e-mail: [email protected]
B. Yang
Department of Mathematics, University of Iowa, Iowa City, IA 52242-1419, USA
e-mail: [email protected]
123
S. Burer, B. Yang
v(T0 ) := min x T Qx + c T x : x ≤ 1 .
(T0 )
x
(T0 ) is an important subproblem in trust region methods for nonlinear optimization
and has drawn intense research interest [4–7,10,13]. In particular, even though (T0 ) is
nonconvex, it can be solved efficiently both in theory and practice. Several extensions
of (T0 ) have been proposed that enforce additional constraints on the trust region, e.g.,
parallel linear constraints [14], or a second full-dimensional ellipsoidal constraint [3].
The paper [9] replaces x ≤ 1 with the more general constraint l ≤ q(x) ≤ u, where
q(x) is an arbitrary quadratic function. An important theoretical and practical issue is
whether such extensions can still be solved efficiently. In this paper, we investigate the
theoretical tractability of the following extension of (T0 ), which enforces m additional
linear inequality constraints:
v(Tm ) := min x T Qx + c T x :
x
x ≤ 1
.
aiT x ≤ bi (i = 1, . . . , m)
(Tm )
A natural starting point is of course (T0 ), which has the following polynomial-time
solvable semidefinite programming (SDP) relaxation:
v(T0 ) ≥ v(R0 ) := min Q • X + c T x : trace(X ) ≤ 1, X x x T .
(R0 )
x,X
Here, X is a symmetric matrix, Q • X is the matrix inner product, X x x T is
equivalent to the convex constraint
1
x
xT
X
0
by the Schur complement theorem, and the constraint trace(X ) ≤ 1 makes the feasible
region of (R0 ) compact. Let F(R0 ) denote the feasible region of (R0 ). One can show—
by appealing to Barvinok’s or Pataki’s theories concerning the rank of extreme points
of semidefinite feasibility systems [1,8], for example—that every extreme point of
F(R0 ) satisfies X = x x T , which guarantees that v(R0 ) actually equals v(T0 ) since
some optimal solution of (R0 ) must occur at an extreme point.
Sturm and Zhang [12] and Burer and Anstreicher [2] study the following
polynomial-time solvable relaxation of (T1 ), in this case (R0 ) with an added secondorder cone (SOC) constraint:
trace(X ) ≤ 1, X x x T
v(T1 ) ≥ v(R1 ) := min Q • X + c x :
b1 x − Xa1 ≤ b1 − a1T x
x,X
T
.
(R1 )
The SOC constraint is constructed [12] by relaxing the valid quadratic SOC constraint
(b1 − a1T x)x = (b1 − a1T x)x ≤ b1 − a1T x and is called an SOC-RLT constraint
[2] since its construction is closely related to the reformulation-linearization technique
of [11]. Sturm and Zhang prove v(T1 ) = v(R1 ), extending the case for m = 0,
123
The trust region subproblem
while Burer and Anstreicher show further that every extreme point of F(R1 ) satisfies
X = x x T , where F(R1 ) is the feasible set of (R1 ).
Ye and Zhang [14] and Burer and Anstreicher [2] also studied (T2 ), which has the
following polynomial-time solvable relaxation:
v(T2 ) ≥ v(R2 )
⎧
⎨
:= min Q • X +c T x :
x,X ⎩
⎫
trace(X ) ≤ 1, X x x T
⎬
. (R2 )
bi x − Xai ≤ bi − aiT x (i = 1, 2)
⎭
b1 b2 − b2 a1T x − b1 a2T x + a1T Xa2 ≥ 0
Extending (R1 ), this relaxation also contains the SOC-RLT constraint for a2T x ≤
b2 and a new linear RLT constraint reflecting the valid quadratic inequality (b1 −
a1T x)(b2 − a2T x) ≥ 0.
Ye and Zhang first showed that if the complementarity condition (b1 − a1T x)(b2 −
T
a2 x) = 0 is added to (T2 ), then the corresponding relaxation with b1 b2 − b2 a1T x −
b1 a2T x +a1T Xa2 = 0 admits no gap. Based on this complementarity result, the authors
then showed that the case of (T2 ) with a1 a2 could be solved in polynomial-time via
several steps. Burer and Anstreicher also considered a1 a2 and extended the results
of Ye and Zhang by showing that the extreme points of F(R2 ) satisfy X = x x T . So
v(T2 ) = v(R2 ) in this case, thus providing a second proof that this special case of
(T2 ) is polynomial-time solvable—but this time in a single step.
On the other hand, Burer and Anstreicher gave a counter-example for which v(T2 ) >
v(R2 ) when a1 ∦ a2 . This example had the property that a1T x ≤ b1 and a2T x ≤ b2
intersected inside the unit ball—more precisely, some x in the unit ball simultaneously
made both linear constraints tight—leaving open the possibility that v(T2 ) = v(R2 )
could still hold when the two inequalities are non-intersecting.
In this paper, we study the following polynomial-time solvable relaxation of (Tm ),
which includes all possible SOC-RLT and RLT constraints:
v(Tm ) ≥ v(Rm ) : = min
x,X
Q • X + cT x
s. t. trace(X ) ≤ 1, X x x T
bi x − Xai ≤ bi − aiT x
(Rm )
1≤ i ≤m
bi b j − b j aiT x − bi a Tj x + aiT Xa j ≥ 0
1≤ i < j ≤m
In light of the results in [2], we focus only on the non-intersecting case, that is, when
there exists no x feasible for (Tm ) satisfying aiT x = bi and a Tj x = b j for some i < j.
Note that the RLT constraints corresponding to i = j are implied by X x x T , and by
the symmetry of X , the RLT constraint for i > j is equivalent to the RLT constraint
for (ı̂, jˆ) = ( j, i) with ı̂ < jˆ.
Let F(Rm ) denote the feasible set of (Rm ). For the non-intersecting case, we will
prove that every extreme point of F(Rm ) satisfies X = x x T , which immediately
implies v(Tm ) = v(Rm ) and that (Tm ) is polynomial-time solvable. This is our main
goal. Combined with the results of [2], we thus achieve a very clear demarcation of
when the relaxation (Rm ) achieves v(Tm ) = v(Rm ) generally: precisely when the
123
S. Burer, B. Yang
linear constraints aiT x ≤ bi are non-intersecting in the unit ball. We also discuss a
slight extension in the last section of the paper.
2 Ye and Zhang’s analysis revisited
As motivation for our result, we would first like to discuss how the original analyis of
Ye and Zhang [14], which solves the case of (T2 ) with a1 a2 in several polynomialtime steps, extends naturally to the non-intersecting case of this paper. In other words,
the same steps solve the non-intersecting case of (T2 ) in polynomial time. Thus, at
least for m = 2, our result may not be surprising. On the other hand, we could not see
how to extend the approach of Ye and Zhang to general m, and so to the best of our
knowledge, our general approach in this paper contributes something new.
In Sect. 2.3 of their paper [14], Ye and Zhang proved that
⎧
⎨
⎫
x ≤ 1
⎬
aiT x ≤ bi (i = 1, 2)
v(T2c ) := min x T Qx + c T x :
x ⎩
⎭
(b1 − a1T x)(b2 − a2T x) = 0
(T2c )
is solved exactly by the SOCP-SDP relaxation
⎧
⎨
⎫
trace(X ) ≤ 1, X x x T
⎬
v(R2c ) := min Q • X + c T x : bi x − Xai ≤ bi − aiT x (i = 1, 2)
.
⎭
x,X ⎩
b1 b2 − b2 a1T x − b1 a2T x + a1T Xa2 = 0
(R2c )
In words, v(R2c ) = v(T2c ).
In Sect. 4 of [14], Ye and Zhang then analyzed the following specific case of (T2 ),
which is essentially equivalent to (T2 ) with a1 a2 :
min x T Qx + c T x :
x
x ≤ 1
.
−1 ≤ a0T x − b0 ≤ 1
(1)
They argued that it could be solved in two steps. First, assume that either −1 ≤ a0T x−b0
or a0T x − b0 ≤ 1 is binding at optimality. Then (a0T x − b0 + 1)(1 − a0T x + b0 ) = 0
is valid at optimality, and hence solving an instance of (R2c ) recovers the optimal
value. Second, if −1 < a0T x − b0 < 1 at optimality, then the various candidates
for the optimal solution of (T0 ), when both linear constraints are dropped, can be
examined. From the theory known for (T0 ), these candidates are solutions of
(Q + μI )x = − 21 c,
x = 1,
where μ varies over a list of three potential optimal values that can be pre-computed
separately. However, it is not enough for a candidate x to be optimal for (T0 ). It must
also be feasible, i.e., it must satisfy
(Q + μI )x = − 21 c,
123
x = 1,
−1 ≤ a0T x − b0 ≤ 1
The trust region subproblem
simultaneously. To find such candidates, Ye and Zhang suggest to solve
(Q + μI )x = − 21 c
T
T
,
max a0 x − b0 + 1 1 − a0 x + b0 :
x
x = 1
(2)
which is polynomial-time solvable because, after restricting to the affine subspace
(Q + μI )x = − 21 c, it is an instance of the equality-constrained trust-region subproblem. This optimization can be interpreted as examining the optimal solutions of
(T0 ) corresponding to μ, while simultaneously trying to make x feasible with respect
to the constraints −1 ≤ a0T x − b0 ≤ 1. In particular, if the optimal value of (2) is
nonnegative, then the associated optimal solution has met both constraints and is thus
feasible for (1). This uses the fact that the linear constraints are parallel and cannot
both be violated simultaneously. After considering all three μ values, the candidates
can be judged one-by-one for feasibility and global optimality.
If the constraints −1 ≤ a0T x − b0 ≤ 1 are replaced by the more general nonintersecting a1T x ≤ b1 and a2T x ≤ b2 , Ye and Zhang’s analysis carries through. The
first case of binding (b1 − a1T x)(b2 − a2T x) = 0 works the same by solving (R2c ). For
the second case, one analogously solves
(Q + μI )x = − 21 c
T
T
.
max (b1 − a1 x)(b2 − a2 x) :
x
x = 1
(3)
Here again, a nonnegative optimal value guarantees that the associated associated
optimal solution x̄ satisfies both aiT x̄ ≤ bi (i = 1, 2), which can be argued as follows.
We need only eliminate the possibility that, without loss of generality, a1T x̄ > b1 .
We consider two subcases. First, assume for contradiction that a1T x̄ > b1 and
T
a2 x̄ = b2 , and let x̂ be feasible for (T2 ) with a1T x̂ < b1 and a2T x̂ = b2 . Such
a point exists when both linear constraints are non-redundant and, as assumed, nonintersecting. Then some convex combination y of x̄ and x̂ has a1T y = b1 and a2T y = b2 ,
which contradicts the non-intersecting property. Now assume for contradiction that
a1T x̄ > b1 and a2T x̄ > b2 . Let x̂ be any feasible point of (T2 ). Then some convex
combination y of x̄ and x̂ has a1T y > b1 and a2T y = b2 (or possibly the indices 1
and 2 are switched). So we are back to the previous subcase, providing the desired
contradiction.
3 Preliminaries
With the result of Sect. 2 as motivation, we now discuss some preliminary items in
preparation for Sect. 4. In terms of notation, F(Tm ) and F(Rm ) denote the feasible
sets of (Tm ) and (Rm ), respectively. In addition, given (x, X ), we define
Y (x, X ) :=
1
x
xT
X
123
S. Burer, B. Yang
and remark that X = x x T if and only if Y (x, X ) is rank-1. Accordingly, we also say
that (x, X ) is rank-1 if X = x x T .
We assume throughout that F(Tm ) = ∅ and formally state our assumption that no
two hyperplanes defined by aiT x = bi and a Tj x = b j intersect within the unit ball.
Assumption 1 For all i < j, there exists no x ∈ F(Tm ) such that aiT x = bi and
a Tj x = b j .
Note that Assumption 1 can be verified in polynomial time by checking a polynomial
number of convex feasibility problems. We will also need the following result, which
has been discussed in the introduction:
Lemma 1 Every extreme point (x, X ) ∈ F(R0 ) satisfies X = x x T .
Another important result for Sect. 4 demonstrates how any (x, X ) ∈ F(Rm ) with
aiT x < bi gives rise to a special vector z i ∈ F(Tm ).
Lemma 2 Suppose (x, X ) ∈ F(Rm ) with aiT x < bi for some i. Define
z i := (bi − aiT x)−1 (bi x − Xai )
(4)
Then z i ∈ F(Tm ).
Proof First, the ith SOC-RLT constraint of (Rm ) guarantees z i ≤ 1. Furthermore,
X x x T implies
(bi − aiT x)(bi − aiT z i ) = (bi − aiT x)bi − aiT (bi x − Xai ) = bi2 − 2bi aiT x + aiT Xai
≥ bi2 − 2bi aiT x + aiT x x T ai = (bi − aiT x)2 > 0.
Finally, for all j = i, the RLT constraints of (Rm ) and symmetry of X imply
(bi − aiT x)(b j − a Tj z i ) = (bi − aiT x)b j − a Tj (bi x − Xai )
= bi b j − b j aiT x − bi a Tj x + aiT Xa j
≥ 0.
This completes the proof.
Finally, Sect. 4 requires a simple fact about the extreme points of the intersection
of a compact convex set with a half-space.
Lemma 3 Let C be a compact convex set, and let H be a half-space. Every extreme
point of C ∩ H may be expressed as the convex combination of at most two extreme
points in C.
Proof Suppose that H is defined by the linear
inequality α T x ≤ β, and let x̄ ∈ C ∩ H
be extreme. Since x̄ ∈ C, we may write x̄ =
k∈K λ̄k x̄k , where K is some index set,
each x̄k is extreme in C, each λ̄k > 0, and k∈K λ̄k = 1.
123
The trust region subproblem
linear function x(λ) :=
For a vector variable λ with the same length as λ̄, define the
λ
x̄
.
In
words,
x(λ)
outputs
the
linear
combination
k
k
k∈K
k∈K λk x̄ k of the fixed
For example, x(λ̄) = x̄. Also define the polytope L := {λ ≥ 0 : α T x(λ) ≤
x̄k .
β, k∈K λk = 1}, which has two features. First, because λ ≥ 0 and the sum of its
entries equals 1, x(λ) defines a convex combination of the vectors x̄k ; so x(λ) ∈ C.
Second, the constraint α T x(λ) ≤ β ensures x(λ) ∈ H . Overall, λ ∈ L implies
x(λ) ∈ C ∩ H . In addition, after adding a slack variable, one can recast L as a
standard-form polytope with the general structure {z ≥ 0 : Az = b}, where the
number of rows in A is 2, i.e., the basis size is 2. Then every extreme point z has at
most two positive entries, which also ensures that every extreme point λ ∈ L has at
most two positive entries.
It holds
feasible for L since x̄ = x(λ̄) ∈ C ∩ H . Hence, we can write
thatj λ̄ > 0 is
j
ρ
λ
,
where
λ̄ =
j
j = 1, each ρ j > 0, and each λ is extreme in L. By
j
jρ
j
j
expanding, this means x̄ = j ρ j x(λ ) with each x(λ ) ∈ C ∩ H . Since x̄ is extreme
in C ∩ H , it holds that every x(λ j ) = x̄. Since λ j has at most two positive entries,
this completes the proof.
4 The result
We would like to prove v(Rm ) = v(Tm ) under Assumption 1, and we will accomplish
this by showing that every extreme point (x, X ) of the compact F(Rm ) satisfies X =
x x T . Our proof is by induction on m, where Lemma 1 with m = 0 serves as the base
case, i.e., every extreme (x, X ) ∈ F(R0 ) satisfies X = x x T . The induction hypothesis
is thus as follows:
Assumption 2 Given 1 ≤ i ≤ m, every extreme point (x, X ) ∈ F(Ri−1 ) satisfies
X = xxT .
Furthermore, our proof is broken down into two cases:
Case 1 Some constraint aiT x ≤ bi is redundant for F(Tm ).
Case 2 No constraint aiT x ≤ bi is redundant for F(Tm ).
Case 1 can be handled immediately.
Theorem 1 For Case 1, every extreme (x, X ) ∈ F(Rm ) satisfies X = x x T .
Proof Without loss of generality, assume amT x ≤ bm is redundant. Because F(Rm−1 )
is a relaxation of F(Rm ), (x, X ) ∈ F(Rm−1 ). Hence, by Assumption 2, Y (x, X ) may
T
be expressed as the convex combination of rank-1 matrices of the form x̄1 x̄1 , where
each x̄ ∈ F(Tm−1 ). Such x̄ also satisfy amT x̄ ≤ bm by redundancy, and so (x, X ) is
the convex combination of points (x̄, x̄ x̄ T ) ∈ F(Rm ). Since (x, X ) is extreme, this
implies X = x x T .
Now assume Case 2. Propositions 1 and 2 are the key results leading to Theorem 2
below, but first we state a critical lemma that applies in Case 2.
123
S. Burer, B. Yang
Lemma 4 For Case 2, suppose x satisfies x ≤ 1 and aiT x = bi for some i. Then
x ∈ F(Tm ).
Proof Because aiT x ≤ bi is non-redundant and F(Tm ) is convex, there exists x̄ ∈
F(Tm ) such that aiT x̄ = bi . Now suppose x ∈ F(Tm ), i.e., a Tj x > b j for all j ∈
J , where J = ∅ is some collection of indices not including i. Then some convex
combination of x̄ and x, say x̂, is feasible for (Tm ) and satisfies aiT x̂ = bi and a Tj x̂ = b j
for some j ∈ J . However, this contradicts Assumption 1, so x is in fact feasible for
(Tm ).
Proposition 1 For Case 2, let (x, X ) ∈ F(Rm ) be extreme such that some SOC-RLT
constraint is active. Then X = x x T .
Proof Assume without loss of generality that b1 x − Xa1 = b1 − a1T x, and consider (R1 ) based on the single constraint a1T x ≤ b1 . Since (x, X ) is also in F(R1 ),
Assumption 2 implies
Y := Y (x, X ) =
λk
1 1 T
xk
xk
,
k
where
k
λk = 1 and, for each k, xk ∈ F(T1 ) and λk > 0. Note that
b1 − a1T x
b1 x − Xa1
T 1
1
b1
=
λk
xk xk
−a1
k
1
=
.
λk (b1 − a1T xk )
xk
b1
=Y
−a1
(5)
k
Since the left-hand side of (5) is on the boundary of the SOC, each summand λk (b1 −
b −a T x 1
a1T xk ) x1k on the right is either 0 or parallel to b 1x−Xa
(which itself could be 0). As
1
1
1
λk > 0 and xk = 0, we can thus separate the indices k into two groups (using new
indices j and to distinguish the groups):
Y =
j
: a1T x j =b1
λj
1 1 T
xj
xj
+
: a1T x <b1
λ
1 1 T
x
x
.
Each x , if there exist any, must be equal to z 1 := (b1 −a1T x)−1 (b1 x − Xa1 ) since x1 is
b −a T x 1
parallel to b 1x−Xa
. Note that the existence of at least one implies a1T x < b1 so that
1
1
z 1 is well-defined. By (4) and Lemma 2, z 1 ∈ F(Tm ). In addition, each x j ∈ F(Tm )
by Lemma 4. Overall, we see that Y is the convex combination of rank-1 solutions to
(Tm ). Therefore, Y is rank-1 because (x, X ) is extreme.
Proposition 2 For Case 2, let (x, X ) ∈ F(Rm ) be extreme such that no SOC-RLT
constraint is active. Then X = x x T .
123
The trust region subproblem
Proof Suppose that all RLT constraints corresponding to the pairs (1, m), . . . ,
(m − 1, m) are inactive at (x, X ). Then (x, X ) is also extreme for (Rm−1 ), and so
X = x x T by Assumption 2. On the other hand, suppose the (i, m)th RLT constraint is
tight; say i = 1 without loss of generality. We will derive a contradiction that (x, X )
is extreme to complete the proof.
We first claim that the other RLT constraints corresponding to (2, m), . . . ,
(m − 1, m) are inactive. Let z m be given by (4); it is feasible for (Tm ) by Lemma 2.
Moreover, the proof of Lemma 2 shows that akT z m = bk if and only if the (k, m)th
RLT constraint is tight. Since at most one akT z m ≤ bk can be tight by Assumption 1,
at most one of the (k, m)th RLT constraints can be active, as claimed.
Let G denote the intersection of F(Rm−1 ) with the single RLT constraint b1 bm −
bm a1T x − b1 amT x + a1T Xam ≥ 0. So (x, X ) is extreme for G. Then Lemma 3 implies
(x, X ) can be expressed as a convex combination of at most two extreme points of
(Rm−1 ). By Assumption
bm rank(Y ) ≤ 2, where Y := Y (x, X ).
b1 2, this means
and
t
:=
Defining s := −a
−am , we see
1
b1
s Ys =
−a1
T T
xT
X
1
x
b1
−a1
= b12 − 2b1 a1T x + a1T Xa1
≥ b12 − 2b1 a1T x + a1T x x T a1 = (b1 − a1T x)2
>0
and similarly t T Y t > 0. Notice also that the tight RLT constraint can be expressed as
s T Y t = 0. Next consider the equation
⎞
⎛ T
s Ys
sT
W := ⎝ t T ⎠ Y s t I = ⎝ t T Y s
I
Ys
⎛
sT Y t
tT Yt
Yt
⎞ ⎛ T
s Ys
sT Y
tT Y ⎠ = ⎝ 0
Y
Ys
0
tT Yt
Yt
⎞
sT Y
tT Y ⎠ .
Y
We have W 0 and rank(W ) ≤ rank(Y ) ≤ 2. Then the Schur complement theorem
implies M := Y − (s T Y s)−1 (Y ss T Y ) − (t T Y t)−1 (Y tt T Y ) 0 and rank(M) =
rank(W ) − 2 ≤ 2 − 2 = 0, i.e., M = 0 or
Y = (s T Y s)−1 (Y s)(Y s)T + (t T Y t)−1 (Y t)(Y t)T .
We now prove several properties of
1
Ys =
x
xT
X
b1
−a1
b1 − a1T x
=
.
b1 x − Xa1
First, Y s lies in the interior of the SOC because b1 x − Xa1 < b1 −a1T x. In particular,
z 1 in (4) is well-defined with z 1 < 1. Furthermore, t T Y s = 0 implies t T z11 = 0, or
equivalently amT z 1 = bm . Hence, Lemma 4 implies z 1 ∈ F(Tm ). In a similar manner,
we can prove from Y t that z m defined by (4) is feasible for (Tm ) with a1T z m = b1 .
Also, we see z 1 = z m by Assumption 1.
123
S. Burer, B. Yang
T
T
Summarizing, Y = α z11 z11 + β z1m z1m for appropriate positive scalars α, β
and z 1 , z m ∈ F(Tm ) with z 1 = z m . Since the top-left entry in Y equals 1, this is a
proper convex combination of points in F(Rm ). However, this contradicts that (x, X )
is extreme.
We are now ready to state the desired theorem regarding Case 2 and the main result
of the paper (Corollary 1).
Theorem 2 For Case 2, every extreme (x, X ) ∈ F(Rm ) satisfies X = x x T .
Proof Proposition 1 covers the case when (x, X ) has an active SOC-RLT constraint,
while Proposition 2 handles when (x, X ) has no such active constraint.
Corollary 1 Under Assumption 1, v(Rm ) = v(Tm ).
Proof Theorem 1 covers Case 1, and Theorem 2 covers Case 2.
In [2], Burer and Anstreicher gave a counter-example for which v(R2 ) < v(T2 )
when Assumption 1 is violated:
⎛
2
Q =⎝3
12
3
−19
6
⎞
12
6 ⎠,
0
⎛
⎞
14
c = ⎝14⎠ ,
9
−x1 ≤ 21 ,
x1 + 65 x2 ≤ 0.
In this instance, v(T2 ) ≈ −12.9419 with x ∗ ≈ (−0.8536, 0.2947, 0.4294)T , while
v(R2 ) ≈ −13.8410 with optimal
⎛
⎞
⎛
−0.3552
0.2595
x̄ ≈ ⎝ 0.3881 ⎠, X̄ ≈ ⎝−0.2248
−0.2119
−0.0913
−0.2248
0.4495
−0.0694
⎞
−0.0913
−0.0694⎠,
0.2911
and the numerical rank of X̄ is 3. We wish to examine this counter-example from the
viewpoint of our proof. One can verify that Y (x̄, X̄ ) makes both SOC-RLT constraints
active in (R2 ), and so considering that (x̄, X̄ ) is likely to be extreme in (Rm ) from the
numerical point of view, Proposition 1 is violated in this case. Of course, Proposition 1
is based on Lemma 4, which heavily uses Assumption 1.
5 An extension
Consider the following assumption, which is slightly relaxed compared to Assumption 1.
Assumption 3 For all i < j, there exists no x ∈ F(Tm ) such that x < 1, aiT x = bi ,
and a Tj x = b j .
In comparison to Assumption 1, Assumption 3 allows the linear constraints to intersect
on the boundary of the unit ball. Every such intersection point on the boundary must
123
The trust region subproblem
clearly be an extreme point of the polyhedron P := {x : aiT x ≤ bi (i = 1, . . . , m)}.
For example, when the dimension of x is 2, Assumption 3 allows (Tm ) to model
polytopes P inscribed in the unit disk.
We have the following extension of Corollary 1.
Proposition 3 Under Assumption 3, v(Rm ) = v(Tm ).
Proof For > 0 small, tighten the constraint x ≤ 1 of (Tm ) to x ≤ 1 − in order to form a new extended trust region subproblem (eTRS) (Tm ) satisfying
) can
Assumption 1. Relative to (Rm ), a suitably modified convex relaxation (Rm
) = v(T ) by Corollary 1. The result follows because
be derived such that v(Rm
m
) = lim
v(Rm ) = lim→0 v(Rm
→0 v(Tm ) = v(Tm ) as all involved feasible sets are
compact.
We end with an application of Proposition 3. Consider min{x T Qx + c T x : x ∈ P},
where P is the regular pentagon inscribed in the unit disk with (1, 0) as an extreme
point and
−8
8
−1
Q=
,
c=
.
8 −14
−1
For example, P could be represented by the system
⎛
⎞
⎛
⎞
1.3764
1.3764
1
⎜−0.3249 1 ⎟ ⎜0.8507⎟
⎜
⎟
⎜
⎟
⎜−1.0000 0 ⎟ x1 ≤ ⎜0.8090⎟ .
⎜
⎟
⎜
⎟ x2
⎝0.8507⎠
⎝−0.3249 −1⎠
1.3764
1.3764 −1
Since the redundant constraint x ≤ 1 is not given explicitly, a reasonable approach
would be to solve the following relaxation that only contains the RLT constraints:
min
x,X
Q • X + cT x
s. t. X x x T
bi b j − b j aiT x − bi a Tj x + aiT Xa j ≥ 0
This yields optimal
0
0.8090
x̄ ≈
,
X̄ ≈
0
0
0
,
0.8090
i < j.
Q • X̄ + c T x̄ ≈ −17.7984.
Note that the numerical rank of X̄ is 2. According to Proposition 3, however, we can
obtain the exact optimal value by solving (Rm ). Doing so yields
0.3090
0.0955 −0.2939
∗
∗
x ≈
,
X ≈
,
−0.9511
−0.2939 0.9045
Q • X ∗ + c T x ∗ ≈ −17.4873.
123
S. Burer, B. Yang
Indeed, the numerical rank of X ∗ is 1, showing that x ∗ is a global minimizer of
x T Qx + c T x over P.
Acknowledgments The authors are in debt to two anonymous referees who provided detailed suggestions
that have improved the paper immensely. The authors would also like to thank Kurt Anstreicher for numerous
discussions and for suggesting first that the case m = 2 might be extensible to arbitrary m.
References
1. Barvinok, A.: Problems of distance geometry and convex properties of quadratic maps. Discrete Comput. Geom. 13, 189–202 (1995)
2. Burer, S., Anstreicher, K.: Second-order cone constraints for extended trust-region subproblems. University of Iowa (March 2011). Revised July 2012 and Nov 2012. To appear in SIAM J. Optim.
3. Celis, M.R., Dennis, J.E., Tapia, R.A.: A trust region strategy for nonlinear equality constrained optimization. In: Numerical Optimization, vol. 1984, pp. 71–82 (Boulder, Colo., 1984). SIAM, Philadelphia, PA (1985)
4. Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. MPS/SIAM Series on Optimization.
SIAM, Philadelphia, PA (2000)
5. Fu, M., Luo, Z.-Q., Ye, Y.: Approximation algorithms for quadratic programming. J. Comb. Optim. 2,
29–50 (1998)
6. Gould, N.I.M., Lucidi, S., Roma, M., Toint, P.L.: Solving the trust-region subproblem using the Lanczos
method. SIAM J. Optim. 9(2), 504–525 (1999). (electronic)
7. Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat. Comput. 4(3), 553–572
(1983)
8. Pataki, G.: On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal
eigenvalues. Math. Oper. Res. 23, 339–358 (1998)
9. Pong, T., Wolkowicz, H.: The generalized trust region subproblem. University of Waterloo, Waterloo,
Ontario (2012)
10. Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems with applications to
large scale minimization. Math. Program. 77(2, Ser. B):273–299 (1997).
11. Sherali, H.D., Adams, W.P.: A Reformulation-Linearization Technique for Solving Discrete and Continuous Nonconvex Problems. Kluwer, Dordrecht (1997)
12. Sturm, J.F., Zhang, S.: On cones of nonnegative quadratic functions. Math. Oper. Res. 28(2), 246–267
(2003)
13. Ye, Y.: A new complexity result on minimization of a quadratic function with a sphere constraint. In:
Floudas, C., Pardalos, P. (eds.) Recent Advances in Global Optimization. Princeton University Press,
Princeton, NJ (1992)
14. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14(1), 245–267 (2003).
(electronic)
123