Burer’s Key Assumption for Semidefinite and
Doubly Nonnegative Relaxations
Florian Jarre
Mathematisches Institut
Heinrich-Heine-Universität Düsseldorf, Germany
Sept 22, 2010
Abstract. Burer has shown that completely positive relaxations of nonconvex quadratic programs with nonnegative and binary variables are exact when the binary variables satisfy a
so-called key assumption. Here we show that introducing binary variables to obtain an equivalent problem that satisfies the key assumption will not improve the semidefinite relaxation,
and only marginally improve the doubly nonnegative relaxation.
Key words. Doubly nonnegative relaxation, completely positive program, key assumption.
1 Introduction
Burer [2] has shown that completely positive relaxations of nonconvex quadratic programs with
nonnegative and binary variables are exact – in the sense that the objective values coincide –
when the binary variables satisfy a so-called key assumption. The key assumption can always be
enforced by introducing slack variables for the binary variables. If these slack variables are not
introduced and the key assumption is violated, then the completely positive relaxation might no
longer be exact, see e.g. [1]. Here we consider the question whether the introduction of binary
slack variables may also improve the semidefinite or the doubly nonnegative relaxation. It turns
out that this is not the case; the semidefinite relaxation is invariant under the introduction of
binary slack variables, and when adding certain linear inequality constraints, the same is true
for the doubly nonnegative relaxation.
1.1 Notation
We use the notation $X \succeq 0$ to indicate that $X$ is symmetric positive semidefinite and $X \ge 0$ to indicate that $X$ has nonnegative entries. The cone of completely positive matrices is denoted by $C^*$, i.e.
\[
C^* = \operatorname{conv}\{\, xx^T \mid x \ge 0 \,\}
\]
is the convex hull of all symmetric rank-one matrices with nonnegative entries. The cone of doubly nonnegative matrices is denoted by $D$, $D := \{X \mid X \succeq 0,\ X \ge 0\}$. The dimension of $C^*$ or $D$ will always be evident from the context. The all-ones vector is denoted by $e$. Again, the dimension will be evident from the context.
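As a numerical illustration of this notation (not part of the original note), membership in $D$ can be tested up to a tolerance. The sketch below assumes NumPy; the function name `is_doubly_nonnegative` and the tolerance `tol` are hypothetical choices made for this example only.

```python
import numpy as np

def is_doubly_nonnegative(X, tol=1e-9):
    """Return True if X is symmetric, entrywise nonnegative and PSD (up to tol)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2 or X.shape[0] != X.shape[1]:
        return False
    if not np.allclose(X, X.T, atol=tol):
        return False
    if X.min() < -tol:
        return False
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues
    return bool(np.linalg.eigvalsh(X).min() >= -tol)

x = np.array([1.0, 0.0, 2.5])        # x >= 0
rank_one = np.outer(x, x)            # a generator x x^T of C^*, hence also in D
not_psd = np.array([[1.0, 2.0],
                    [2.0, 1.0]])     # nonnegative entries, eigenvalues 3 and -1
```

Here `rank_one` passes the test while `not_psd` fails it, since nonnegative entries alone do not imply positive semidefiniteness.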
1.2 Completely positive relaxation
We consider the nonconvex quadratic program
\[
\min \Bigl\{\, x^T Q x + 2c^T x \;\Bigm|\; x = \begin{pmatrix} x_B \\ x_C \end{pmatrix},\ x_i \in \{0,1\} \text{ for } i \in B,\ x_C \ge 0,\ Ax = b \,\Bigr\}, \tag{1}
\]
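where $B := \{1, \dots, k\}$ comprises the binary variables, and $C := \{k+1, \dots, n\}$ the continuous variables. The entries of $b$ are denoted by $b_i$ ($1 \le i \le m$), the rows of $A \in \mathbb{R}^{m \times n}$ are denoted by $a_i^T$, and the rank-one matrices $\begin{pmatrix} 0 \\ a_i \end{pmatrix} \begin{pmatrix} 0 \\ a_i \end{pmatrix}^T \in \mathbb{R}^{(1+n) \times (1+n)}$ by $A_i$.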
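For the moment let $K := C^*$ be the cone of completely positive matrices. For a given $x \in \mathbb{R}^n$ we define the matrix
\[
\hat X = \hat X(x) := \begin{pmatrix} 1 \\ x_B \\ x_C \end{pmatrix} \begin{pmatrix} 1 \\ x_B \\ x_C \end{pmatrix}^T =: \begin{pmatrix} 1 & x_B^T & x_C^T \\ x_B & X_B & Y_{BC} \\ x_C & Y_{BC}^T & X_C \end{pmatrix}. \tag{2}
\]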
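More generally, we will also use the partition of $\hat X$ in (2) when $\hat X$ has rank greater than one, and consider the completely positive relaxation of (1),
\[
\min \Bigl\{\, \begin{pmatrix} 0 & c^T \\ c & Q \end{pmatrix} \bullet \hat X \;\Bigm|\; \hat X \in K,\ x_B = \operatorname{diag}(X_B),\ Ax = b,\ A_i \bullet \hat X = b_i^2 \text{ for } 1 \le i \le m \,\Bigr\}. \tag{3}
\]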
To simplify the discussion we assume in the following that problem (3) (with $K = C^*$) is feasible. As $x_B \in [0,1]^k$ it is straightforward to identify further linear inequalities that are valid for all rank-one matrices $\hat X$ derived from feasible solutions of (1),
\[
X_B \le x_B e^T, \qquad X_B \ge x_B e^T + e x_B^T - ee^T, \qquad Y_{BC} \le e x_C^T. \tag{4}
\]
The condition $X_B \approx x_B x_B^T \le x_B e^T$ follows from $0 \le x_B \le e$ and, since $X_B = X_B^T$, it is equivalent to $X_B \le e x_B^T$; the second condition in (4) is derived from $(1-(x_B)_i)(1-(x_B)_j) \ge 0$ for $1 \le i, j \le k$, and the last condition from $(Y_{BC})_{i,j} \approx (x_B)_i (x_C)_j \le (x_C)_j$.
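The inequalities (4) can be spot-checked numerically on rank-one matrices. The following sketch assumes NumPy; the helper name `satisfies_conditions_4`, the tolerance, and the sample data are hypothetical, and $X$ denotes the lower right $n \times n$ block of $\hat X$ in (2).

```python
import numpy as np

def satisfies_conditions_4(x, X, k, tol=1e-12):
    """Check the three inequalities in (4) for x in R^n and X in R^{n x n}."""
    xB, xC = x[:k], x[k:]
    XB, YBC = X[:k, :k], X[:k, k:]
    e = np.ones(k)
    first = np.all(XB <= np.outer(xB, e) + tol)
    second = np.all(XB >= np.outer(xB, e) + np.outer(e, xB) - np.outer(e, e) - tol)
    third = np.all(YBC <= np.outer(e, xC) + tol)
    return bool(first and second and third)

k = 3
x = np.array([1.0, 0.0, 1.0, 0.3, 2.0])   # binary x_B, nonnegative x_C
X = np.outer(x, x)                        # rank-one lifting
ok_rank_one = satisfies_conditions_4(x, X, k)

X_bad = X.copy()
X_bad[0, 1] = X_bad[1, 0] = 0.5           # violates X_B <= x_B e^T in row 2, where (x_B)_2 = 0
ok_perturbed = satisfies_conditions_4(x, X_bad, k)
```

The rank-one lifting satisfies (4), whereas the perturbed matrix does not, which illustrates that (4) cuts off matrices that are not excluded by symmetry alone.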
Burer has shown in [2] that the relaxation (3) is exact in the sense that the optimal values of (1) and (3) coincide, if the constraints $Ax = b$, $x \ge 0$ imply $x_B \in [0,1]^k$. This implication is called the key assumption in [2]. Conditions (4) are not needed in [2]. As pointed out by Burer, a given problem can always be transformed to an equivalent problem that satisfies the key assumption by introducing nonnegative slack variables $s_i \ge 0$ for $i \le k$, and using the additional equations
\[
s_i = 1 - x_i \quad \text{for } i \le k. \tag{5}
\]
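To make the transformation concrete, the equations (5) can be appended to $Ax = b$ as additional rows acting on the extended variable $(x, s)$. The sketch below assumes NumPy; the function name `add_binary_slacks` and the sample data are hypothetical and only illustrate the bookkeeping.

```python
import numpy as np

def add_binary_slacks(A, b, k):
    """Return (A_aug, b_aug) for the system in (x, s): Ax = b and x_i + s_i = 1 for i <= k."""
    m, n = A.shape
    top = np.hstack([A, np.zeros((m, k))])                            # Ax = b, s not involved
    bottom = np.hstack([np.eye(k), np.zeros((k, n - k)), np.eye(k)])  # x_B + s = e
    return np.vstack([top, bottom]), np.concatenate([b, np.ones(k)])

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([2.0])
A_aug, b_aug = add_binary_slacks(A, b, k=2)

x = np.array([1.0, 0.0, 1.0])      # feasible for Ax = b with binary x_1, x_2
s = 1.0 - x[:2]                    # slack values from (5)
residual = A_aug @ np.concatenate([x, s]) - b_aug
```

Any nonnegative $(x, s)$ solving the augmented system has $(x_B)_i = 1 - s_i \le 1$, so the bound $x_B \in [0,1]^k$ is implied by the linear constraints together with nonnegativity.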
Since enforcing the key assumption for a problem by adding slack variables may strengthen the completely positive relaxation, one may ask whether adding slack variables also strengthens the semidefinite or the doubly nonnegative relaxation, where the cone $K = C^*$ is replaced with the semidefinite cone or the cone $K = D$ of doubly nonnegative matrices.
To this end we define the $(1+n+k) \times (1+n+k)$ matrix $E_{i,n+i}$ as a matrix that is all zero except for ones at the four positions in rows and columns $i+1$ and $i+1+n$ for some $i \in \{1, \dots, k\}$. (The matrices $E_{i,n+i}$ for the constraints (5) ($1 \le i \le k$) correspond to the matrices $A_i$ above for the linear constraints in (1) ($1 \le i \le m$).)
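For a given $x \in \mathbb{R}^n$, $s \in \mathbb{R}^k$ we define the matrix
\[
\hat Z = \hat Z(x, s) := \begin{pmatrix} 1 \\ x_B \\ x_C \\ s \end{pmatrix} \begin{pmatrix} 1 \\ x_B \\ x_C \\ s \end{pmatrix}^T =: \begin{pmatrix} 1 & x_B^T & x_C^T & s^T \\ x_B & X_B & Y_{BC} & Y_{B,s} \\ x_C & Y_{BC}^T & X_C & Y_{C,s} \\ s & Y_{B,s}^T & Y_{C,s}^T & S \end{pmatrix} = \begin{pmatrix} \hat X & \begin{matrix} s^T \\ Y_{B,s} \\ Y_{C,s} \end{matrix} \\ \begin{matrix} s & Y_{B,s}^T & Y_{C,s}^T \end{matrix} & S \end{pmatrix}. \tag{6}
\]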
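More generally again, we also consider matrices $\hat Z$ that are partitioned as in (6) but are not necessarily of rank one. Here, $s := e - x_B$, and $S \approx ss^T$ lead to a relaxation of (1) that always satisfies the key assumption,
\[
\begin{aligned}
\min \Bigl\{\, \begin{pmatrix} 0 & c^T \\ c & Q \end{pmatrix} \bullet \hat X \;\Bigm|\;& \hat Z \in K,\ x_B = \operatorname{diag}(X_B),\ s = \operatorname{diag}(S),\ x_B + s = e,\ Ax = b, \\
& E_{i,n+i} \bullet \hat Z = 1 \text{ for } 1 \le i \le k,\ A_i \bullet \hat X = b_i^2 \text{ for } 1 \le i \le m \,\Bigr\}.
\end{aligned} \tag{7}
\]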
We point out that in the case $K = C^*$, the constraint $s = \operatorname{diag}(S)$ is not necessary for Burer's result (same optimal values in (1) and (7)) to hold. Introducing the slack variable $s$, however, may indeed be necessary to guarantee equivalence of (1) and (7), see e.g. [1].
1.3 Doubly nonnegative relaxation
Before discussing the effect of the slack variable $s$ on the semidefinite or the doubly nonnegative relaxation we identify further constraints that can be added to (7):
Since $S$ is a relaxation for $(e - x_B)(e - x_B)^T$ and $X_B$ approximates $x_B x_B^T$, problem (7) can be augmented by additional linear constraints, namely
\[
S = X_B + ee^T - e x_B^T - x_B e^T. \tag{8}
\]
Likewise, since $Y_{B,s}$ is a relaxation of $x_B s^T$, and $x_B s^T = x_B (e - x_B)^T = x_B e^T - x_B x_B^T$, we may add the further constraints
\[
Y_{B,s} = x_B e^T - X_B, \tag{9}
\]
which imply, in particular, $(Y_{B,s})_{i,i} = 0$ for $1 \le i \le k$, since $x_B = \operatorname{diag}(X_B)$.
Finally, since $Y_{C,s} \approx x_C s^T = x_C (e - x_B)^T = x_C e^T - x_C x_B^T$ we may require
\[
Y_{C,s} = x_C e^T - Y_{BC}^T. \tag{10}
\]
Note that the conditions (4) simply state that the right-hand sides of (8), (9), and (10) are nonnegative. As detailed in the next proposition, the additional computational effort for solving (7) with (8), (9), and (10) compared to (3) or to (3), (4) does not pay off when $K$ is the semidefinite or the doubly nonnegative cone.
Proposition 1 When K is the semidefinite cone in (3) and in (7), both problems are equivalent – even if the key assumption is not satisfied, and even when constraints (8), (9), and
(10) are added to the formulation of (7). In case of K = D, problem (3) augmented with the
constraints (4) is equivalent to (7) (with or without (8), (9), and (10)).
Proof. Clearly, the $\hat X$-part of a feasible solution $\hat Z$ for (7) is also feasible for (3). Thus, it suffices to show that each $\hat X$ which is feasible for (3) can be augmented to a matrix $\hat Z$ that is feasible for (7). The augmentation by $s := e - x_B$ is given by the constraints in (7). We use (8), (9), (10) to define $S$, $Y_{B,s}$, and $Y_{C,s}$. It is straightforward to verify that all equality constraints in (7) are satisfied this way. Note that, by taking the Schur complement, $\hat Z \succeq 0$ iff
\[
\begin{pmatrix}
X_B - x_B x_B^T & Y_{BC} - x_B x_C^T & Y_{B,s} - x_B s^T \\
Y_{BC}^T - x_C x_B^T & X_C - x_C x_C^T & Y_{C,s} - x_C s^T \\
Y_{B,s}^T - s x_B^T & Y_{C,s}^T - s x_C^T & S - ss^T
\end{pmatrix} \succeq 0. \tag{11}
\]
As $\hat X$ is assumed to be feasible for (3) it follows that
\[
\begin{pmatrix}
X_B - x_B x_B^T & Y_{BC} - x_B x_C^T \\
Y_{BC}^T - x_C x_B^T & X_C - x_C x_C^T
\end{pmatrix} \succeq 0.
\]
We consider the last block column in (11). Note that by (8), (9), and (10), we have from $e - s = x_B$,
\[
Y_{B,s} - x_B s^T = x_B e^T - X_B - x_B s^T = x_B x_B^T - X_B,
\]
\[
Y_{C,s} - x_C s^T = x_C e^T - Y_{BC}^T - x_C s^T = x_C x_B^T - Y_{BC}^T,
\]
and
\[
S - ss^T = X_B + ee^T - e x_B^T - x_B e^T - (e - x_B)(e - x_B)^T = X_B - x_B x_B^T.
\]
Relation (11) is thus equivalent to
\[
\begin{pmatrix}
X_B - x_B x_B^T & Y_{BC} - x_B x_C^T & x_B x_B^T - X_B \\
Y_{BC}^T - x_C x_B^T & X_C - x_C x_C^T & x_C x_B^T - Y_{BC}^T \\
x_B x_B^T - X_B & x_B x_C^T - Y_{BC} & X_B - x_B x_B^T
\end{pmatrix} \succeq 0.
\]
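Multiplying the last block row and the last block column of this matrix by $-1$ is a congruence transformation that does not change the signature of the matrix. The resulting matrix has the block structure of the last matrix in (12), built from the positive semidefinite two-by-two block matrix displayed above, and is therefore positive semidefinite by Proposition 2 below.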
We recall that when using (8), (9), (10), the inequality Ẑ ≥ 0 is implied by (4).
We note that the Schur complement used in order to arrive at (11) and congruence transforms as used above do change the cone $C^*$, so that this proof would not apply to the completely positive relaxation.
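The construction in the proof can be spot-checked numerically for the semidefinite cone: starting from a positive semidefinite $\hat X$ of rank greater than one, the blocks defined through (8), (9), and (10) should again give a positive semidefinite $\hat Z$. The sketch below assumes NumPy; the helper names `augment` and `lifted`, the random test data, and the tolerance are hypothetical, and a numerical check of this kind illustrates the statement but does not replace the proof.

```python
import numpy as np

def augment(X_hat, k):
    """Extend X_hat (partitioned as in (2)) to Z_hat (partitioned as in (6))."""
    x = X_hat[1:, 0]
    xB, xC = x[:k], x[k:]
    XB = X_hat[1:k + 1, 1:k + 1]
    YBC = X_hat[1:k + 1, k + 1:]
    e = np.ones(k)
    s = e - xB
    S = XB + np.outer(e, e) - np.outer(e, xB) - np.outer(xB, e)  # (8)
    YBs = np.outer(xB, e) - XB                                   # (9)
    YCs = np.outer(xC, e) - YBC.T                                # (10)
    border = np.vstack([s[None, :], YBs, YCs])                   # last block column of (6)
    return np.block([[X_hat, border], [border.T, S]])

def lifted(x, X):
    """Assemble the matrix [[1, x^T], [x, X]]."""
    return np.block([[np.ones((1, 1)), x[None, :]], [x[:, None], X]])

rng = np.random.default_rng(0)
n, k = 5, 2
x = rng.random(n)
G = rng.standard_normal((n, 3))
X_hat = lifted(x, np.outer(x, x) + G @ G.T)   # PSD by the Schur complement, rank > 1
Z_hat = augment(X_hat, k)
min_eig = np.linalg.eigvalsh(Z_hat).min()     # expected to be nonnegative up to rounding
```

For a rank-one $\hat X(x)$ the same function reproduces $\hat Z(x, e - x_B)$ from (6), and the constraints $E_{i,n+i} \bullet \hat Z = 1$ hold by construction.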
The following simple result was used in the proof above:
Proposition 2 Let $K$ be one of the following cones: the semidefinite cone, the nonnegative cone, the doubly nonnegative cone, the completely positive cone, or the copositive cone. Here, the dimension of $K$ is unspecified and always follows from the context. Let
\[
M = \begin{pmatrix} A & B \\ B^T & C \end{pmatrix}
\]
be a symmetric matrix. Then,
\[
M \in K \iff \begin{pmatrix} M & M \\ M & M \end{pmatrix} \in K \iff \begin{pmatrix} A & B & A \\ B^T & C & B^T \\ A & B & A \end{pmatrix} \in K. \tag{12}
\]
Proof. For any of the cones $K$ above it is always true that principal submatrices of $M$ belong to $K$ if $M$ belongs to $K$. Thus, as the first and the last matrix in (12) are principal submatrices of the second, it suffices to show the implication
\[
M \in K \implies \begin{pmatrix} M & M \\ M & M \end{pmatrix} \in K.
\]
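For a semidefinite or a completely positive $M$ we may assume that
\[
M = \sum_i u^{(i)} \bigl(u^{(i)}\bigr)^T
\]
with $u^{(i)} \ge 0$ in the case of a completely positive matrix $M$. Then,
\[
\begin{pmatrix} M & M \\ M & M \end{pmatrix}
= \begin{pmatrix} \sum_i u^{(i)} \bigl(u^{(i)}\bigr)^T & \sum_i u^{(i)} \bigl(u^{(i)}\bigr)^T \\ \sum_i u^{(i)} \bigl(u^{(i)}\bigr)^T & \sum_i u^{(i)} \bigl(u^{(i)}\bigr)^T \end{pmatrix}
= \sum_i \begin{pmatrix} u^{(i)} \\ u^{(i)} \end{pmatrix} \begin{pmatrix} u^{(i)} \\ u^{(i)} \end{pmatrix}^T,
\]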
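establishing the claim. If $M$ is copositive, it suffices to verify that
\[
\begin{pmatrix} u \\ v \end{pmatrix}^T \begin{pmatrix} M & M \\ M & M \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = u^T M u + 2u^T M v + v^T M v = (u + v)^T M (u + v) \ge 0
\]
for all $u, v \ge 0$. But this follows directly from copositivity of $M$. For the nonnegative cone the implication is evident, and the doubly nonnegative case follows by combining the semidefinite and the nonnegative case.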
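Both ingredients of this proof are easy to check numerically on small examples. The sketch below assumes NumPy; the helper names `doubled` and `bordered` and the test matrices are hypothetical. It forms the second and third matrix in (12) for a random positive semidefinite $M$, and evaluates the quadratic-form identity of the copositive case for a matrix that is copositive but not positive semidefinite.

```python
import numpy as np

def doubled(M):
    """Return the block matrix [[M, M], [M, M]]."""
    return np.block([[M, M], [M, M]])

def bordered(M, p):
    """Return [[A, B, A], [B^T, C, B^T], [A, B, A]], where A is the leading p x p block of M."""
    n = M.shape[0]
    idx = list(range(n)) + list(range(n, n + p))   # principal submatrix of doubled(M)
    return doubled(M)[np.ix_(idx, idx)]

rng = np.random.default_rng(1)
G = rng.standard_normal((4, 2))
M_psd = G @ G.T                                    # PSD, rank 2
min_eig_doubled = np.linalg.eigvalsh(doubled(M_psd)).min()
min_eig_bordered = np.linalg.eigvalsh(bordered(M_psd, 2)).min()

M_cop = np.array([[0.0, 1.0],
                  [1.0, 0.0]])                     # copositive (nonnegative entries), not PSD
u, v = rng.random(2), rng.random(2)                # u, v >= 0
w = np.concatenate([u, v])
lhs = w @ doubled(M_cop) @ w
rhs = (u + v) @ M_cop @ (u + v)
```

Both minimal eigenvalues are nonnegative up to rounding, and `lhs` agrees with `rhs`, which is nonnegative because $u + v \ge 0$.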
Conclusion
In this short note the semidefinite relaxation of nonconvex quadratic programs with nonnegative and binary variables is investigated, both with and without additional slack variables for
the binary variables. Additional constraints (4) and (8)–(10) for both formulations are identified, and it is shown that the introduction of slack variables does not improve the relaxation.
Very similar results are established for the doubly nonnegative relaxation.
References
[1] I. Bomze, F. Jarre, A note on Burer’s copositive representation of mixed-binary QPs. http://www.optimization-online.org/DB_HTML/2009/08/2368.html (2009)
[2] S. Burer, On the copositive representation of binary and continuous nonconvex quadratic programs. Math. Prog. 120, 479–495 (2009)