Computing dense subgraphs with semidefinite programming

Jérôme MALICK∗ and Frédéric ROUPIN∗∗
∗ CNRS, Lab. Jean Kuntzmann, Grenoble
∗∗ LIPN-CNRS, University Paris XIII
Optimization 2011, Lisbon – July 2011
A combinatorial optimization problem
Find a subgraph of k vertices with the maximum number of edges
Example: graph with n = 8 vertices; what is the best subgraph with k = 4?
The densest-subgraph problem:
– generalization of max-clique
– (difficult) particular case of quadratic knapsack
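To make the problem statement concrete, here is a minimal brute-force sketch. The 8-vertex graph below is a hypothetical example (the graph drawn on the slide is not recoverable), and the function name `densest_k_subgraph` is chosen only for illustration. Exhaustive enumeration is usable only for tiny n, which is precisely why bounds are needed.

```python
from itertools import combinations

def densest_k_subgraph(n, edges, k):
    """Return (vertices, edge_count) of a densest k-subgraph by enumeration."""
    edge_set = {frozenset(e) for e in edges}
    best_subset, best_count = None, -1
    for subset in combinations(range(n), k):
        # Count the edges with both endpoints inside the candidate subset.
        count = sum(1 for pair in combinations(subset, 2)
                    if frozenset(pair) in edge_set)
        if count > best_count:
            best_subset, best_count = subset, count
    return best_subset, best_count

# Hypothetical graph: a 4-clique on {0, 1, 2, 3} followed by a path 3-4-5-6-7.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (5, 6), (6, 7)]
subset, count = densest_k_subgraph(8, edges, 4)
print(subset, count)  # (0, 1, 2, 3) 6
```

With k = 4 the clique is recovered, which illustrates the link with max-clique: a k-clique exists exactly when the optimal value is k(k−1)/2.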
Finding a densest k-subgraph?
Difficult problem (NP-hard and more, see e.g. Khot ’05)
Solving to optimality? Few methods:
– using linear programming (Erkut ’90)
– using reformulation techniques (Pisinger ’06)
– using quadratic programming (Billionnet-Elloumi-Plateau ’09)
They compute dense subgraphs for unrestricted graphs with n ≤ 100
Rem: SDP bounds are very tight, but expensive (Roupin ’04)
Question: can we use them to be competitive with the best methods?
Objective of our work:
1. to study new SDP bounds that trade tightness for cpu time, while keeping SDP-like quality
2. to use them within branch-and-bound for computing densest k-subgraphs
3. to compare with the best: Billionnet-Elloumi-Plateau ’09
Outline
1. Formulation and new semidefinite bounds
2. Relaxed resolution: comparison of the bounds
3. Exact resolution: branch-and-bound procedure
Formulation and new semidefinite bounds
Formulation of densest k-subgraph problem
Notation: G = (V, E) unweighted graph (|V| = n), W = (w_ij) adjacency matrix (divided by 2)
Initial modelling as {0, 1}-QP with constraints:
$$
\begin{cases}
\max\ \sum_{(ij)\in E} w_{ij}\, y_i y_j = y^\top W y \\
\sum_i y_i = k \\
y \in \{0,1\}^n
\end{cases}
$$
Enforcement of constraints: adding n product constraints ⇐⇒ adding $(\sum_{i=1}^n y_i - k)^2 = 0$
$$
\begin{cases}
\max\ y^\top W y \\
\sum_{i=1}^n y_i = k \\
\sum_{i=1}^n y_i y_j = k\, y_j, \quad j = 1, \dots, n \\
y \in \{0,1\}^n
\end{cases}
$$
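A small numerical check of this model may help. The sketch below assumes that "(/2)" means W is the 0/1 adjacency matrix divided by 2, so that the quadratic form counts each edge once; the graph and the helper names are hypothetical.

```python
import numpy as np

def adjacency_half(n, edges):
    """W = A / 2, where A is the symmetric 0/1 adjacency matrix (assumed reading)."""
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 0.5
    return W

def product_constraints_hold(y, k):
    """Check sum_i y_i * y_j == k * y_j for every j."""
    return bool(np.allclose(y.sum() * y, k * y))

n, k = 5, 3
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]  # hypothetical graph
W = adjacency_half(n, edges)

y_good = np.array([1.0, 1.0, 1.0, 0.0, 0.0])  # selects the triangle {0, 1, 2}
y_bad = np.array([1.0, 1.0, 0.0, 0.0, 0.0])   # selects only 2 vertices

print(y_good @ W @ y_good)                     # number of edges inside the selection
print(product_constraints_hold(y_good, k), (y_good.sum() - k) ** 2)
print(product_constraints_hold(y_bad, k), (y_bad.sum() - k) ** 2)
```

For a nonzero binary y, the n product constraints hold exactly when the squared constraint vanishes, which is the equivalence stated on the slide.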
Standard reformulation and lifting
Change of variables (and homogenization):
Equivalent formulation as a pure {−1, 1}-QP
$$
\begin{cases}
\max\ x^\top Q x \\
x^\top Q_j x = 4k - 2n, \quad j \in \{0, \dots, n\} \\
x \in \{-1, 1\}^{n+1}
\end{cases}
$$
SDP lifting (e.g. Lovász ’79, Goemans-Williamson ’95)
– $\langle X, Y\rangle = \mathrm{trace}(XY)$ makes $x^\top A x = \langle A, xx^\top\rangle$
– $X = xx^\top$ gives $x_i \in \{-1, 1\}$ as $X_{ii} = 1$
Equivalent formulation as linear SDP with rank-one constraint
$$
\begin{cases}
\max\ \langle Q, X\rangle \\
\langle Q_j, X\rangle = 4k - 2n, \quad j \in \{0, \dots, n\} \\
X_{ii} = 1, \quad i \in \{0, \dots, n\} \\
\operatorname{rank} X = 1, \quad X \succeq 0
\end{cases}
$$
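The slide does not spell out the change of variables or the matrices Q_j. The sketch below assumes the usual substitution x_i = 2 y_i − 1 with a homogenization variable x_0 = 1, and uses one possible matrix Q_0 (ones on the first row and column, zero elsewhere) that is consistent with the right-hand side 4k − 2n. Both are assumptions made for illustration, not the authors' stated definitions.

```python
import numpy as np

n, k = 6, 2
y = np.array([1, 0, 1, 0, 0, 0])         # a 0/1 point with sum(y) == k
x = np.concatenate(([1], 2 * y - 1))     # assumed change of variables, x_0 = 1
X = np.outer(x, x)                       # lifted matrix X = x x^T

# Assumed Q_0: encodes 2 * x_0 * sum_{i>=1} x_i as a quadratic form.
Q0 = np.zeros((n + 1, n + 1))
Q0[0, 1:] = 1.0
Q0[1:, 0] = 1.0

quad = x @ Q0 @ x                        # x^T Q_0 x
lifted = np.trace(Q0 @ X)                # <Q_0, X> = trace(Q_0 X)

print(quad, lifted, 4 * k - 2 * n)
print(np.diag(X), np.linalg.matrix_rank(X))
```

The quadratic form and its lifted version coincide, the diagonal of X is all ones, and X has rank one, which are the three ingredients of the rank-constrained SDP.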
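Idea of the “spherical constraint”
Key remark (Malick ’07): for all $X \succeq 0$ satisfying $X_{ii} = 1$, we have
$$
\|X\| \le n + 1, \qquad \|X\| = n + 1 \iff \operatorname{rank} X = 1 \quad \text{(“spherical constraint”)}
$$
[Figure: the set {X ⪰ 0, X_ii = 1} and the sphere of radius n + 1, with the matrices X of rank 1 on the sphere]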
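The remark can be checked numerically. The sketch below assumes that ‖·‖ is the Frobenius norm (the norm associated with the trace inner product); the reason is that ‖X‖² is the sum of squared eigenvalues, which is at most (trace X)² = (n + 1)² for X ⪰ 0, with equality only when a single eigenvalue is nonzero. The random matrices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 6  # m = n + 1, the size of the lifted matrix

# X = V V^T with unit-norm rows is positive semidefinite with unit diagonal.
V = rng.standard_normal((m, 3))
V /= np.linalg.norm(V, axis=1, keepdims=True)
X_full = V @ V.T

# A rank-one feasible matrix built from a {-1, 1} vector.
x = rng.choice([-1.0, 1.0], size=m)
X_rank1 = np.outer(x, x)

# np.linalg.norm of a 2-D array defaults to the Frobenius norm.
print(np.linalg.norm(X_full), np.linalg.norm(X_rank1), m)
```

The rank-three matrix lies strictly inside the ball of radius n + 1, while the rank-one matrix sits on the sphere.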
New formulation with the spherical constraint
Replace $\operatorname{rank} X = 1$ by $\|X\|^2 = (n + 1)^2$
New formulation of the k-densest subgraph problem as a linear SDP with one (nonconvex) quadratic constraint
$$
\begin{cases}
\max\ \langle Q, X\rangle \\
\langle Q_j, X\rangle = 4k - 2n, \quad j \in \{0, \dots, n\} \\
\langle E_i, X\rangle = 1, \quad i \in \{0, \dots, n\} \\
X \succeq 0 \\
\|X\|^2 = (n + 1)^2
\end{cases}
$$
The difficulty is now concentrated in this spherical constraint...
– Drop it: you get the usual SDP relaxation
– We don’t want to do that: the SDP bound is tight, but expensive!
– Idea: keep the constraint and dualize it!
Getting bounds by duality
Dualize the spherical constraint with α ∈ R
$$
\theta(\alpha) :=
\begin{cases}
\max\ \langle Q, X\rangle - \alpha\,(\|X\|^2 - (n + 1)^2) \\
\langle Q_j, X\rangle = 4k - 2n, \quad j \in \{0, \dots, n\} \\
\langle E_i, X\rangle = 1, \quad i \in \{0, \dots, n\} \\
X \succeq 0
\end{cases}
$$
Weak duality: each θ(α) gives an upper bound
Comparison: θ(α) ≤ θ(β) when α ≤ β
No gap (!): in theory, bounds as tight as we want!? θ(α) → val(dense subgraph) when α → −∞
In practice: only θ(α) for α ≥ 0 are tractable...
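The first two properties follow directly from the spherical inequality. A short derivation, using only the definitions above and writing $\mathcal{F}$ (a symbol introduced here for convenience) for the set of matrices satisfying all constraints except the spherical one:

```latex
% Feasible set of the relaxation (everything except the spherical constraint)
\[
\mathcal{F} = \{\, X \succeq 0 \;:\; \langle Q_j, X\rangle = 4k - 2n,\ 
\langle E_i, X\rangle = 1,\ i, j \in \{0,\dots,n\} \,\}
\]
% On F the spherical inequality gives a nonnegative slack s(X)
\[
\theta(\alpha) = \max_{X \in \mathcal{F}} \ \langle Q, X\rangle + \alpha\, s(X),
\qquad s(X) := (n+1)^2 - \|X\|^2 \ \ge\ 0 .
\]
% Weak duality: a rank-one feasible X^* satisfies s(X^*) = 0
\[
\langle Q, X^\star\rangle = \langle Q, X^\star\rangle + \alpha\, s(X^\star)
\ \le\ \theta(\alpha) \qquad \text{for every } \alpha \in \mathbb{R}.
\]
% Monotonicity: for alpha <= beta and every X in F
\[
\langle Q, X\rangle + \alpha\, s(X) \ \le\ \langle Q, X\rangle + \beta\, s(X)
\quad\Longrightarrow\quad \theta(\alpha) \le \theta(\beta).
\]
```

For α ≥ 0 the objective is concave in X, so the maximization is a convex problem, which is consistent with the tractability remark.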
New family of SDP bounds
In fact, the useful new family of SDP bounds is θ(α) with α ≥ 0
Properties:
– θ(0) is the standard SDP bound, computed by any SDP solver (IP, SB, PenSDP, ...)
– θ(α) for α > 0 boils down to an SDP least-squares problem, computed by nonlinear optimization methods (Malick ’04)
Key practical observation: θ(α) is easier to compute than θ(0)!
But: θ(0) ≤ θ(α), and θ(α) gets harder to compute as α → 0! So what?
We need a numerical study of the tightness/cpu-cost trade-off
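The least-squares structure can be seen by completing the square. This is a sketch assuming the Frobenius norm and a symmetric Q, with $\mathcal{F}$ denoting (a notation introduced here) the set of $X \succeq 0$ satisfying the linear constraints $\langle Q_j, X\rangle = 4k - 2n$ and $\langle E_i, X\rangle = 1$:

```latex
% For alpha > 0, complete the square in X
\[
\langle Q, X\rangle - \alpha \|X\|^2
= -\alpha \,\Bigl\| X - \tfrac{1}{2\alpha} Q \Bigr\|^2 + \frac{\|Q\|^2}{4\alpha}
\]
% Hence theta(alpha) is, up to constants, a projection (least-squares) problem over F
\[
\theta(\alpha) = \alpha (n+1)^2 + \frac{\|Q\|^2}{4\alpha}
- \alpha \, \min_{X \in \mathcal{F}} \Bigl\| X - \tfrac{1}{2\alpha} Q \Bigr\|^2
\]
```

As α → 0 the target Q/(2α) moves far away from the feasible set, which gives one intuition for why the problem becomes harder to solve for small α.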
Relaxed resolution: comparison of the bounds
Technical point 1: how to choose α?
Our strategy = having SDP-like bounds
We fix α = 10^{-4}:
– θ(10^{-4}) ≈ θ(0)
– cpu time is reasonable
Example on an instance (with n = 300, d = 25%, k = 75)
Observe when α → 0: convergence to θ(0) + cpu increase
[Figure: θ(α) and CPU time (s) versus the value of α, for α from 0 to 0.01, with the SDP bound shown for reference]
Numerical comparison of standard and new SDP bounds
We compare: θ(α = 10^{-4}) vs. the standard SDP bound θ(0)
Solvers:
– our home-made solver for θ(α) (Malick-Roupin ’10)
– SB, bundle method for λmax, for θ(0) (Helmberg-Rendl ’00)
– CSDP, interior-point method, for θ(0) (Borchers ’99)
Test problems: graphs of Billionnet-Elloumi-Plateau ’09
Collection with 5 instances of graphs for each parameter setting:
– number of vertices n ∈ {80, 100, 300}
– density d ∈ {25%, 50%, 75%}
– size of the subgraph k ∈ {n/4, n/2, 3n/4}
For a given n: 45 instances (5 for each parameter setting)
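The count of 45 instances per size is simply 3 densities × 3 subgraph sizes × 5 instances. A minimal sketch of the parameter grid (the tuple layout and the name `settings_for` are illustrative, not taken from the benchmark files):

```python
from itertools import product

densities = [0.25, 0.50, 0.75]
k_fractions = [(1, 4), (1, 2), (3, 4)]   # k = n/4, n/2, 3n/4
instances_per_setting = 5

def settings_for(n):
    """List (n, density, k, instance_index) for every test instance of size n."""
    return [(n, d, n * num // den, idx)
            for d, (num, den) in product(densities, k_fractions)
            for idx in range(instances_per_setting)]

for n in (80, 100, 300):
    print(n, len(settings_for(n)))
```

The setting (n = 300, d = 25%, k = 75) used in the α-sensitivity example appears in this grid as `(300, 0.25, 75, idx)`.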
Comparison of the solvers (1)
a result = the mean over 45 instances of same size n

        θ(α), α = 10^{-4}     θ(0) by SB             θ(0) by CSDP
  n     time (s)   gap (%)    time (s)   θ(α) (s)    time (s)   θ(α) (s)
  80    0.15       0.07       1.09       0.76        0.34       0.28
  100   0.19       0.08       2.9        1.63        0.58       0.47
  300   2.81       0.05       39.64      25.15       10.12      8.05

Our solver to compute θ(α) is
– quick, especially for large problems
– tight: mean gap with standard SDP ≤ 1%
– reliable: running times are almost constant for a given size
Ex: for n = 100, mean standard deviation of cpu times: σ_θ(α) = 0.02, σ_SB = 2.32, σ_CSDP = 0.11
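To read the "quick, especially for large problems" claim off the numbers, one can compute the ratio of the mean times reported above. The figures below are copied from the "time" columns of the table; the dictionary layout and the function name are illustrative.

```python
# Mean cpu times in seconds, copied from the "time" columns of the table.
times = {
    80:  {"theta_alpha": 0.15, "SB": 1.09,  "CSDP": 0.34},
    100: {"theta_alpha": 0.19, "SB": 2.90,  "CSDP": 0.58},
    300: {"theta_alpha": 2.81, "SB": 39.64, "CSDP": 10.12},
}

def speedups(table):
    """Ratio (time of a theta(0) solver) / (time of the theta(alpha) solver)."""
    return {n: {s: row[s] / row["theta_alpha"] for s in ("SB", "CSDP")}
            for n, row in table.items()}

for n, ratio in speedups(times).items():
    print(f"n = {n}: SB / theta(alpha) = {ratio['SB']:.1f}, "
          f"CSDP / theta(alpha) = {ratio['CSDP']:.1f}")
```

These are ratios of mean times only; they say nothing about the spread, which is what the standard deviations quoted above address.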
Comparison of the solvers (2)
Example: for a graph with n = 80, d = 50%, k = 40
[Figure: bound versus CPU time (s) for the three solvers SDLS, SB and CSDP]
Technical point 2: back to formulations
Performance of solvers depends on the formulations
2 equivalent {0, 1}-QP for densest subgraph problem...
$$
\begin{cases}
\max\ y^\top W y \\
e^\top y = k \\
(\sum_i y_i - k)^2 = 0 \\
y \in \{0,1\}^n
\end{cases}
\qquad\qquad
\begin{cases}
\max\ y^\top W y \\
e^\top y = k \\
y^\top C_j y = k\, y_j, \quad j = 1 : n \\
y \in \{0,1\}^n
\end{cases}
$$
...lead to 2 equivalent SDP relaxations...
...for which the solvers behave differently!
Choose the best for each solver: CSDP vs θ, SB
[Figure: CPU time for graph size 100, for SDLS θs(10^{-4}), SDLS θp(10^{-4}), SB θs(0), SB θp(0), CSDP θs(0), CSDP θp(0)]
Exact resolution: branch-and-bound procedure
Simple branch-and-bound
Characteristics of our branch-and-bound algorithm:
Initialization: greedy algorithm gives
– (good) feasible point
– lower bound on the optimal value
Extremely simple branching strategy:
– fixed order of separation
– depth-first
Bounding strategy: new SDP bound θ(α) with α = 10^{-4}
– Our solver admits early stops
– Warm-restart for sub-problems
Except for the bounding, the rest is rather standard
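The overall structure can be sketched as follows. This is not the authors' implementation: the upper bound here is a crude combinatorial stand-in for θ(α), the greedy rule (peeling minimum-degree vertices) is an assumption since the slide does not say which greedy algorithm is used, and all function names are hypothetical. Early stops and warm restarts are not modelled.

```python
from itertools import combinations

def make_adj(n, edges):
    """Adjacency sets for an undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def count_edges(adj, subset):
    return sum(1 for u, v in combinations(subset, 2) if v in adj[u])

def greedy_start(adj, k):
    """Assumed greedy rule: drop a minimum-degree vertex until k remain."""
    alive = set(adj)
    while len(alive) > k:
        alive.remove(min(alive, key=lambda u: (len(adj[u] & alive), u)))
    return sorted(alive)

def upper_bound(adj, chosen, candidates, k):
    """Stand-in bound (NOT theta(alpha)): edges already chosen, plus the best
    r links to the chosen set, plus a complete graph on r new vertices."""
    r = k - len(chosen)
    chosen_set = set(chosen)
    links = sorted((len(adj[v] & chosen_set) for v in candidates), reverse=True)
    return count_edges(adj, chosen) + sum(links[:r]) + r * (r - 1) // 2

def branch_and_bound(adj, k):
    order = sorted(adj)                       # fixed order of separation
    best = greedy_start(adj, k)               # feasible point from the greedy step
    best_val = count_edges(adj, best)         # lower bound on the optimal value
    nodes = 0

    def dfs(pos, chosen):                     # depth-first exploration
        nonlocal best, best_val, nodes
        nodes += 1
        if len(chosen) == k:
            val = count_edges(adj, chosen)
            if val > best_val:
                best, best_val = list(chosen), val
            return
        candidates = order[pos:]
        if len(chosen) + len(candidates) < k:
            return                            # not enough vertices left
        if upper_bound(adj, chosen, candidates, k) <= best_val:
            return                            # prune: cannot beat the incumbent
        dfs(pos + 1, chosen + [order[pos]])   # branch: vertex in the subgraph
        dfs(pos + 1, chosen)                  # branch: vertex out

    dfs(0, [])
    return best, best_val, nodes

# Hypothetical graph: a 4-clique on {0, 1, 2, 3} followed by a path 3-4-5-6-7.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (5, 6), (6, 7)]
print(branch_and_bound(make_adj(8, edges), 4))
```

In this sketch only `upper_bound` would change to plug in a different bound; a tighter bound prunes more nodes at a higher cost per node, which is the trade-off examined in the numerical comparison.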
Comparison with the best to get densest subgraphs
Comparison with QCR of Billionnet-Elloumi-Plateau ’09
Quadratic programming approach that mixes nicely
– SDP for computing parameters of convex relaxation
– CPLEX for branch-and-bound (MIQP to be solved)
Numerical comparison: same instances - same machine
Aggregated results
[Figure: average CPU time of SDLS and QCR versus graph size (40, 80, 100)]
Numerical comparison: some more details
[Figure: CPU time of SDLS and QCR for each class of instances, with n ∈ {80, 100}, k ∈ {n/4, n/2, 3n/4} and d ∈ {25%, 50%, 75%}]
Comparable results, with different strategies!
Our bound is more expensive, but we prune very well
Ex: number of nodes in the tree for n = 100: 48,675 (SDLS) vs 950,041 (QCR)
Conclusion on numerical experiments
The bounds θ(α) are interesting
– they provide SDP-quality bounds
– they are cheaper to get: good ratio tightness/computing-time
The solver for θ(α) combines advantages of SB and CSDP
– gives guaranteed upper bounds (like SB)
– has a sharp initial decrease (like SB)
– is reliable (like CSDP)
– can be interrupted (like SB)
The branch-and-bound using θ(α)
– uses SDP-like bounds throughout the tree, so it prunes very well
– has performance comparable with the best (which uses CPLEX)
J. Malick and F. Roupin. Numerical study of SDP bounds for the k-cluster problem. Electronic Notes in Discrete Mathematics: Proceedings of ISCO, 2010.
J. Malick and F. Roupin. Solving k-cluster problems to optimality with semidefinite programming. To appear in Mathematical Programming, 2011.
Conclusion... and next talk!
Essential points of this work
– new formulation of rank-one constraint
– new SDP-like bounds for the densest k-subgraph
– competitive branch-and-bound to compute dense subgraphs
On-going research on these new bounds
– generalization 1: universality using α (Malick-Roupin ’11)
– generalization 2: to other combinatorial problems
  easy in theory... but still requires work in practice...
– densest k-subgraph: simple formulation but challenging for
  pure-SDP approach (no SDP approach)
Advertisement: next talk, by Nathan Krislock, on max-cut
– different presentation of the family + details on computation
– max-cut admits SDP-based B&B (Wiegele et al. ’09)
– management of inequalities + control on α...
Thanks!