Convex optimization: theory and practice

Constantinos Skarakis

BT Group plc
Mobility Research Centre, Complexity Research
Adastral Park
Martlesham Heath
Suffolk IP5 3RE
United Kingdom

University of York
Department of Mathematics
Heslington
York, YO10 5DD
United Kingdom

Supervisors:
Dr. K. M. Briggs
Dr. J. Levesley

Dissertation submitted for the MSc in Mathematics with Modern Applications,
Department of Mathematics, University of York, UK on August 22, 2008
Contents

1 Introduction
2 Theoretical Background
   2.1 Convex sets
   2.2 Definiteness
   2.3 Cones
   2.4 Generalised inequalities
   2.5 Convex functions
3 Convex optimization
   3.1 Quasiconvex optimization
      3.1.1 Golden section search
   3.2 Duality
   3.3 Linear programming
      3.3.1 Linear-fractional programming
   3.4 Quadratic programming
      3.4.1 Second-order cone programming
   3.5 Semidefinite programming
4 Applications
   4.1 The Virtual Data Center problem
      4.1.1 Scalability
   4.2 The Lovász ϑ function
      4.2.1 Graph theory basics
      4.2.2 Two NP-complete problems
      4.2.3 The sandwich theorem
      4.2.4 Wireless Networks
      4.2.5 Statistical results
      4.2.6 Further study
   4.3 Optimal truss design
      4.3.1 Physical laws of the system
      4.3.2 The truss design optimization problem
      4.3.3 Results
   4.4 Proposed research
      4.4.1 Power allocation in wireless networks
Chapter 1
Introduction
The purpose of this paper is to bring mathematics and computer science closer together, in the field where it makes the most sense to do so: mathematical optimization, and more specifically convex optimization. The reason is simple: most real problems in engineering, information technology and many other industrial sectors involve optimizing some quantity, whether that is minimizing cost, emissions or energy usage, or maximizing profit, radio coverage or productivity. And most of the time, if not always, these problems are difficult or very inefficient to solve by hand, so the use of technology is required.
Convex optimization has several advantages. First of all, it includes a large number of problem classes; in other words, many of the most commonly addressed optimization problems are convex. Even when they are not convex, it is sometimes possible [sections 3.1, 3.3.1] to reformulate them into convex form. More important still, due to the convex nature of the feasible set of the problem, any local optimum is also the global optimum. Algorithms written to solve convex optimization problems take advantage of such particularities, and are faster, more efficient and very reliable. For specific classes of convex programming, such as linear or geometric programming, algorithms have been written to deal with those cases even more efficiently. As a result, very large problems, with thousands of variables and constraints, are solvable in reasonably little time.
The event that inspired this project was the relatively recent release of CVXMOD, a programming tool for solving convex optimization problems, written in the programming language Python. The release of optimization software is not in itself new: similar tasks can be performed using Matlab solvers such as cvx, but that normally requires an above-average level of expertise and many lines of code, and may also involve significant cost from the purchase of the program.

Figure 1.1: CVXMOD, CVXOPT and their dependencies. (CVXMOD sits on top of CVXOPT and Python; CVXOPT's solvers in turn rely on libraries written in the programming language C: DSDP, GLPK, Lapack & BLAS, Atlas and GSL.)

What is really exciting about this new piece of software is that it is free and open-source, and that it makes it possible to solve a large, complex convex optimization problem using just 15 or 20 lines of code. It works as a modeling layer for CVXOPT, a Python module built on the same principles as cvx.
CVXOPT comes with its own solvers, written completely in Python. However, in some cases the user has the option to use C solvers made available from Python. These are the same solvers used by cvx in Matlab, and they rely on the same C libraries [Frigo and Johnson, 2003, Makhorin, 2003, Benson and Ye, 2008, Galassi et al., 2003, Whaley and Petitet, 2005]. These libraries have been used for many years; as a result they are well tested and constitute a very reliable set of optimization tools. Among other things, we hope to test the efficiency of CVXMOD and CVXOPT on our examples. If the results are satisfactory, it will mean that we have in our hands tools that are both very high-level, like the programming language they are written in, and mathematically capable. The only remaining difficulty will then lie in formulating the problem mathematically. Here is a list of applications we present as example problems that can be efficiently solved using CVXOPT [Boyd and Vandenberghe, 2004]:
· Optimal trade-off curve for a regularized least-squares problem.
· Optimal risk-return trade-off for portfolio optimization problems.
· Robust regression.
· Stochastic and worst-case robust approximation.
from cvxmod import *
c = matrix((-1, 1), (2, 1))
A = matrix((-1.5, 0.25, 3.0, 0.0, -0.5, 1.0, 1.0, -1.0, -1.0, -1.0), (5, 2))
b = matrix((2.0, 9.0, 17.0, -4.0, -6.0), (5, 1))
x = optvar('x', 2)
p = problem(minimize(tp(c)*x), [A*x <= b])
p.solve()

Program 1.1: lpsimple.py
· Polynomial and spline fitting.
· Least-squares fit of a convex function.
· Logistic regression.
· Maximum entropy distribution with constraints.
· Chebyshev and Chernoff bounds on probability of correct detection.
· Ellipsoidal approximations of a polyhedron.
· Centers of a polyhedron.
· Linear discrimination.
· Linear, quadratic and fourth-order placement.
· Floor planning.
· Optimal hybrid vehicle operation.
· Truss design [Freund, 2004].
At this point, we present a very simple example. We will use CVXMOD to solve the following linear program:

minimize y − x,
subject to y ≤ 1.5x + 2,
           y ≤ −0.25x + 9,
           y ≥ 3.0x − 17,
           y ≥ 4,
           y ≥ −0.5x + 6.
We arrange the coefficients of the objective and constraints appropriately,

    c = (−1, 1)T,        b = (2, 9, 17, −4, −6)T,

          ⎛ −1.5    1 ⎞
          ⎜  0.25   1 ⎟
    A  =  ⎜  3     −1 ⎟
          ⎜  0     −1 ⎟
          ⎝ −0.5   −1 ⎠ ,
Figure 1.2: (ε)0 : y = 3x/2 + 2, (ε)1 : y = −x/4 + 9, (ε)2 : y = 3x − 17, (ε)3 : y = 4, (ε)4 : y = −x/2 + 6.
so we can rewrite the problem using a more efficient description: now the objective can be written as cT x and all five constraints are included in Ax ⪯ b.

Program 1.1 will provide the optimal solution, which is −3. Figure 1.2 gives a better visualization of the problem. The shaded area is the feasible region of the problem, determined by the constraints. We seek the line y = x + α that intersects the feasible set for the minimum value of α. The dashed line is the optimal solution we are looking for. Since the program gave the output −3, the dashed line is

y = x − 3.
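To double-check the answer without CVXMOD, the same linear program can be solved by brute-force vertex enumeration, since the optimum of a bounded LP is attained at a vertex of the feasible polyhedron. The following sketch is our own illustration (plain Python, no dependencies); the constraint data match Program 1.1, and it recovers the optimal value −3 at the vertex (7, 4).

```python
from itertools import combinations

# Constraint data of Program 1.1, read as a_i . (x, y) <= b_i.
A = [(-1.5, 1.0), (0.25, 1.0), (3.0, -1.0), (0.0, -1.0), (-0.5, -1.0)]
b = [2.0, 9.0, 17.0, -4.0, -6.0]

def feasible(p, tol=1e-9):
    x, y = p
    return all(a[0]*x + a[1]*y <= bi + tol for a, bi in zip(A, b))

# The optimum of a bounded LP sits at a vertex: the intersection of two
# constraint boundaries.  Enumerate all pairs and keep the feasible points.
vertices = []
for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
    det = a1[0]*a2[1] - a1[1]*a2[0]
    if abs(det) < 1e-12:               # parallel boundary lines: no vertex
        continue
    vx = (b1*a2[1] - a1[1]*b2) / det   # Cramer's rule
    vy = (a1[0]*b2 - b1*a2[0]) / det
    if feasible((vx, vy)):
        vertices.append((vx, vy))

best = min(vertices, key=lambda p: p[1] - p[0])   # objective y - x
print(best, best[1] - best[0])                    # (7.0, 4.0) -3.0
```

This is of course only practical in tiny dimensions; the point is to make the geometry of Figure 1.2 concrete.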
Chapter 2
Theoretical Background
The definitions and results given below are mostly based on, and in accordance with, Boyd and Vandenberghe [2004].

For the sake of a better visual result, and also to maintain consistency between the mathematical formulation and the code implementation of the applications, we will use notation somewhat outside pure mathematical formalism. First of all, we will enumerate starting from zero, which means that all vector and matrix indices will start from zero, and so will summations and products. Also, we will use the hat symbol on quantities in [0, 1] to denote the complement with respect to 1. In other words, if λ ∈ [0, 1] then λ̂ = 1 − λ.
2.1 Convex sets
A (geometrical) set C is convex when, for every two points that lie in C, every point on the line segment that connects them also lies in C.

Definition 1. Let C ⊆ Rn. Then C is convex if and only if for any x0, x1 ∈ C and any λ ∈ [0, 1],

λ̂x0 + λx1 ∈ C.    (2.1)
The above definition can be generalised inductively to more than two points of the set. So we can say that a set is convex if and only if it contains all possible weighted averages of its points, that is, all linear combinations of its points with nonnegative coefficients that sum to 1. Such combinations are also called convex combinations. The set

{λ0 x0 + · · · + λk−1 xk−1 | xi ∈ S, λi ≥ 0, i = 0, . . . , k − 1, Σ_{0 ≤ i < k} λi = 1},    (2.2)
is the set of all possible convex combinations of points of a set S and is
called the convex hull of S. The convex hull is the smallest convex set that
includes S. It could also be described as the intersection of all convex
sets that include S. Convex sets are very common in geometry and very
important in numerous mathematical disciplines.
Affine sets
These are the most immediate, even trivial, examples of convex sets.

Definition 2. A set A ⊆ Rn is affine if for any x0, x1 ∈ A and any λ ∈ R,

λ̂x0 + λx1 ∈ A.    (2.3)
What the definition formally states is that a set A is affine if, for any two points that lie in A, every point on the line through these two points also lies in A. The simplest example of an affine set is of course the straight line. Affine sets are convex; however, the converse is not true: convex sets are not, in general, affine.
Hyperplanes
Definition 3. A hyperplane is a set of the form
{x | aT x = b},
(2.4)
where x, a ∈ Rn, b ∈ R and a ≠ 0.
Geometrically speaking, a hyperplane is a translation of a subspace of Rn of dimension n − 1. Most importantly, its elements, as elements of Rn, are linearly dependent. For example, the hyperplanes of R2 are the straight lines, and in R4 the hyperplanes are 3-dimensional spaces. In R3, hyperplanes are nothing more than ordinary planes. For that reason, from now on we will use the term plane instead of hyperplane.
Definition 4. A (closed) halfspace is a set of the form

{x | aT x ≤ b},    (2.5)

where x, a ∈ Rn, b ∈ R and a ≠ 0. If the inequality is strict, then we have an open halfspace.
Figure 2.1: Left: A polyhedron P in two dimensions, represented as the intersection of 5 halfspaces. Right: A three-dimensional ellipsoid E.
A plane divides the space into two halfspaces. For example, the plane {x | aT x = b} divides Rn into the halfspaces {x | aT x < b} and {x | −aT x < −b}. Since planes are collections of linearly dependent points, they are affine and consequently convex. Halfspaces, on the other hand, are convex sets but not affine, because they only include lines that are parallel to the boundary plane.
Polyhedra
An intersection of halfspaces and hyperplanes is called a polyhedron. If a
polyhedron is bounded, it can be called a polytope.
Definition 5. A polyhedron is a set of the form

P = {x | aiT x ≤ bi, i = 0, . . . , m − 1, cjT x = dj, j = 0, . . . , p − 1}.    (2.6)
The most immediate nontrivial example of a polyhedron in two dimensions would be a polygon, or just the interior of the polygon if we demand strict inequalities in the definition. Needless to say, if the system of equalities and inequalities in (2.6) has no solution, then P will be the empty set.
Polyhedra are convex sets. A proof of this statement is a special case of the proof that intersections of convex sets are convex:
Lemma 1. Any countable intersection of convex sets is a convex set.

Proof. We will prove the statement for the intersection of two sets; the rest of the proof is simple induction. Let C, D be convex subsets of Rn. Let x0, x1 ∈ C ∩ D and let λ ∈ [0, 1]. Then

C ∩ D ⊆ C ⇒ x0, x1 ∈ C ⇒ λ̂x0 + λx1 ∈ C,    (2.7)
C ∩ D ⊆ D ⇒ x0, x1 ∈ D ⇒ λ̂x0 + λx1 ∈ D.    (2.8)

(2.7) and (2.8) ⇒ λ̂x0 + λx1 ∈ C ∩ D.
A very important polyhedron in the field of mathematical optimization is the simplex.

Definition 6. Let v0, v1, . . . , vk ∈ Rn, k ≤ n, be such that v1 − v0, v2 − v0, . . . , vk − v0 are linearly independent. A simplex is a set of the form

{θ0 v0 + · · · + θk vk | θj ≥ 0, j = 0, . . . , k, Σ_{0 ≤ j ≤ k} θj = 1}.    (2.9)
In two dimensions, an example of a simplex is any triangle, and in three dimensions any tetrahedron.
Norm Balls

A norm ball (or just ball, if the norm we are referring to is the Euclidean one) is well known to be a set of the form

{x ∈ Rn | ‖x − xc‖ ≤ r}.    (2.10)

Point xc is the center of the norm ball, r ∈ R is its radius and ‖ · ‖ is any norm on Rn.
It is easy to prove convexity of norm balls.

Proof. Let B be a norm ball, in other words a set of the form (2.10), and let λ ∈ [0, 1]. Then

x0, x1 ∈ B ⇒ ‖x0 − xc‖ ≤ r, ‖x1 − xc‖ ≤ r,

and

‖λ̂x0 + λx1 − xc‖ = ‖λ̂x0 + λx1 − λ̂xc − λxc‖ ≤ λ̂‖x0 − xc‖ + λ‖x1 − xc‖ ≤ λ̂r + λr = r.
Ellipsoids

A generalization of the 2-dimensional conic section, the ellipse, is the ellipsoid. In 3 dimensions, it is what one might describe as a melon-shaped set. A rigorous definition would be:

Definition 7. An ellipsoid is a set of the form

{xc + Au | ‖u‖2 ≤ 1},    (2.11)

where xc ∈ Rn is the center of the ellipsoid, A ∈ Rn×n is a square matrix and ‖ · ‖2 is the Euclidean norm. If A is singular, the set is called a degenerate ellipsoid.
A more practical definition of an ellipsoid will be given after definiteness has been defined. If A were replaced by a real number r, then (2.11) would be an alternative representation of the Euclidean ball. Ellipsoids are also convex:
Proof. Let E be a set of the form (2.11). Let λ ∈ [0, 1] and u0, u1 ∈ Rn with ‖u0‖2, ‖u1‖2 ≤ 1. Then for xc + Au0, xc + Au1 ∈ E,

λ̂(xc + Au0) + λ(xc + Au1) = xc + A(λ̂u0 + λu1) ∈ E,

because

‖λ̂u0 + λu1‖2 ≤ λ̂‖u0‖2 + λ‖u1‖2 ≤ λ̂ + λ = 1.
Set convexity is preserved under intersection. It is also preserved under affine transformation, which is application of a function of the form
f : Rn → Rm : f (x) = Ax + b.
In other words, if C is a convex subset of Rn , then the set
f (C) := {f (x) | x ∈ C},
is a convex subset of Rm . What’s more, the inverse image of C under f :
f −1 (C) := {x | f (x) ∈ C},
is also a convex subset of Rn . Convexity is not however preserved under
union, or any other usual operation between sets.
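As a quick numerical illustration of these closure properties (our own toy example, not from the text), consider one-dimensional intervals: the intersection of two intervals is again an interval, hence convex, while the union of two disjoint intervals fails even midpoint convexity.

```python
def in_interval(x, lo, hi):
    return lo <= x <= hi

# Intersection stays convex: [0, 2] and [1, 3] intersect in [1, 2],
# and the midpoint of any two members of [1, 2] is again in [1, 2].
assert in_interval((1.0 + 2.0) / 2, 1, 2)

# Union does not: 1 lies in [0, 1] and 2 lies in [2, 3], but their
# midpoint 1.5 lies in neither, so [0, 1] ∪ [2, 3] is not convex.
x0, x1 = 1.0, 2.0
mid = (x0 + x1) / 2
in_union = in_interval(mid, 0, 1) or in_interval(mid, 2, 3)
print(in_union)  # False
```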
2.2 Definiteness
Let Sn be the set of square n × n symmetric matrices:
A ∈ Sn ⇔ AT = A.
Definition 8. A symmetric matrix A ∈ Sn is called positive semidefinite, or nonnegative definite, if for all its eigenvalues λi,

λi ≥ 0,    i = 0, . . . , n − 1.

If the above inequality is strict, then A is called positive definite. If −A is positive definite, A is called negative definite. If −A is positive semidefinite, then A is called negative semidefinite, or nonpositive definite.
We will denote the set of symmetric positive semidefinite matrices by Sn+, and the set of symmetric positive definite matrices by Sn++. (Similarly, we will write R+ for the nonnegative real numbers and R++ for the positive real numbers.)
Lemma 2.

A ∈ Sn+ ⇔ ∀x ∈ Rn, xT Ax ≥ 0,    (2.12)
A ∈ Sn++ ⇔ ∀x ∈ Rn, x ≠ 0, xT Ax > 0.    (2.13)

Proof. Let λmin be the minimum eigenvalue of A. We use the property [Boyd and Vandenberghe, 2004, page 647] that for every x ∈ Rn,

λmin xT x ≤ xT Ax.

Then if λmin is nonnegative, so is xT Ax for all x ∈ Rn (and if λmin is positive, so is xT Ax for all nonzero x).
Definition 9. The Gram matrix associated with v0, . . . , vm−1, vi ∈ Rn, i = 0, . . . , m − 1, is the matrix

G = V T V,    V = [v0 · · · vm−1],

so that Gij = viT vj.
Theorem 1. Every Gram matrix G is positive semidefinite and for every positive
semidefinite matrix A, there is a set of vectors v0 , . . . , vn−1 such that A is the
Gram matrix associated with those vectors.
Proof. (⇒) Let x ∈ Rn. Then

xT Gx = xT (V T V )x = (xT V T )(V x) = (V x)T (V x) ≥ 0.

(⇐) For this direction we use the fact that every positive semidefinite matrix A has a positive semidefinite "square root" matrix B such that A = B2 [Horn and Johnson, 1985, pages 405, 408]. Then A is the Gram matrix of the columns of B.
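Theorem 1 and Lemma 2 are easy to probe numerically. The sketch below is our own (pure Python; the three vectors are arbitrary example data): it builds the Gram matrix of the vectors and checks that xT Gx ≥ 0 holds for randomly sampled x.

```python
import random

def gram(vectors):
    # G[i][j] = v_i^T v_j
    dot = lambda u, v: sum(a*b for a, b in zip(u, v))
    return [[dot(vi, vj) for vj in vectors] for vi in vectors]

def quad_form(G, x):
    # x^T G x
    n = len(x)
    return sum(x[i]*G[i][j]*x[j] for i in range(n) for j in range(n))

V = [(1.0, 2.0, 0.0), (0.0, 1.0, -1.0), (3.0, 0.0, 1.0)]   # arbitrary vectors
G = gram(V)

# By Theorem 1 the Gram matrix is positive semidefinite, so by Lemma 2
# x^T G x >= 0 must hold for every x; sample this at random.
random.seed(0)
ok = all(quad_form(G, [random.uniform(-1, 1) for _ in range(3)]) >= -1e-12
         for _ in range(1000))
print(ok)  # True
```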
Solution set of a quadratic inequality
This example gives us good reason to present the following:
Theorem 2. A closed set C is convex if and only if it is midpoint convex, which
means that it satisfies (2.1) for λ = 1/2.
Proof. One direction is obvious: if C is convex, then it is midpoint convex. For the other direction, suppose C is closed and midpoint convex, so for every two points of C, their midpoint also belongs to C. Take two points a, b ∈ C and let [a, b] be the closed segment from a to b, taken on the line that connects the two points, with positive direction from a to b. We want to prove that x ∈ C for all x ∈ [a, b]. Take the following sequence of partitions of [a, b]:

Pn = ∪_{0 ≤ k < 2^n} [ (1 − k/2^n) a + (k/2^n) b , (1 − (k+1)/2^n) a + ((k+1)/2^n) b ].

By repeated midpoint convexity, all lower and upper endpoints of the closed intervals in Pn belong to C. Take any point x ∈ [a, b], so x = λ̂a + λb for some λ ∈ [0, 1]. Then there is a closed interval in Pn, of parameter length 1/2^n, that contains x. Now take the sequence of points of C

xn = (1 − kn/2^n) a + (kn/2^n) b,    where x ∈ [ (1 − kn/2^n) a + (kn/2^n) b , (1 − (kn+1)/2^n) a + ((kn+1)/2^n) b ].

If we let n → ∞, then xn → x. Since C is closed, it must contain its limit points. Thus x ∈ C.
We shall use this to prove that the set

C = {x ∈ Rn | xT Ax + bT x + c ≤ 0},

where b ∈ Rn, c ∈ R and A ∈ Sn, is convex for positive semidefinite A.

Proof. Since C is closed, we will prove that it is midpoint convex and then use Theorem 2 to conclude that it is convex. Let f(x) = xT Ax + bT x + c and let x0, x1 ∈ C, so that f(x0), f(x1) ≤ 0. All we need to show is that the midpoint (x0 + x1)/2 ∈ C, or f((x0 + x1)/2) ≤ 0. Indeed,

f((x0 + x1)/2) = (1/4)(x0 + x1)T A(x0 + x1) + (1/2)bT (x0 + x1) + c
              = (1/4)x0T Ax0 + (1/4)x1T Ax1 + (1/2)x0T Ax1 + (1/2)bT (x0 + x1) + c
              = (1/2)(f(x0) + f(x1)) + (1/2)x0T Ax1 − (1/4)x0T Ax0 − (1/4)x1T Ax1
              = (1/2)(f(x0) + f(x1)) − (1/4)(x0 − x1)T A(x0 − x1)
              ≤ 0,    if A ∈ Sn+.
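The key identity in this proof, f((x0 + x1)/2) = (f(x0) + f(x1))/2 − (1/4)(x0 − x1)T A(x0 − x1), can be verified numerically. A sketch with an arbitrary 2 × 2 positive definite A (our own example values, not from the text):

```python
def f(x, A, b, c):
    # f(x) = x^T A x + b^T x + c for 2-vectors
    q = sum(x[i]*A[i][j]*x[j] for i in range(2) for j in range(2))
    return q + b[0]*x[0] + b[1]*x[1] + c

A = [[2.0, 1.0], [1.0, 2.0]]      # symmetric, eigenvalues 1 and 3: A in S2++
b, c = [1.0, -1.0], 0.5
x0, x1 = [0.3, -2.0], [1.7, 4.0]

mid = [(x0[0] + x1[0])/2, (x0[1] + x1[1])/2]
d = [x0[0] - x1[0], x0[1] - x1[1]]
quad_d = sum(d[i]*A[i][j]*d[j] for i in range(2) for j in range(2))

lhs = f(mid, A, b, c)                                 # f((x0 + x1)/2)
rhs = (f(x0, A, b, c) + f(x1, A, b, c))/2 - quad_d/4  # identity's right side
print(abs(lhs - rhs) < 1e-9)  # True
```

Since A is positive semidefinite, quad_d ≥ 0 and the midpoint value never exceeds the average of the endpoint values, which is exactly midpoint convexity of C.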
2.3 Cones
Definition 10. A set Ω is called a cone if

∀x ∈ Ω, λ ≥ 0:    λx ∈ Ω.    (2.14)

If in addition Ω is a convex set, then it is called a convex cone.
Lemma 3. A set Ω is a convex cone if and only if

∀x0, x1 ∈ Ω, λ0, λ1 ≥ 0:    λ0 x0 + λ1 x1 ∈ Ω.    (2.15)

Proof. (⇒) Define

y := (λ0/(λ0 + λ1)) x0 + (λ1/(λ0 + λ1)) x1.

Ω is convex, thus y ∈ Ω. Also, Ω is a cone, thus

(λ0 + λ1)y = λ0 x0 + λ1 x1 ∈ Ω.

(⇐) If (2.15) is true, then choosing λ1 = 0 shows that Ω satisfies the cone property (2.14), and choosing any λ0 ∈ [0, 1] with λ1 := λ̂0 shows that Ω also satisfies the convexity property (2.1).
Similarly, we can show that every combination of points of a cone with nonnegative coefficients also lies in the cone. Such combinations are called conic or nonnegative linear combinations. The set of all possible conic combinations of points in a set S is called the conic hull of S, and similarly to the convex hull of S, it is the smallest convex cone that contains S. For example, in R2, the conic hull of the circle with centre (1, 1) and radius 1 is the first quadrant.
A very important example of a convex cone is the quadratic or second-order cone

C = {(x, t) ∈ Rn+1 | ‖x‖2 ≤ t}.    (2.16)

The proof is a simple application of the triangle inequality. Another important example of a convex cone is Sn+, the set of symmetric positive semidefinite matrices.
Proof. Let λ0, λ1 ∈ R+ and A0, A1 ∈ Sn+. Then for every x ∈ Rn,

xT (λ0 A0 + λ1 A1)x = λ0 xT A0 x + λ1 xT A1 x ≥ 0 ⇒ λ0 A0 + λ1 A1 ∈ Sn+.
Definition 11. Let Ω be a convex cone that also satisfies the following
properties:
1. Ω is closed.
2. Ω is solid, in other words, it has non-empty interior.
3. If x ∈ Ω and −x ∈ Ω then x = 0. (Ω is pointed).
Then we say that Ω is a proper cone.
Proper cones are central to defining a partial ordering on Rn, or even on Sn; that is a nontrivial problem without a unique best solution. There is also another type of cone we will be interested in further on.
Definition 12. Let Ω be a cone. The set

Ω∗ = {y | xT y ≥ 0 for all x ∈ Ω}    (2.17)

is called the dual cone of Ω.
The dual cone is very useful, as it has many desirable properties [Boyd
and Vandenberghe, 2004]:
1. Ω∗ is closed and convex.
2. Ω0 ⊆ Ω1 ⇒ Ω∗0 ⊇ Ω∗1 .
3. If Ω has nonempty interior, then Ω∗ is pointed.
4. If the closure of Ω is pointed, then Ω∗ has nonempty interior.
5. Ω∗∗ is the closure of the convex hull of Ω.
2.4 Generalised inequalities
As already mentioned, the purpose of introducing the notion of cones was to be able to define a partial ordering for vectors and matrices. This way, we can give meaning to expressions like "matrix A is less than matrix B". The problem of defining a partial ordering on such spaces does not have a unique or even best solution. Here we give the partial ordering that serves our needs, that is, the needs of convex optimization.
Definition 13. Let Ω ⊆ Rn be a proper cone. Then the generalized inequality associated with Ω is defined as

x ⪯Ω y ⇔ y − x ∈ Ω.    (2.18)
We will manipulate the above ordering as we do the ordering on the real numbers. We will write:

• x ⪰Ω y ⇔ x − y ∈ Ω.
• x ≺Ω y ⇔ y − x ∈ Ω̊, where Ω̊ is the interior of the set Ω, in other words Ω without its boundary.

We shall call the last relation the strict generalized inequality.
If Ω := R+, then we recover the usual ordering of the real numbers. That means the generalized inequality we have defined is indeed a generalization of the usual inequality on the real numbers, which can be considered a special case of ⪯Ω. A very significant detail, however, is that the generalized ordering is a partial ordering, which means that not all elements of Rn are necessarily comparable. In order to make use of the ordering we have just defined, we have to choose Ω appropriately for each case. To compare vectors, let Ω := Rn+ (the nonnegative orthant). Then the generalized inequality is equivalent to the componentwise usual inequality of real numbers:

x ⪯Rn+ y ⇔ xi ≤ yi,    i = 0, . . . , n − 1.
For symmetric matrices, let Ω := Sn+, the positive semidefinite cone. In this case

A ⪯Sn+ B ⇔ B − A ∈ Sn+.

In these two cases we will drop the subscript and imply the appropriate proper cone. This has the immediate advantage that it equips us with a very compact notation for positive semidefinite matrices:

A ⪰ 0 ⇔ xT Ax ≥ 0 ∀x ∈ Rn.    (2.19)
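A small illustration of why ⪯ with Ω = Rn+ is only a partial order (the example vectors are ours, not from the text):

```python
def leq(x, y):
    # x ⪯ y with Ω = R^n_+, i.e. componentwise <=
    return all(a <= b for a, b in zip(x, y))

print(leq((1, 2), (3, 4)))   # True: (1, 2) ⪯ (3, 4)
x, y = (1, 5), (2, 3)
print(leq(x, y), leq(y, x))  # False False: x and y are incomparable
```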
2.5 Convex functions
Definition 14. Let D ⊆ Rn. A function f : D → R is convex if and only if D is convex and for all x0, x1 ∈ D and λ ∈ [0, 1],

f(λ̂x0 + λx1) ≤ λ̂f(x0) + λf(x1).    (2.20)

We say f is strictly convex if and only if the above inequality is strict for x0 ≠ x1 and λ ∈ (0, 1). If −f is (strictly) convex, then we say f is (strictly) concave. The above inequality is also called Jensen's inequality.
Visually, a function is convex if the chord between any two points of its graph lies completely above the graph. There are cases, of course, where the graph of the function is a line, or a plane, or a multidimensional analogue. In that case (2.20) holds trivially as an equation, and moreover it holds for −f as well. These functions are called affine functions; they have the form

f : Rn → Rm : f(x) = Ax + b,    (2.21)

where A ∈ Rm×n, x ∈ Rn and b ∈ Rm, and for the reason we just explained they are both convex and concave.
In order to give an immediate relation between convex sets and convex functions, we need the following:

Definition 15. The epigraph of a function f : Rn ⊇ D → R is the set

epi f = {(x, t) | x ∈ D, f(x) ≤ t} ⊆ Rn+1.

The set

hypo f = {(x, t) | x ∈ D, f(x) > t} ⊆ Rn+1

is called the hypograph of f.
Sometimes, nonstrict inequality is allowed in the definition of the hypograph of f. What is important is that the epigraph is the set of all points that lie "above" the graph of f, and the hypograph is the set of all points that lie "below" it. A function f is convex if and only if its epigraph is a convex set, and f is concave if and only if its hypograph is a convex set.
To determine convexity of functions, it is often more efficient to use calculus than the definition of convexity (under the hypothesis, of course, that the function is differentiable). We can use the simple observation that if a function is convex, then its graph always lies above the tangent line taken at any point of its domain. To put it rigorously:
Figure 2.2: Left: Epigraph (epi f) and hypograph (hypo f) of a function f(x). Right: The graphs of a convex function (h) and a concave function (g).
Theorem 3. A differentiable function f : Rn ⊇ D → R is convex if and only if D is a convex set and

f(y) ≥ f(x) + ∇f(x)T (y − x),    (2.22)

for all x, y ∈ D.

Inequality (2.22) is also called the first-order condition. In case f is also twice differentiable, the second-order condition can be used.

Theorem 4. Let H = ∇2f(x) be the Hessian of the twice differentiable function f : Rn ⊇ D → R. Then f is convex if and only if D is a convex set and

H ⪰ 0.    (2.23)

In one dimension, the first- and second-order conditions become

f(y) ≥ f(x) + f′(x)(y − x),    (2.24)
f″(x) ≥ 0.    (2.25)
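The one-dimensional conditions (2.24) and (2.25) are easy to check numerically. A sketch for the convex function f(x) = x^4 (our own example, not from the text):

```python
import random

f = lambda x: x**4
df = lambda x: 4*x**3        # f'
d2f = lambda x: 12*x**2      # f''

random.seed(2)
ok_first = ok_second = True
for _ in range(1000):
    x = random.uniform(-5, 5)
    y = random.uniform(-5, 5)
    # first-order condition (2.24): the graph lies above every tangent line
    ok_first = ok_first and f(y) >= f(x) + df(x)*(y - x) - 1e-6
    # second-order condition (2.25): the second derivative is nonnegative
    ok_second = ok_second and d2f(x) >= 0
print(ok_first, ok_second)  # True True
```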
Operations that preserve convexity

Here we briefly present a list of the most important operations that preserve convexity. Some are obvious; for the rest, proofs can be found in Boyd and Vandenberghe [2004].
• Nonnegative weighted sums: Let fi be convex functions and wi ≥ 0 for i = 0, . . . , m − 1. Then g is a convex function, where

g = w0 f0 + · · · + wm−1 fm−1.

• Pointwise maximum and supremum: Let fi be convex functions, i = 0, . . . , m − 1. Then g is a convex function, where

g = max{f0, . . . , fm−1}.

• Scalar composition rules: (For all the results below, we suppose that g(x) ∈ dom f.) Let f be a convex function. Then f(g(x)) is a convex function if
  – g is an affine function, that is, a function of the form Ax + b;
  – g is convex and f is nondecreasing; or
  – g is concave and f is nonincreasing.

The rules for vector composition can be constructed by applying the above to every argument of f. We can also obtain the rules that preserve concavity by taking the dual (in the mathematical logic sense) of each proposition above (minimum instead of maximum, decreasing instead of increasing, concave instead of convex, etc.).
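As an example of the second composition rule: with f = exp (convex and nondecreasing) and g(x) = x², the composition exp(x²) is convex. The sketch below (our own) samples Jensen's inequality (2.20) for it:

```python
import math
import random

g = lambda x: x*x               # convex
h = lambda x: math.exp(g(x))    # f(g(x)) with f = exp: convex, nondecreasing

random.seed(3)
ok = True
for _ in range(1000):
    x0, x1 = random.uniform(-2, 2), random.uniform(-2, 2)
    lam = random.random()
    z = (1 - lam)*x0 + lam*x1
    # Jensen's inequality (2.20) for the composition h
    ok = ok and h(z) <= (1 - lam)*h(x0) + lam*h(x1) + 1e-9
print(ok)  # True
```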
Chapter 3
Convex optimization
First of all, we introduce the basic concept behind mathematical optimization. An optimization problem, or program, is a mathematical problem of the form

minimize f0(x)
subject to fi(x) ≤ bi,    i = 1, . . . , m.    (3.1)

We call f0(x) the objective function, fi(x) the constraint functions and x the optimization variable. The bi are normally called the limits or bounds of the constraints. Normally, fi : Di → R, where Di ⊆ Rn, for i = 0, . . . , m. If the domains Di are not all the same set, then we look for a solution in the set D = ∩_{i=0}^{m} Di, under the condition, of course, that this is a nonempty set. If D is nonempty, any x ∈ D that satisfies the constraints is called a feasible point. If there is at least one such x, then (3.1) is called feasible; otherwise, it is called infeasible. In case D0 = Rn and f0(x) = c for all x ∈ D0 and some c ∈ R, then any x minimizes f0 trivially, as long as (3.1) is feasible. In that case, we only have to check that the constraints are consistent; in other words, verify that D is nonempty and then solve the following:

find x
subject to fi(x) ≤ bi,    i = 1, . . . , m.    (3.2)

This is called a feasibility problem, and any solution to (3.2) is a feasible point and vice versa. If we denote by S the solution set of (3.2), then the set F = S ∩ D0 is called the feasible set of (3.1).
If the constraint functions are omitted, the problem is called an unconstrained optimization problem. Also, for some i > 0, instead of an inequality we may have the equality fi(x) = bi. If we incorporate the constraint bounds into the constraint functions and rename appropriately, we can rewrite (3.1) in what is called standard form:

minimize f(x)
subject to gi(x) ≤ 0,    i = 0, . . . , p − 1,
           hi(x) = 0,    i = 0, . . . , m − 1.    (3.3)

Now that we have the problem in standard form, we can express its solution, or optimal value f∗, as

f∗ = inf{f(x) | gi(x) ≤ 0, i = 0, . . . , p − 1, hi(x) = 0, i = 0, . . . , m − 1}.

We call a point x∗ such that f(x∗) = f∗ the optimal point, or the global optimum. Of course, in this general case we may also have local optimal points: x is locally optimal if there is ε > 0 such that

f(x) = inf{f(z) | gi(z) ≤ 0, i = 0, . . . , p − 1, hi(z) = 0, i = 0, . . . , m − 1, ‖z − x‖ ≤ ε}.    (3.4)
Definition 16. Consider the optimization problem in standard form (3.3). Suppose that the following conditions are satisfied:

• The objective function f is convex; in other words, f satisfies Jensen's inequality (2.20).
• The inequality constraint functions gi, i = 0, . . . , p − 1, are also convex.
• The equality constraint functions are affine; that is, functions of the form hi(x) = aiT x + bi, i = 0, . . . , m − 1, with ai ∈ Rn, bi ∈ R.

Then (3.3) is called a convex optimization problem (in standard form), or COP.
Theorem 5. Consider any convex optimization problem and suppose it has a local optimal point. Then that point is also the global optimum.

Proof. [Boyd and Vandenberghe, 2004] Let x be a feasible and locally optimal point for the COP (3.3), so it satisfies (3.4). If it is not the global optimum, then there is a feasible y ≠ x with f(y) < f(x). Then ‖y − x‖ > ε, otherwise we would have f(x) ≤ f(y). Consider the point

z = λ̂x + λy,    λ = ε / (2‖y − x‖).

Since the objective and the constraints are convex functions, they have convex domains. By Lemma 1, the feasible set F of (3.3) is a convex set; thus z is also feasible. By definition,

‖z − x‖ = ε/2 < ε,

so f(x) ≤ f(z). But Jensen's inequality for f gives

f(z) ≤ λ̂f(x) + λf(y) < f(x),

because f(y) < f(x). By contradiction, x is the global optimum.
This fact makes algorithms for solving convex problems much simpler, more effective and faster. The reason is that we have very simple optimality conditions, especially in the differentiable case.

Theorem 6. Consider the COP (3.3) with f differentiable. Then x is optimal if and only if x is feasible and

∇f(x)T (y − x) ≥ 0,    (3.5)

for all feasible y.

Proof. (⇐) Suppose x is feasible and satisfies (3.5). Then if y is feasible, by (2.22) we have f(y) ≥ f(x), so x is optimal.

(⇒) Suppose that x is optimal, but there is a feasible y such that

∇f(x)T (y − x) < 0.

Consider the point z = λ̂x + λy, λ ∈ [0, 1]. Since the feasible set is convex, z is feasible. Furthermore,

(d/dλ) f(z) |_{λ=0} = ∇f(x)T (y − x) < 0,

which means that for small enough positive λ, f(z) < f(x). That contradicts the assumption that x was optimal.
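Theorem 6 can be made concrete on a tiny one-dimensional problem (our own example): minimize f(x) = (x − 3)² subject to x ≤ 1. The unconstrained minimizer 3 is infeasible; the optimum is x∗ = 1, and condition (3.5) holds there because ∇f(1) = −4 while y − 1 ≤ 0 for every feasible y.

```python
f = lambda x: (x - 3.0)**2
df = lambda x: 2.0*(x - 3.0)    # gradient of f

x_star = 1.0                    # optimum of: minimize f(x) subject to x <= 1
ys = [1.0 - 0.01*k for k in range(500)]   # a sample of the feasible set

# condition (3.5): grad f(x*) * (y - x*) >= 0 for every feasible y
ok = all(df(x_star)*(y - x_star) >= 0 for y in ys)
best = min(ys, key=f)           # and no sampled feasible point does better
print(ok, best)  # True 1.0
```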
For all the reasons mentioned above, it is only natural to try to extend the reach of convex optimization techniques. To achieve that, we use mathematical 'tricks' to transform some non-convex problems into convex ones, whenever that is possible.
3.1 Quasiconvex optimization
Definition 17. Let f : Rn ⊇ D → R such that D is convex and for all α ∈ R,
Sα := {x ∈ D | f (x) 6 α} are also convex sets. Then f is called quasiconvex.
The sets Sα are called the α−sublevel, or just sublevel sets of f . If −f is
quasiconvex, then f is called quasiconcave. If f is both quasiconvex and
quasiconcave, then it is called quasilinear.
Figure 3.1: A quasiconvex function f, with sublevel sets Sα ⊆ Sβ ⊆ Sγ for levels α < β < γ.
Quasiconvex functions could be loosely described as ”almost” convex
functions. However they are not convex, which means that it is possible
for a quasiconvex function to have non-global local optima. As a result,
we can’t solve (3.3), for f quasiconvex, using convex problem algorithms.
However, there is a method to approach these problems making use of the
advantages of convex optimization, and that is through convex feasibility
problems.
Let φt : Rn → R, t ∈ R, be such that φt(x) ≤ 0 ⇔ f(x) ≤ t, and also such that φt(x) is a nonincreasing function of t. We can easily find such a φt. Here is an outline of how one might construct one:

φt(x) = ⎧ linear decreasing non-negative,  x ≤ inf St,
        ⎨ 0,                               x ∈ St,
        ⎩ linear increasing non-negative,  x ≥ sup St.
If we also adjust the linear components of φt , such that it is continuous
and nonincreasing in t, then it is easy to see that this is a convex function
such as the one we are looking for. Now consider the following feasibility
problem
find x
subject to φt(x) ≤ 0
gi(x) ≤ 0,  i = 0, …, p − 1
hi(x) = 0,  i = 0, …, m − 1.     (3.6)
We still denote by f∗ the optimal value of (3.3), but now we suppose that f is quasiconvex and the constraints are as in the COP. Now, if (3.6) is feasible, then f∗ ≤ t; as a matter of fact f∗ ≤ f(x) ≤ t. If it is not feasible,
def quasib(dn,up,eps,feas):
    'Bisection method for quasiconvex optimization'
    while up-dn>eps: # eps is the tolerance level
        t=(dn+up)/2.0
        b=feas(t) # function feas solves the feasibility problem at the midpoint
        if b=='optimal':
            up=t
        else:
            dn=t
    return t
Program 3.1: quasibis.py
then f∗ > t. We can therefore construct a bisection method: starting with a large interval that contains f∗, we locate it (within a given tolerance) by repeatedly solving a convex feasibility problem and bisecting the interval.
Program 3.1 realizes that. After each step of the loop the interval is bisected, so after k steps it has length 2^{−k}(up − dn). The operation ends when that number becomes smaller than ε, so the algorithm terminates when k ≥ log2((up − dn)/ε).
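As a usage sketch (not in the original): take f(x) = |x − 3| with feasible set x ≥ 5, so f∗ = 2. The feasibility oracle is evaluated in closed form here, where a real application would solve the convex feasibility problem (3.6) with a solver.

```python
def quasib(dn, up, eps, feas):
    'Bisection method for quasiconvex optimization (as in Program 3.1)'
    while up - dn > eps:
        t = (dn + up)/2.0
        if feas(t) == 'optimal':
            up = t
        else:
            dn = t
    return t

# oracle for: minimize |x - 3| subject to x >= 5 (optimal value f* = 2);
# level t is achievable iff some x >= 5 has |x - 3| <= t, i.e. iff t >= 2
def feas(t):
    return 'optimal' if t >= 2.0 else 'infeasible'

print(quasib(0.0, 10.0, 1e-6, feas))   # ~2.0
```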
3.1.1 Golden section search
Bisection is all we need for the operation described above. But we can't discuss optimization without mentioning the golden section search. It is a very efficient technique for finding the extremum of a quasiconvex or quasiconcave function in one dimension. In contrast to the bisection method, at each step the golden section search evaluates the function at three points, and then at an additional fourth one. After the comparison, three of these four points are selected, and the procedure continues.
The algorithm starts with input a closed interval [dn, up], which we know contains the minimum, and an internal point g0. We then find which of the intervals [dn, g0], [g0, up] is the larger and evaluate the function at an internal point g1 of that interval. Then we compare the values of the function at the two points g0 and g1 and repeat the step with one of the triplets (dn, g1, g0) or (g0, up, g1). To ensure the fastest possible mean convergence time, we choose g0 as the golden mean of up and dn (g0 = θ̂dn + θup, where θ = 2/(3 + √5) and θ̂ = 1 − θ). Hence the name of the method. For the same reasons g1
Figure 3.2: Golden section search, with dn < g0 < g1 < up.
will be chosen to be dn − g0 + up, thus ensuring that up − g0 = g1 − dn.
To be more precise, take an example where we seek the minimum of a quasiconvex function f (Figure 3.2). We begin with the interval [dn, up], which we know contains the minimum, and the internal point g0. From the choice of g0, [g0, up] will be the larger subinterval. Next we find g1 and compare f(g0) to f(g1). If the former is greater than the latter, we repeat the step with [g0, up] and g1; otherwise we repeat with [dn, g1] and g0. The
algorithm is an extremely efficient one-dimensional localization method
and is closely related to Fibonacci search.
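The dissertation describes the method without a listing; the following is a sketch (not from the original) of one way to implement it. Each loop iteration costs one new function evaluation and shrinks the interval by the factor 1 − θ ≈ 0.618.

```python
import math

def golden(f, dn, up, eps=1e-8):
    'Golden section search for the minimum of a quasiconvex f on [dn, up].'
    theta = 2.0/(3.0 + math.sqrt(5.0))   # ~0.382, the golden mean of the text
    g0 = dn + theta*(up - dn)            # internal point at the golden mean
    f0 = f(g0)
    while up - dn > eps:
        g1 = dn + up - g0                # mirror point: up - g0 == g1 - dn
        f1 = f(g1)
        if g1 < g0:                      # keep g0 < g1 for the comparison below
            g0, g1, f0, f1 = g1, g0, f1, f0
        if f0 < f1:
            up = g1                      # minimum lies in [dn, g1]; keep g0
        else:
            dn, g0, f0 = g0, g1, f1      # minimum lies in [g0, up]; keep g1
    return (dn + up)/2.0

print(golden(lambda x: (x - 1.234)**2, 0.0, 3.0))   # ~1.234
```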
3.2 Duality
It is necessary, before introducing the notion of the dual problem, to give
some basic definitions.
Definition 18. Consider the optimization problem in standard form (3.3). Define the Lagrange dual function g : Rp × Rm → R as

g(λ, ν) = inf_{x∈D} ( f(x) + ∑_{i=0}^{p−1} λi gi(x) + ∑_{i=0}^{m−1} νi hi(x) ),     (3.7)

where D = dom(f) ∩ ⋂_{i=0}^{p−1} dom(gi) ∩ ⋂_{i=0}^{m−1} dom(hi). The quantity whose infimum is taken,

L(x, λ, ν) := f(x) + ∑_{i=0}^{p−1} λi gi(x) + ∑_{i=0}^{m−1} νi hi(x),

is called the Lagrangian of the problem. The variables λ and ν are called the dual variables and λi, νi the Lagrange multipliers associated with the ith inequality and equality constraint respectively.
The reason to present the dual function is that it can be used to provide a lower bound on the optimal value f∗ of (3.3), for all λ ≽ 0 and for all ν.
Proof. Consider a feasible point x̄ for (3.3). Then gi(x̄) ≤ 0 for all i = 0, …, p − 1 and hi(x̄) = 0 for all i = 0, …, m − 1. Take λ ≽ 0. Then

∑_{i=0}^{p−1} λi gi(x̄) + ∑_{i=0}^{m−1} νi hi(x̄) ≤ 0  ⇒  L(x̄, λ, ν) ≤ f(x̄)  ⇒

g(λ, ν) = inf_{x∈D} L(x, λ, ν) ≤ L(x̄, λ, ν) ≤ f(x̄).
What's more, since g is the pointwise infimum of a family of affine functions of (λ, ν), it is always concave, even though (3.3) might not be convex. This makes the search for the best lower bound a convex problem.
Definition 19. Consider (3.3), the optimization problem in standard form.
Then the problem
maximize g(λ, ν)
subject to λ ≽ 0,
(3.8)
is called the Lagrange dual problem associated with (3.3). In that context, the
latter will also be referred to as the primal problem.
We can see that, for the reasons already mentioned, the dual problem is always convex, even if the primal is not. Thus we can always get a lower bound for the optimal value of any problem of the form (3.3) by solving a convex optimization problem.
Lemma 4 (Weak Duality). Let f∗ be the optimal value of the primal problem (3.3), and d∗ the optimal value of the dual problem (3.8). Then d∗ is always a lower bound for f∗: f∗ ≥ d∗.
We have already proven this property for all values of g, so it trivially holds for the maximum value of g. Even more useful is the property that holds when the primal is a convex problem. Then, under mild conditions, we have guaranteed equality and the dual problem is equivalent to the primal. A simple sufficient condition is Slater's condition: the existence of a feasible point x such that gi(x) < 0 for every i for which gi is not affine. The proof of the next theorem is given in Boyd and Vandenberghe [2004].
Theorem 7 (Strong Duality). Let (3.3) be a convex optimization problem. Let
(3.8) be its dual. If (3.3) also satisfies Slater’s condition, then f ∗ = d∗ , where f ∗
and d∗ are the optimal values of the primal and the dual problem accordingly.
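Weak and strong duality can be seen concretely on a toy problem (not from the original; the instance is chosen for illustration): minimize f(x) = x² subject to 1 − x ≤ 0, so f∗ = 1 at x = 1, and Slater's condition holds (e.g. at x = 2).

```python
# primal: minimize x^2 subject to g0(x) = 1 - x <= 0  (f* = 1 at x = 1)
# Lagrangian: L(x, lam) = x^2 + lam*(1 - x); minimizing over x gives x = lam/2,
# so the dual function is g(lam) = lam - lam^2/4.
def dual(lam):
    return lam - lam**2/4.0

f_star = 1.0

# weak duality (Lemma 4): g(lam) <= f* for every lam >= 0
assert all(dual(k/10.0) <= f_star + 1e-12 for k in range(100))

# strong duality (Theorem 7; Slater holds): the dual maximum over lam >= 0
# is attained at lam = 2, where g(2) = 1 = f*
d_star = max(dual(k/1000.0) for k in range(5000))
print(d_star)   # 1.0
```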
Many solvers in optimization software rely on iterative techniques based on Theorem 7. They compute the values of the dual and the primal objectives at an initial point and find the difference between the two; if it is greater than some tolerance level, they continue and evaluate at the next point, which is chosen based on the values of the gradient and the Hessian. The operation terminates when the tolerance level is reached.
3.3 Linear programming
An important class of optimization problems is linear programming. In this
class of problems, the objective function and the constraints are affine.
Thus we have the general form:
minimize cᵀx + d
subject to Gx ≼ h
Ax = b,     (3.9)

where G ∈ Rp×n, A ∈ Rm×n, c ∈ Rn, h ∈ Rp, b ∈ Rm and d ∈ R. The above is the general form of a linear program, and will be referred to as LP. Since affine
functions are convex, linear programming can be considered as a special
case of convex optimization. However, it has been developed separately,
and nowadays we have very efficient ways and algorithms to solve the
general LP. Note that the feasible set of an LP is a polyhedron.
Although (3.9) is in standard form as a COP, it is not in LP standard
form. The standard form LP would be
minimize cᵀx
subject to Ax = b
x ≽ 0.     (LP)
There is also the inequality form LP which very often occurs.
minimize cᵀx
subject to Ax ≼ b.     (3.10)
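To make the polyhedral picture concrete, here is a small sketch (not from the original): a two-variable instance of the inequality-form LP (3.10), solved naively by enumerating the vertices of its polyhedron. A bounded LP attains its optimum at a vertex; real solvers use the simplex or interior-point methods instead.

```python
from itertools import combinations

# minimize c^T x subject to G x <= h, i.e. maximize x0 + 2*x1 over the box
# 0 <= x0 <= 3, 0 <= x1 <= 2 intersected with x0 + x1 <= 4
c = (-1.0, -2.0)
G = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0), (1.0, 1.0)]
h = [3.0, 2.0, 0.0, 0.0, 4.0]

def vertices(G, h):
    'intersections of constraint pairs that satisfy all the inequalities'
    for (g1, h1), (g2, h2) in combinations(zip(G, h), 2):
        det = g1[0]*g2[1] - g1[1]*g2[0]
        if abs(det) < 1e-12:
            continue                       # parallel constraint pair
        x = ((h1*g2[1] - h2*g1[1])/det, (g1[0]*h2 - g2[0]*h1)/det)
        if all(g[0]*x[0] + g[1]*x[1] <= hh + 1e-9 for g, hh in zip(G, h)):
            yield x

best = min(vertices(G, h), key=lambda x: c[0]*x[0] + c[1]*x[1])
print(best)   # (2.0, 2.0), with optimal value -6
```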
Now, given the standard form LP (LP), we can find its dual:

maximize bᵀy
subject to Aᵀy + z = c
z ≽ 0.     (3.11)
Figure 3.3: The objective functions fi(x) = (ci x + di)/(ei x + fi) for a one-dimensional generalised linear-fractional problem. For every value of x, the maximum is taken over all dashed curves. The result is a quasiconvex curve, shown in bold.
3.3.1 Linear-fractional programming
If we replace the linear objective in (3.9) with a ratio of linear functions,
then the problem is called a linear-fractional program.
minimize (cᵀx + d)/(eᵀx + f)
subject to eᵀx + f > 0
Gx ≼ h
Ax = b.     (LFP)
This problem is no longer convex, but quasiconvex. In this case, however, there is a more effective approach than quasiconvex optimization. The problem can be transformed into an equivalent LP, at the cost of adding one extra optimization variable and under the condition that the feasible set of (LFP) is nonempty. Consider the LP
minimize cᵀy + dz
subject to Gy − hz ≼ 0
Ay − bz = 0
eᵀy + fz = 1
z ≥ 0.     (3.12)
This problem is equivalent to (LFP) [Boyd and Vandenberghe, 2004]. Also,
if we let x∗ and y ∗ , z ∗ be the optimal points of these two problems, then
Figure 3.4: Left: Quadratic Program. Right: Quadratically Constrained Quadratic Program. The shaded region is the feasible set F, the thin curves are the contour lines of the objective function f and x∗ is the optimal point.
x∗ = y∗/z∗, where y∗ = x∗/(eᵀx∗ + f) and z∗ = 1/(eᵀx∗ + f).
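The equivalence can be checked numerically on a one-dimensional instance (not from the original; the instance is invented for illustration): minimize (x + 1)/(x + 2) over 0 ≤ x ≤ 1, whose optimal value is 1/2 at x = 0, against the transformed LP (3.12).

```python
# LFP instance: minimize (x + 1)/(x + 2) subject to 0 <= x <= 1
# (c = d = 1, e = 1, f = 2; e^T x + f > 0 on the feasible set; f* = 1/2)
lfp_opt = min((x/1000.0 + 1.0)/(x/1000.0 + 2.0) for x in range(1001))

# transformed LP (3.12): minimize y + z subject to y - z <= 0, -y <= 0,
# y + 2z = 1, z >= 0; eliminating z = (1 - y)/2 leaves a 1-D search over y
lp_opt = min(y/1000.0 + (1.0 - y/1000.0)/2.0
             for y in range(1001)
             if 0.0 <= y/1000.0 <= (1.0 - y/1000.0)/2.0)

print(abs(lfp_opt - lp_opt) < 1e-9)   # True: both optimal values equal 1/2
```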
If we make one small alteration, however, the problem becomes much more difficult and no longer equivalent to an LP. We can change the objective into

max_{0≤i<ρ} (ciᵀx + di)/(eiᵀx + fi),

which means that we have ρ linear-fractional functions whose maximum we want to minimize. This problem is the generalised linear-fractional program and is also quasiconvex. We cannot, however, apply the transformation to an LP here, as it is possible that for different values of x the maximum is attained at different i (Figure 3.3). Thus, this problem is significantly harder.
3.4 Quadratic programming
If the objective in (3.3) is a convex quadratic function and the equality constraint functions are affine, then we have a quadratic program (QP). If, in addition, the inequality constraint functions gi are convex quadratic, then we have a quadratically constrained quadratic program (QCQP). A QP and a QCQP
can have the forms

minimize (1/2)xᵀPx + qᵀx + r
subject to Gx ≼ h
Ax = b,     (QP)

minimize (1/2)xᵀPx + qᵀx + r
subject to (1/2)xᵀPi x + qiᵀx + ri ≤ 0,  i = 0, …, p − 1
Ax = b,     (QCQP)

respectively, where x ∈ Rn, P, P0, …, Pp−1 ∈ Sn+, G ∈ Rp×n, A ∈ Rm×n, h ∈ Rp, b ∈ Rm, q, qi ∈ Rn, r, ri ∈ R.
A very well-known example of a QP is the least-squares problem of regression analysis, where the objective function is ‖Ax − b‖₂². However, this unconstrained minimization problem does have an analytic solution, namely x = Cb, where

C = VΣ⁻¹Uᵀ

and UΣVᵀ is the singular value decomposition of A [Boyd and Vandenberghe, 2004, A.5.4].
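A minimal sketch of the analytic solution (not from the original; it uses a single-column A so no linear-algebra library is needed — for general A the SVD-based formula above is the numerically preferred route):

```python
# least squares: minimize ||Ax - b||_2^2 with A a single column of ones,
# so x = (A^T A)^{-1} A^T b reduces to (a.b)/(a.a), the mean of b
a = [1.0, 1.0, 1.0]                      # the single column of A
b = [1.0, 2.0, 4.0]
x = sum(ai*bi for ai, bi in zip(a, b)) / sum(ai*ai for ai in a)
print(x)   # 7/3 = 2.333...: the mean of b minimizes sum of (x - b_i)^2
```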
3.4.1 Second-order cone programming
A generalization of the QCQP is the problem
minimize cᵀx
subject to ‖Ai x + bi‖₂ ≤ eiᵀx + di,  i = 0, …, p − 1
Fx = g,     (SOCP)
where x ∈ Rn , Ai ∈ Rni ×n , bi ∈ Rni , c, ei ∈ Rn , di ∈ R, F ∈ Rm×n and g ∈ Rm .
Problem (SOCP) is called a second-order cone program. The name is due to
the fact that we require (Ai x + bi , eTi x + di ) to lie in the second order cone
(2.16) for i = 0, . . . , p − 1.
3.5 Semidefinite programming
We can also allow generalised matrix inequalities for our constraints. An
optimization problem with linear objective and affine constraints including equalities, inequalities and generalised inequalities, is called a conic
form problem or a cone program. We can easily include all inequality constraints into one generalised matrix inequality constraint using simple linear algebra. So we can write the general cone program as
minimize cT x
subject to Fx + g ≼_Ω 0
Ax = b,
(3.13)
where c, g, b are the vector parameters of the problem and F, A are matrices
and Ω is a proper cone. Since proper cones are convex sets, this problem
remains a convex optimization problem.
In the case that Ω = Sn+ , the positive semidefinite cone, then (3.13) is
called a semidefinite program or just SDP. In semidefinite programming,
the inequality constraint can be interpreted as requiring for a symmetric
matrix to be positive semidefinite. The standard form for the SDP is
minimize cT x
subject to x0 F0 + x1 F1 + · · · + x_{n−1} F_{n−1} + G ≼ 0
Ax = b,
(SDP)
where F0, F1, …, F_{n−1}, G ∈ Sn, A ∈ Rm×n and b, c ∈ Rn. We could see the SDP as
a generalisation of the LP. As a matter of fact, if F0 , . . . , Fn−1 , G are all
diagonal, then (SDP) is a LP with n linear inequality and m linear equality
constraints.
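A tiny sketch of the diagonal case (not from the original; the numbers are invented): when the Fi and G are diagonal, the matrix inequality is negative semidefinite exactly when every diagonal entry is ≤ 0, so the single SDP constraint collapses into n ordinary linear inequalities.

```python
# diagonal F0, F1, G for n = 2, stored as their diagonals only
F0 = [1.0, 0.0]
F1 = [0.0, 1.0]
G  = [-3.0, -1.0]

def psd_constraint_holds(x):
    'x0*F0 + x1*F1 + G <= 0 (matrix sense) for diagonal matrices'
    diag = [x[0]*f0 + x[1]*f1 + g for f0, f1, g in zip(F0, F1, G)]
    # a diagonal matrix is negative semidefinite iff its diagonal is <= 0
    return all(d <= 0 for d in diag)

print(psd_constraint_holds([2.0, 0.5]))   # True:  2-3 <= 0 and 0.5-1 <= 0
print(psd_constraint_holds([4.0, 0.5]))   # False: 4-3 > 0
```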
There has been much software developed for solving SDPs. Due to the nature of the problem, these solvers are very efficient and reliable [Benson et al., 2000, Benson and Ye, 2008, Vandenberghe et al., 1998].
There are other types of convex optimization problems, like geometric
programming, which we do not mention as we are not considering any applications of these types in this report.
Chapter 4
Applications
In this chapter we present a set of real optimization problems that we try to understand better and, if possible, solve with our Python tools. We attempt a new perspective on these problems, which will hopefully lead to better ways of solving them.
4.1 The Virtual Data Center problem
This problem was suggested to us by members of the BT ICT Research team as a yet unsolved problem related to a service which is still in the planning process: Infrastructure as a Service. We consider M servers providing services via a network to N users. These servers could be Data Centers equipped with powerful hardware, and the users can be customers who use part of that equipment for their own business, at a lower cost than buying the equipment on their own. Information is transferred through a network. There is a cost for every user to connect to any server, which could be latency or any other performance measure. The objective is to assign users to servers in a way that maximizes performance by minimizing the connection cost. There are capacity constraints on the servers. For simplicity, we assume that all users take up the same amount of resources, so the constraint will be on the maximum number of users a server can be connected to. Another constraint is that only one server can be connected to each user: we don't allow a user to be connected to more than one server.
In the case M = N, this is the Assignment Problem, where N people must be assigned to N jobs, with cost αij for person i to do job j. If we were to allow any number of users to be connected to any number of servers, this would be the general case of the Transportation Problem
[Garvin, 1960].
But let's formulate this problem rigorously. Let C be the cost matrix we described above: C will be M × N and cij will be the cost for user j to be connected to server i. Our optimization variable X will also be an M × N matrix, with elements 0 or 1; if element ij is 1, that implies a connection between server i and user j. There is also an M-dimensional vector a, whose ith element is the capacity of server i. Our objective then will be to minimize trace(CᵀX) = ∑_{i,j} cij xij, which is the total connection cost, under the condition that server i is connected to ai users at most.
In order to be consistent with the LP format, and also to implement the problem more easily, instead of the matrices C and X we will use c, x, which are both MN × 1 and are the same as the original problem parameters, only "flattened". To be more precise, the ijth element of C will be the (i + jM)th element of c, and the same for X and x.
C = [cij] −→ c = (c00, c10, …, c_{M−1,0}, c01, c11, …, c_{M−1,1}, …, c_{M−1,N−1})ᵀ,
X = [xij] −→ x = (x00, x10, …, x_{M−1,0}, x01, x11, …, x_{M−1,N−1})ᵀ;

that is, the columns of C and X are stacked on top of each other into single MN-dimensional column vectors.
This formulation is common in literature and the notation used is
c = vec(C), x = vec(X).
(4.1)
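A tiny sketch of the flattening (not in the original), using the cost numbers that Program 4.1 below feeds to the solver, with M = 2 servers and N = 3 users; entry (i, j) of C lands at position i + jM of c, so the columns are stacked in order:

```python
M, N = 2, 3
C = [[10.0, 12.0, 15.0],
     [22.0, 21.0, 25.0]]          # C[i][j]: cost of user j on server i
c = [C[i][j] for j in range(N) for i in range(M)]   # vec(C): columns stacked
print(c)   # [10.0, 22.0, 12.0, 21.0, 15.0, 25.0]
```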
We will use this notation to describe the above transformation from here on. After that formulation the objective is just cᵀx, and we can now give the problem in standard LP form. To keep things as simple as possible we will denote k(i, j) = i + jM:
minimize cᵀx
subject to ∑_{0≤i<M} x_{k(i,j)} = 1,  j = 0, …, N − 1
∑_{0≤j<N} x_{k(i,j)} ≤ ai,  i = 0, …, M − 1     (4.2)
x ≥ 0
The first constraint simply says that exactly one element of every column
of X must be non-zero, which means that exactly one connection is allowed per user. The second is just the capacity constraint on the data
from cvxopt.base import matrix,spmatrix
from cvxopt.solvers import lp

def VDCopt(n_DC,n_users,cost,capacity):
    ln,p,dict=len(capacity),n_DC*n_users,{}
    rindx=[0]*p+[[i+1]*n_DC for i in xrange(n_users-1)]
    rindx[p:]=[rindx[p:][i][j] for i in xrange(n_users-1) for j in xrange(n_DC)]
    cindx=range(p)+[range(n_DC*(i+1))[i*n_DC:(i+1)*n_DC] for i in xrange(n_users)]
    cindx[p:]=[cindx[p:][i][j] for i in xrange(n_users-1) for j in xrange(n_DC)]
    A=spmatrix(1.0,rindx,cindx)
    b=matrix(1.0,(n_users,1))
    b[0]=n_users
    capr=[[p+i]*n_users for i in xrange(ln)]
    caprindx=[capr[i][j] for i in xrange(ln) for j in xrange(n_users)]
    capc=[range(p)[i::n_DC] for i in range(ln)]
    capcindx=[capc[i][j] for i in xrange(ln) for j in xrange(n_users)]
    G=spmatrix([-1.0]*p+[1.0]*(ln*n_users),range(p)+caprindx,range(p)+capcindx)
    h=matrix([0.0]*p+capacity,(p+ln,1))
    a=lp(cost,G,h,A,b,solver='glpk')['x']
    dict['Optimal value']=(cost.T*a)[0]
    for i in xrange(p):
        if a[i]>0.9:
            # give solution in the form "u_j:(DC_i,cost_ij)"
            dict['u_%d'%(i/n_DC,)]=('DC_%d'%(i%n_DC,),cost[i])
    return dict

if __name__=='__main__':
    cost=matrix([10.0, 22.0, 12.0, 21.0, 15.0, 25.0]) # [c00,c10,c01,c11,c02,c12]
    capacity=[2,2]
    print VDCopt(2,3,cost,capacity)
Program 4.1: VDCopt.py: A simplified function VDCopt which solves the
VDC problem. Its arguments are the number of users and data centers,
a matrix ”cost” which is as vec(C) in (4.1) and a list of integers, for the
capacity of each data center. In this example we have 3 users and 2 data
centers with capacity 2 for each. Although it would be cheaper for all
the users to be connected to the first data center (with corresponding costs
10,12 and 15), the capacity constraint forces the solution: {’Optimal value’:
46.0, ’u 2’: (’DC 0’, 15.0), ’u 1’: (’DC 1’, 21.0), ’u 0’: (’DC 0’, 10.0)}
Figure 4.1: The solution of the virtual data center problem. Left: the small example of Program 4.1 (optimal value 46). Right: a bigger example with 10 data centers and 30 users (optimal value 334). The numbers in brackets denote the capacity of each data center. The labels on the edges denote the cost of connection between data center i and customer j.
centers. We don't need to add any additional constraints to force x to be a 0–1 vector: the integer right-hand sides of the constraints force an integer optimal solution (proof in Garvin [1960], pages 92, 116), given that we use the simplex method to calculate the solution. That is ensured by using the 'glpk' optional solver in CVXOPT [Makhorin, 2003]. Then the positivity constraint, together with the constraint that the elements in each column add up to 1, allows any element to be either 0 or 1.
Program 4.1 gives a simple function VDCopt which solves (4.2) for a small example and returns a dictionary with the assignments and the value of the minimal cost. In our examples we use integer connection costs in the range 10 to 25, which is realistic enough if the cost is latency measured in hops (the number of intermediate routers, or points, between the source and the target of the transmission) and the internet, or any other extended network, is used to provide the connections.
4.1.1 Scalability
An important question to ask about such a network is: how does it scale as the number of users grows? In other words, for how large a number of users can our optimal configuration be considered satisfactory? That depends first of all on the characteristics of the network itself.
We have considered two cases. In the first, the capacities of the data centers come from a uniform-like distribution. In the second, we take the extreme case of one big data center with large capacity, with all the others very small in comparison and practically equivalent to each other. In both cases we vary the number of users from the number of data centers up to the maximum capacity of the network. The results are shown in Figure 4.2.
What we can conclude from the results is that in the first case the network responds extremely well to the increasing load of users. In plain terms, that means that the provider can guarantee performance to the customer: no matter how many users are added, even when the capacity of the network is reached, the user will never notice the difference.
In the second case we see a completely different picture. For a small number of users the picture is similar to the previous case. However, there seems to be a critical number beyond which the average cost, or cost per user, increases dramatically. How soon that critical number is reached depends of course on the number of data centers. In the bottom-right example in Figure 4.2 it has not been reached even when the number of users exceeds 3000, although eventually it will be.
The above describe two different situations. We can imagine a company providing the data center service trying to decide how to set up their network. One choice would be to decide beforehand how big the capacity of the network will ever have to be, and distribute that almost evenly among their data centers. The other choice would be to build one large data center in a place with high demand at the present time, such as an industrial city, and small ones spread out over a wider area. The large data center could possibly be upgraded in the future to cope with an increase in the number of customers. At first that might be the cheapest solution, but as the number of customers grew, the first set-up would be the more reliable and efficient.
Figure 4.2: Scalability of the virtual data center problem. Top: 10 data
centers and users ranging from 10 to 1000. Middle: 30 data centers and
users ranging from 30 to 3000. Bottom: 50 data centers and users ranging
from 50 to 5000. Left: sizes of data centers uniformly distributed. Right: most of the total network capacity is concentrated in one data center and the remainder is evenly distributed among the others.
4.2 The Lovász ϑ function
We are interested in calculating the clique and the chromatic number of a given graph. Finding the chromatic number of a graph is the same problem as finding the minimum number of colours to be used in a map so that no two countries that share a border have the same colour; hence the name "chromatic". Another problem related to the chromatic number is finding the least number of channels to be used in a wireless communications network so that there is no interference in the network.
4.2.1 Graph theory basics
First of all, let’s introduce some basic concepts about graphs. Intuitively, a
graph is a set of nodes, some pairs of which are connected. Images come
to mind, varying from two connected dots, to complex representations of
networks (see Figure 4.3).
Definition 20. Let V be a set. Let E ⊆ 𝒱, where 𝒱 is the set of all possible pairs of elements of V. The ordered pair (V, E) is called the graph (or undirected graph) with vertices (or nodes) V and edges E. If the elements of E are ordered pairs, then (V, E) is called a directed graph. We denote by 𝒢 the set of all graphs. (Note: here we will only discuss undirected graphs on a finite set of vertices, so we will drop the characterisation "undirected" and imply it whenever we write "graph".)
Definition 21. An induced subgraph of (V, E) is an ordered pair (U, D), where U ⊆ V, D = E ∩ 𝒰 and 𝒰 is the set of all possible pairs of elements of U. (Note: again, we are only going to consider induced subgraphs, so we will drop the characterisation "induced" and just write "subgraph". We will also write F ⊆ G if and only if F is an induced subgraph of G.)
We have

‖𝒱‖ = C(n, 2) = n(n − 1)/2,

where n = ‖V‖. It is convenient notation to use an uppercase letter to represent a graph, so we can define a graph as G = (V, E) or even G(V, E), and for later reference we might just write graph G.
Let's take an example. Let V = {0, 1, 2, 3, 4}. Then
𝒱 = {{0, 1}, {0, 2}, {0, 3}, {0, 4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}}.
Figure 4.3: Left: a simple graph with 5 nodes and 7 edges. Right: its
complement.
Let E = {{0, 1}, {0, 3}, {0, 4}, {1, 2}, {1, 4}, {2, 4}, {3, 4}}. Now we can define the graph G(V, E), whose graphical representation is shown in Figure 4.3.
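These sets are small enough to build explicitly; the following sketch (not in the original) constructs 𝒱, E and the set difference 𝒱 \ E, which will reappear below as the edge set of the complement graph:

```python
from itertools import combinations

V = [0, 1, 2, 3, 4]
calV = set(frozenset(p) for p in combinations(V, 2))   # all pairs: C(5,2) = 10
E = set(frozenset(e) for e in
        [(0,1), (0,3), (0,4), (1,2), (1,4), (2,4), (3,4)])

E_complement = calV - E                  # the remaining pairs
print(len(calV), len(E), len(E_complement))        # 10 7 3
print(sorted(sorted(e) for e in E_complement))     # [[0, 2], [1, 3], [2, 3]]
```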
The labels of the nodes are not important, in the sense that the exact same graph, or "shape", can be drawn with the nodes labelled differently. What is important is the topology of the graph. Nor is it important that we chose natural numbers for the labels; as in the definition, V can be any set, not necessarily a subset of the real numbers. Here are some more definitions that will be useful further on.
Definition 22. Let G = (V, E) ∈ 𝒢. Define the graph G̃ = (V, Ẽ), where Ẽ = 𝒱 \ E. Then G̃ is called the complement of G.
Definition 23. Let G(V, E) ∈ G. If ω(F ) = χ(F ) for every F ⊆ G, then we
say G is perfect.
4.2.2 Two NP-complete problems
There are two problems related to graphs that we are interested in. Both of them are NP-complete, which means that no polynomial-time algorithm for them is known (and none exists unless P = NP). Following the notation given above, we give the following.
Definition 24. A graph G(V, E) is complete if E = 𝒱. A subgraph F of G is called a clique if it is complete. The number of vertices in a largest clique of G is called the clique number of G. The minimum number of different labels that can be assigned to the nodes of G, such that every label appears in each clique at most once, is called the chromatic number of G.
We will denote the clique number of a graph G by ω(G) and its chromatic number by χ(G). As we've already mentioned, calculating ω and χ for an arbitrary graph are two very complex problems. However, in special cases there is a way to compute both numbers in polynomial time.
4.2.3 The sandwich theorem
In 1981, Grötschel, Lovász, and Schrijver [Knuth, 1994] proved that for any
graph G, there is a real number that lies between ω(G) and χ(G), which
can be calculated in polynomial time. The computation involves solving a
semidefinite program for the complement of the graph and that makes it
a convex optimization problem.
Definition 25 (The Lovász ϑ function). Let G(V, E) ∈ 𝒢. Consider the following optimization problem:

maximize trace 1n 1nᵀ X
subject to xij = 0 for {i, j} ∈ E
trace X = 1
X ≽ 0,     (4.3)

where X = [xij] ∈ Rn×n, 1n is the n-dimensional column vector of all ones and n = ‖V‖. Define ϑ : 𝒢 → R such that ϑ(G) = f∗, where f∗ is the optimal value of (4.3).
As given, (4.3) is not an SDP, but we can construct an equivalent SDP in standard form. First we note that trace 1n 1nᵀ X is just another way of writing ∑_{i,j} xij. We can force the trace X = 1 constraint by demanding

x_{n−1,n−1} = 1 − ∑_{0≤i<n−1} xii.

We can also enforce the xij = 0 for {i, j} ∈ E constraint a priori. Then our objective is simplified, as all elements of the diagonal add to 1, and the result is

∑_{i,j} xij = 1 + 2 ∑_{{i,j}∉E, i<j} xij.

And this is exactly the quantity we want to maximize. The factor of two appears due to symmetry ({i, j} ∈ E ⇔ {j, i} ∈ E). Our problem now is to maximize the above quantity under the condition that the original matrix stays positive semidefinite. Let m = ‖E‖ and d = ‖𝒱‖ −
'''
Calculation of theta(G), for
G=({0,1,2,3,4}, {{0,1},{0,3},{0,4},{1,2},{1,4},{2,4},{3,4}})
'''
from cvxopt.solvers import sdp
from cvxopt.base import matrix,spmatrix

def thetasmpl(n,m,ce):
    'Given the edges of the complement of G, returns theta(G)'
    d=n*(n-1)/2-m+n
    c=-matrix([0.0]*(n-1)+[2.0]*(d-n))
    Xrow=[i*(1+n) for i in xrange(n-1)]+[a[1]+a[0]*n for a in ce]
    Xcol=range(n-1)+range(d-1)[n-1:]
    X=spmatrix(1.0,Xrow,Xcol,(n*n,d-1))
    for i in xrange(n-1): X[n*n-1,i]=-1.0
    sol=sdp(c,Gs=[-X],hs=[-matrix([0.0]*(n*n-1)+[-1.0],(n,n))])
    return 1.0+matrix(-c,(1,d-1))*sol['x']

if __name__=='__main__':
    M,N=7,5 # the number of edges and nodes in G
    cedges=[(0,2),(1,3),(2,3)] # list of the edges of the complement of G
    print thetasmpl(n=N,m=M,ce=cedges)
Program 4.2: graphex0opt.py. The program returns 2, which is the clique
number and the chromatic number of the complement of G.
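As a sanity check on that output (not in the original), the clique and chromatic numbers of the complement graph can be brute-forced for this small instance; both equal 2, matching ϑ(G) = 2 via the sandwich theorem below.

```python
from itertools import combinations, product

n = 5
edges = [(0, 2), (1, 3), (2, 3)]   # edges of the complement of the example G

def clique_number(n, edges):
    'omega by brute force: largest all-pairs-connected vertex subset'
    es = set(map(frozenset, edges))
    for size in range(n, 0, -1):
        for S in combinations(range(n), size):
            if all(frozenset(p) in es for p in combinations(S, 2)):
                return size

def chromatic_number(n, edges):
    'chi by brute force: smallest k admitting a proper k-colouring'
    for k in range(1, n + 1):
        for col in product(range(k), repeat=n):
            if all(col[i] != col[j] for i, j in edges):
                return k

print(clique_number(n, edges), chromatic_number(n, edges))   # 2 2
```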
m + n. This is the number of non-zero elements of X in the lower triangular part (which is exactly ‖Ẽ‖) plus the elements of the diagonal. If we subtract one more element, namely x_{n−1,n−1}, since we made that redundant, we have the dimension of our problem. That means that if our optimization variable is x, then x = [x_{k⁻¹(0)}, x_{k⁻¹(1)}, …, x_{k⁻¹(d−2)}]ᵀ ∈ R^{d−1}, where k is any isomorphism

k : Ẽ ∪ {{0, 0}, {1, 1}, …, {n − 2, n − 2}} → {0, 1, 2, …, d − 2}.

Now let c be a (d − 1)-dimensional vector; that will be the coefficient vector in our objective. To make cᵀx give the result we want, we put zeros in n − 1 places in c and twos in the rest. The positions of the zeros correspond to the positions of the diagonal elements of X in x. We have to leave the constant outside for now, but we can add it in at the end. The last thing to do is to construct the constraint X ≽ 0: what is left is to place the elements of x back in their original places in X. Of course, in order to retain the standard form, we will have the constraint −X ≼ 0
4.2 The Lovász ϑ function
45
instead. So now we have the following problem
maximize    cᵀx
subject to  −∑_{0≤i<d−1} x_{k⁻¹(i)} H_i − H_{d−1} ⪯ 0,        (4.4)

where H_i ∈ R^{n×n}, 0 ≤ i < d − 1, are matrices of all zeros except for the element in position k⁻¹(i), which will be 1. Also, if c_i = 0 then the matrix H_i will have a −1 in the last position (n − 1, n − 1). Finally, H_{d−1} ∈ R^{n×n} will be a matrix of all zeros, except for the element in the last position (n − 1, n − 1), which will be 1. Here is an illustration of the problem's parameters, where we place the diagonal elements of X in the first n − 1 positions of x.


    [ x_{00}  x_{01}  ···                          ]
X = [ x_{10}    ⋱                                  ] ,
    [   ⋮             1 − ∑_{0≤i<n−1} x_{ii}       ]

c = [0, . . . , 0, 2, . . . , 2]ᵀ ∈ R^{d−1},  with c_i = 0 ⇔ i ∈ {0, 1, . . . , n − 2},

x = [x_{00}, x_{11}, . . . , x_{n−2,n−2}, . . . , x_{k⁻¹(i)}, . . .]ᵀ,  i ∈ {n − 1, n, . . . , d − 2},

      [ 0 ···  0 ]         [ 0 ··· 0 ]             [ 0 ··· 0 ]
H_s = [ ⋮  1     ] ,  H_r = [ ⋮  1  ⋮ ] ,  H_{d−1} = [ ⋮  ⋱  ⋮ ] ,
      [ 0 ··· −1 ]         [ 0 ··· 0 ]             [ 0 ··· 1 ]

where 0 ≤ s < n − 1, n − 1 ≤ r < d − 1, and the 1 in H_s and H_r goes in the k⁻¹(s)th and the k⁻¹(r)th position respectively. It can be seen that (4.4) is an SDP in standard form, without the equality constraints, and if f∗ is its optimal value, then ϑ(G) = f∗ + 1. Program 4.2 gives a function that solves (4.4) for this example in a simple but, for now, adequate way. (A better function has been constructed to solve the problem and extract the data for the results further down.)
Theorem 8 (The sandwich theorem). Let G(V, E) ∈ G and let G̃(V, Ẽ) be its complement. Then

ω(G̃) ≤ ϑ(G) ≤ χ(G̃).        (4.5)
(Proof in Lovász [1979].) An immediate result of this theorem is that for perfect graphs, χ and ω can be calculated in polynomial time, as the ϑ function of the complement graph. But even in the general case, having a lower bound for the chromatic number and an upper bound for the clique number of any graph is a big advantage. For historical reasons, here is the original definition of ϑ as given in Lovász [1979].
Figure 4.4: A geometric random graph with 100 nodes. [.pdf picture generated with generate_grg_graph.py, written by Dr. Keith M. Briggs.]
Definition 26. Let G(V, E) ∈ G. Also let

A := {A ∈ Sⁿ | a_{ij} = 0 if {i, j} ∈ E}.

Denote by λ₀(A) ≥ λ₁(A) ≥ · · · ≥ λ_{n−1}(A) the eigenvalues of A. Then

ϑ(G) = max_{A∈A} ( 1 − λ₀(A)/λ_{n−1}(A) ),

where n = ‖V‖.
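Any single matrix A in the set A above gives the lower bound 1 − λ₀(A)/λ_{n−1}(A) on the maximum. As a small numerical sketch (our own illustration, assuming numpy is available; it is not part of the original text), for G = C₅ the adjacency matrix of the complement is an admissible A and already attains the known value ϑ(C₅) = √5:

```python
import numpy as np

# G = C5; its complement is again a 5-cycle, whose adjacency matrix has
# zeros exactly on the edges of G, so it belongs to the set A of Definition 26.
n = 5
comp_edges = [(0, 2), (0, 3), (1, 3), (1, 4), (2, 4)]
A = np.zeros((n, n))
for i, j in comp_edges:
    A[i, j] = A[j, i] = 1.0

lam = np.sort(np.linalg.eigvalsh(A))[::-1]  # lam[0] >= ... >= lam[n-1]
bound = 1.0 - lam[0] / lam[-1]              # lower bound on theta(C5)
```

Here `bound` evaluates to √5 ≈ 2.236, so for this graph the bound from one matrix is already tight.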
4.2.4 Wireless Networks
We consider a wireless network with n transmitter-receiver devices in fixed
arbitrary positions. We are not interested in all the technical details of the
model, except for the fact that every device receives or transmits information over a fixed radio frequency, or channel. For simplicity, we assume
that all devices have the same radius of transmission. This model can be
represented as a graph G(V, E), where V = {0, 1, . . . , n − 1} and {i, j} ∈ E
if and only if device i is within the transmission radius of device j (and
vice versa of course). If we place n devices randomly on the plane and
choose the edges as described, what we have is a geometric random graph
(Figure 4.4). Every clique in the graph represents a set of devices that can
intercommunicate. That means that every transmission by a device within the clique will be detected by any other device in the clique that receives on the same channel.
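The construction just described can be sketched in a few lines of Python (our own minimal illustration; the function name and interface are assumptions, and this is not the generation script used for Figure 4.4):

```python
import random, itertools

def geometric_random_graph(n, r, seed=None):
    """Place n nodes uniformly at random in the unit square and connect
    every pair at Euclidean distance at most r."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    E = [(i, j) for i, j in itertools.combinations(range(n), 2)
         if (pos[i][0] - pos[j][0])**2 + (pos[i][1] - pos[j][1])**2 <= r * r]
    return pos, E
```

With r = 0 the graph is always empty, and with r ≥ √2 it is complete, matching the two extremes discussed in section 4.2.5 below.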
In most cases, this detection is undesired and considered interference. Then we must assign a different transmission channel to every device in the same clique, to avoid interference. Even in cases where this detection is desired, we may still have to assign channels in such a way that every device in the same clique transmits on a different channel.

Under the above constraints, the problem of finding the least number of channels to be used in a wireless communications network is exactly the problem of determining the chromatic number χ of the graph G that represents the network as described earlier. Finding the ϑ number of the complement of G will then give us a lower bound on χ in polynomial time, as well as an upper bound on ω, the maximum number of devices in the same transmission radius.
4.2.5 Statistical results
We were interested in investigating the distribution of the ϑ function over a random sample of graphs with roughly the same number of edges. We chose the geometric random graph model because it is a realistic model for wireless communications applications.

We took a sample of M geometric random graphs of n nodes and radius r. For every graph, the nodes were randomly placed in the unit square, and we varied r from 0 to 1 in steps of 0.1. That meant that when r = 0, no nodes would be connected, and the result would always be an empty graph. When r = 1, every pair of nodes was connected with an edge with some probability p, where π/4 ≤ p ≤ 1. To take one sample, we fixed n and r, generated a geometric random graph with those attributes, computed its ϑ and repeated the operation M times. We then plotted the result in a histogram. We varied r as described and thus generated 10 histograms, which we plotted on the same graph. The results for M = 500, n = 30 and M = 100, n = 80 are shown in Figure 4.5.
To interpret the result we take Theorem 8 into consideration. In the case of zero radius, the sample consists of M empty graphs. The complement of an empty graph is a complete graph. The χ and ω numbers of a complete graph are both equal to n, the number of nodes in the graph. This is simple to understand: in a complete graph, the whole graph is a clique, so the largest clique of the graph is the graph itself and the number of its nodes is ω = n. It is straightforward to see that χ = n as well. On the other hand, when r = 1 most of the graphs in the sample will be complete, or almost complete. That results in empty graph complements, which trivially have ω = χ = 1, or complements that have only one edge, or a few isolated edges, in which case ω = χ = 2. For all the cases in between, the best information we can acquire about ω and χ is the Lovász ϑ function, whose distribution we have illustrated in Figure 4.5.

Figure 4.5: Distribution of the ϑ function over 10 samples of geometric random graphs of 30 (left) and 80 (right) nodes on a logarithmic scale.
4.2.6 Further study
Beyond geometric random graphs and the immediate applications of that model, it is also very interesting to investigate the distribution of the ϑ function at a more theoretical level. For that purpose we will use a different random graph model, the Erdős-Rényi random graph, G(n, p) for short. As implied by the notation, in this model every edge appears in the graph with probability p. For example, to generate a G(n, 0.5), we would go through all possible pairs of the n nodes and each time connect them with an edge or not, depending on the outcome of a coin toss.
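This generation procedure is essentially a one-liner (our own sketch; the function name is an assumption):

```python
import random, itertools

def gnp(n, p, seed=None):
    """Erdos-Renyi G(n,p): keep each of the n(n-1)/2 possible edges
    independently with probability p."""
    rng = random.Random(seed)
    return [e for e in itertools.combinations(range(n), 2) if rng.random() < p]
```

For p = 0 the graph is empty and for p = 1 it is complete, with the expected edge count p·n(n−1)/2 in between.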
We computed the distributions of G(n = 50, 0 ≤ p ≤ 1) and plotted them as shown in Figure 4.6. First of all we must stress that although ω and χ take integer values, there is nothing in the definition to imply that for ϑ, except for perfect graphs. However, as shown in Figure 4.6, the distribution of ϑ is strongly concentrated on integer values, with non-integer values occurring less often but almost certainly. For example, we know [Lovász, 1979] that for cycle graphs with m nodes and m odd, there is a closed formula. A cycle graph, intuitively, is a graph that looks like a closed chain. More formally, if Cm = G(V, E), where V = {0, . . . , m − 1} is
Figure 4.6: The distribution of ϑ over 1000 samples of 100 G(n, p) graphs
each of n = 50 and 0 6 p 6 1. The horizontal axis is p and the vertical
is the logarithm of ϑ(G). For every observed point a dot is drawn on the
picture. At first, the dot is blue. Whenever a dot has to be drawn on top of
an existing dot, the colour of the dot moves up the spectrum and closer to
red. Thus, blue areas in the graph indicate the lowest frequency and red
indicate the highest frequency.[Data and .png picture acquired in collaboration
with Dr. Keith M. Briggs]
a cycle graph, then E = {{0, 1}, {1, 2}, . . . , {m − 2, m − 1}, {0, m − 1}}. For these graphs then, if m is odd,

ϑ(C_m) = m cos(π/m) / (1 + cos(π/m)).        (4.6)

One hypothesis then would be that one case where non-integer ϑ occurs is when the graph contains an m-cycle with m odd.
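A quick numerical check of (4.6) (our own illustration, not part of the original text): for m = 5 the formula gives ϑ(C₅) = √5 ≈ 2.236, one of the non-integer values observed in Figure 4.6, while for m = 3 it gives 1, as it must, since C₃ is complete.

```python
from math import cos, pi, sqrt

def theta_cycle(m):
    """Lovasz theta of the odd cycle C_m by the closed formula (4.6)."""
    return m * cos(pi / m) / (1.0 + cos(pi / m))
```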
We could make more very interesting hypotheses. For instance, if one looks at the ϑ of small graphs in Figures 4.16 and 4.17, it seems that whenever a graph consists of smaller isolated subgraphs, the ϑ of the graph is equal to the sum of the ϑ of its isolated subgraphs. The investigation of these conjectures would make a very interesting research subject on its own and would quite likely contribute to an even more efficient way of computing the ϑ function.
Figure 4.7: A very simple truss.
4.3 Optimal truss design
This example was inspired by "Truss Design and Convex Optimization" [Freund, 2004]. A truss is a structure mathematically similar to a graph. To be precise, the set of trusses could be considered a subset of the set of graphs. Bridges are trusses, and so are cranes, the roofs of big structures like train stations and football fields, and the Eiffel Tower. Since they are real constructions and not theoretical objects, they are either 2-dimensional or 3-dimensional. In this section we consider only 2-dimensional (rectangular) trusses. These examples are applicable mostly to the design of bridges, but other applications can be found as well.
4.3.1 Physical laws of the system
A truss, as a graph object, consists of nodes and edges. The positions of the nodes are significant, as are the dimensions of the edges. We will refer to the edges of a truss as bars, as this is consistent with the structural character of the model. We will enumerate the nodes and the bars and denote the length of bar k by Lk. Another important feature of the truss is the material we will use to construct it. To keep our model simpler, and maybe more realistic, we suppose that all bars are made of the same material. The material the bars are made of determines the stiffness of the model. A measure of stiffness is Young's modulus, denoted E or Y and measured in units of pressure (pascal). Here is Young's modulus for different materials, in GPa (gigapascal) [data taken from Wikipedia [2008c]]:
Rubber      0.01-0.1
Oak wood    11
Bronze      103-124
Titanium    105-120
Steel       190
Most nodes of the truss will be free to move, but naturally some will be fixed in place, or static. When an external force F is applied on a free node, the truss deforms elastically and some free nodes displace. That causes the bars to stretch or compress, which results in internal forces fk from the bars being applied on the nodes they connect. These forces counteract the external force. The equilibrium condition can be written as

Af = −F,        (4.7)

where f is the vector of all internal forces fk and A is the projection matrix. The kth column of A corresponds to bar k. The rows of A correspond to the horizontal or vertical axis, taken once for every free node. Instead of enumerating the rows of A it is easier to label them. So if our truss has 5 nodes and nodes 1, 3 are static, then the rows of A will be labeled {0x, 0y, 2x, 2y, 4x, 4y}. Now the (jx, i) element of A will be the projection of bar i onto the horizontal axis, with j taken as the origin, if j is one of the nodes that bar i connects; otherwise it will be zero. Likewise for the (jy, i) element. If we denote by u the vector of displacements of the free nodes, projected on the x and y axes, and by t the vector of the volumes of the bars, then the internal force fk on bar k is given by

f_k = −(E/L_k²) t_k (a_k)ᵀ u,        (4.8)

where a_k is the kth column of A. We note that −(a_k)ᵀu is the first-order Taylor approximation of the distortion of bar k, which is acceptable for such small displacements of the nodes. The elastic energy stored in the truss when under stress, which is the same as the work performed by the external force F that causes the stress on the truss, is given by

(1/2) Fᵀu.

This quantity is also called the compliance of the truss.
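As a sanity check of (4.7) and (4.8), consider a toy example (ours, not from the original text; all numbers are arbitrary): a single bar of length L loaded along its own axis, so the projection matrix reduces to the scalar a = 1.

```python
E, L, t, F = 190e9, 2.0, 1e-3, 1e4  # steel bar: modulus, length, volume, load

a = 1.0                        # the 1x1 projection "matrix" of a single bar
u = F * L**2 / (E * t * a**2)  # displacement solving t*(E/L**2)*a*a*u = F
f = -(E / L**2) * t * a * u    # internal force from (4.8)

# equilibrium (4.7): A f = -F, here simply a*f = -F
assert abs(a * f + F) < 1e-6
compliance = 0.5 * F * u       # elastic energy stored in the bar
```

The internal force exactly cancels the external load, and the compliance grows with L and shrinks with E and t, as expected of a stiffer bar.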
4.3.2 The truss design optimization problem
Our objective is to construct the truss to be as stiff as possible, in other words to make it as resistant to external forces as possible. Our problem then will be to find which vector of volumes t minimizes the compliance. We also have to impose an upper bound on the total volume of bars, since otherwise the solution would be infinite volume for all bars. Such a constraint expresses the fact that there are either finite resources to construct the truss, or that the cost of the construction must not exceed a certain amount. We can now write the problem as:
minimize    (1/2) Fᵀu
subject to  ∑_{k=1}^{m} t_k (E_k/L_k²) (a_k)(a_k)ᵀ u = F
            Mt ⪯ d                                            (4.9)
            t ⪰ 0.
The first constraint is a result of the equilibrium conditions (4.7) and (4.8). The second is a linear inequality constraint on the volumes of the bars, which can express the upper bound on the total volume as well as upper bounds on specific bars. The last constraint only dictates that volumes are nonnegative quantities. This is equivalent (proof in Freund [2004], pages 37-40) to the following:
minimize    θ
subject to  [ ∑_{k=1}^{m} t_k (E_k/L_k²) (a_k)(a_k)ᵀ   F ]
            [ Fᵀ                                       θ ]  ⪰ 0        (4.10)
            Mt ⪯ d
            t ⪰ 0,
which is an SDP. There is also an SOCP equivalent of the problem, but we prefer the SDP for the reasons described at the end of Chapter 3. To better illustrate the SDP form of the problem above and to be consistent with our definitions, we rewrite the problem as

minimize    cᵀx
subject to  diag(Mt − d, −t, −Q) ⪯ 0,        (4.11)

where

Q = [ ∑_{k=1}^{m} t_k (E_k/L_k²) (a_k)(a_k)ᵀ   F ]
    [ Fᵀ                                       θ ] ,

c = [1, 0, . . . , 0]ᵀ, x = [θ, t_1, . . . , t_m]ᵀ and diag(A₀, . . . , A_s) is the block diagonal matrix with blocks A₀, . . . , A_s.
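The equivalence of (4.9) and (4.10) rests on a Schur-complement fact: for a positive definite block K, the matrix [[K, F], [Fᵀ, θ]] is positive semidefinite exactly when θ ≥ FᵀK⁻¹F, that is, when θ is an upper bound on twice the compliance (since u = K⁻¹F). A small numerical check (our own illustration with arbitrary data, assuming numpy):

```python
import numpy as np

def block_psd(K, F, theta):
    """Test whether [[K, F], [F^T, theta]] is positive semidefinite."""
    M = np.block([[K, F[:, None]], [F[None, :], np.array([[theta]])]])
    return bool(np.all(np.linalg.eigvalsh(M) >= -1e-9))

K = np.array([[4.0, 1.0], [1.0, 3.0]])  # a stiffness-like SPD matrix
F = np.array([1.0, 2.0])
twice_compliance = F @ np.linalg.solve(K, F)  # F^T K^-1 F
```

Values of θ just above `twice_compliance` make the block matrix PSD, while values below it do not, which is why minimizing θ in (4.10) recovers the optimal compliance of (4.9).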
4.3.3 Results
We have developed a Python module (given in the Appendix) for CVXOPT to solve this problem. Some interesting results are shown in Figures 4.9-4.14. Of course our results are weak, in the sense that we do not consider variables like construction difficulties, nor do we have the actual data to know what material would be used, or whether we should take other factors into account such as seismic resistance, not to mention that the problems we solved were planar. The models below show the solutions to the problem from a mathematical point of view. However, such a theoretical model can prove useful, as it can produce ideal designs that may not be constructible, but into which additional constraints and realistic parameters can be put, resulting in constructible solutions that are as close to the theoretical optimum as possible.
The simplest figure (4.9) resembles the king post design. The design was used in Medieval, Queen Anne and Gothic Revival architecture as roof support and has also been used in early aviation on wire-braced aircraft.

In Figure 4.13, we have the model for the Tyne bridge in Newcastle-upon-Tyne. The bridge is supported on the ground, but the road is higher, close to the middle of the bridge. The actual bridge has a double arch on top, whereas our model suggests that the optimal design should have three or four arches. While experimenting with truss examples, the same pattern seemed to arise every time. Instead of the triangulation we are used to seeing in most trusses, like the Eiffel Tower or cranes (Figure 4.11), the model tends to add extra thin layers next to layers that sustain most of the stress. A characteristic example is displayed in Figure 4.14.
Another interesting pattern is the shape of the arch that seems to occur naturally in the design of bridges. We took the bridge model shown in Figure 4.14 and tried to fit the points on the main arch to different models. The data seemed to fit almost perfectly to the catenary, a curve of the form α + β cosh γx that represents the shape of a hanging chain, or rope, supported at its ends.
Figure 4.8: The legend for the figures below (left: in color, right: in grayscale). Although line thickness is proportional to the actual thickness of
the bars, color labeling is also used for the scaling volumes. A shade closer
to red (black in gray-scale), means larger volume than a shade closer to
blue (white). These figures are zoomable pdf pictures. In quadruple zoom
level or above, one can see the actual volumes on the bars (in arbitrary
units). Nodes are drawn as small gray disks. Bigger black disks denote
static nodes. Forces are drawn with black arrows.
Figure 4.9: The king post design. Photo taken from Wikipedia [2008a]
Figure 4.10: Optimal hanging sign designs.
Figure 4.11: Optimal design for a simple crane.
Figure 4.12: Optimal design for a train bridge.
Figure 4.13: The Tyne bridge model. Photo taken from Wikipedia [2008b]
Figure 4.14: Optimal design for a road bridge.
Figure 4.15: Fitting of the arch of the model in Figure 4.14. The blue line represents the arch and the red line is the graph of f(x) = α − β cosh(γx). For this drawing the parameter values α = 4.2, β = 4.1, γ = 0.15 were used. However, in the fitting process the standard errors came out around 1, 1 and 0.01 respectively, which means that the data could be fitted by a wide range of catenary curves.
4.4 Proposed research

4.4.1 Power allocation in wireless networks
We consider a problem taken from the field of wireless telecommunications [Boyd and Vandenberghe, 2004, page 196]. Consider n pairs of transmitters (T^x) and receivers (R^x), located in different positions in a closed area. To be realistic, we must assume that every transmitter is closer to its paired receiver than to any other receiver. Every transmitter T_i^x is supplied with power p_i. The signal power at R_j^x from T_i^x will be proportional to p_i and inversely proportional to some power of the distance between R_j^x and T_i^x. Of course at R_i^x, the only desirable signal is the one from T_i^x. The sum of the signal powers from all T_j^x, j ≠ i, is considered interference.

Let G ∈ R^{n×n} be the matrix of all path gains from every transmitter to every receiver. In short, that means that G_{ij} p_i will be the signal power from T_i^x at R_j^x. Let C be the matrix whose non-zero elements are on the diagonal and equal to the diagonal elements of G, and let R = G − C. Obviously, R differs from G only on the diagonal, the elements of which are all zeros for R. Now let S = Cp and I = Rp, where p = [p_0, . . . , p_{n−1}]ᵀ and S, I ∈ Rⁿ. Then the (desired) signal power at R_i^x will be S_i and the interference at R_i^x will be I_i.
The objective here will be to have the best possible signal power with the least possible interference at every receiver. Since we take the positions of the devices to be fixed, we can only vary p, which will be the optimization variable. If we also add noise σ to our model, the objective will be to maximize

min_{0≤i<n} S_i/(I_i + σ_i),

which is the minimum signal to interference plus noise ratio (SINR) over all receivers. There will be upper bounds p_max ∈ Rⁿ on the maximum power to be allocated to every transmitter (either due to legal restrictions or the electrical capacity of the device), and also upper bounds p_R ∈ Rⁿ on the total received power at every receiver. Now we can write the problem as

minimize    max_{0≤i<n} −S_i/(I_i + σ_i)
subject to  p ⪯ p_max                            (4.12)
            Gp ⪯ p_R,
where we have used the obvious equivalence of minimizing the maximum of −SINR to maximizing the minimum of SINR for the sake of consistency. Both inequalities are componentwise, as already discussed in section 2.4.
This problem is an example of a generalised LFP and as such it is quasiconvex. We have tried solving it as described in section 3.1, but the solvers we used seem to fail. We also tried approaches other than our Python tools, but we were unable to find a complete package designed to cope with this kind of problem. The difficulty is that the objective is not a smooth function. Discontinuities in the gradient make the solvers we used produce irrelevant results, because they rely on gradient methods which, as it seems, require a certain degree of smoothness.
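One way around the non-smooth objective, in the spirit of the bisection method of section 3.1, is to note that for a fixed SINR target γ the constraints S_i ≥ γ(I_i + σ_i) are linear in p, so feasibility can be checked and γ bisected. The sketch below is our own illustration and not part of the text: the function names, the power-control fixed-point feasibility test and the bisection bracket are all assumptions (it also omits the Gp ⪯ p_R constraint for brevity).

```python
import numpy as np

def sinr_feasible(gamma, G, sigma, p_max, iters=1000):
    """Is min-SINR >= gamma achievable with 0 <= p <= p_max?

    Uses the classical power-control iteration p <- gamma*C^-1*(R p + sigma),
    started from p = 0; the iterates increase towards the minimal feasible
    power vector, so the target is infeasible once they leave the box.
    """
    C = np.diag(np.diag(G))
    R = G - C
    p = np.zeros(len(sigma))
    for _ in range(iters):
        p_new = gamma * np.linalg.solve(C, R @ p + sigma)
        if np.any(p_new > p_max):
            return False
        if np.allclose(p_new, p):
            return True
        p = p_new
    return True

def max_min_sinr(G, sigma, p_max, hi=1e3, tol=1e-6):
    """Bisection on gamma for the quasiconvex problem (4.12)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sinr_feasible(mid, G, sigma, p_max):
            lo = mid
        else:
            hi = mid
    return lo
```

For a symmetric two-user instance with unit gains, cross gain 0.1 and noise 0.1, the maximum achievable minimum SINR at p = p_max = (1, 1) is 1/(0.1 + 0.1) = 5, which the bisection recovers.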
Appendix
truss_optimize_05.py
A truss optimal design module for CVXOPT.
#!/usr/bin/python
# optimize a truss design using semi-definite programming
# Costas Skarakis 2008-07-23
# theory from "Truss Design and Convex Optimization" by Robert M. Freund
from sys import argv,exit
from math import pi,cos,sin,hypot,atan2
from time import time
from array import array
import random
import cPickle
try:
    from cvxopt.base import matrix,spmatrix,spdiag,sparse
    from cvxopt.solvers import sdp
except:
    print 'cvxopt not found - please install from http://abel.ee.ucla.edu/cvxopt'
    exit(1)
try:
    from reportlab.pdfgen import canvas
    from reportlab.lib.colors import gray,black
    got_reportlab=True
except:
    print 'ReportLab not found - please install from http://www.reportlab.org/'
    got_reportlab=False
try:
    from hue import h2tuple
except:
    print 'No module hue - using grayscale'
    def h2tuple(h):
        return (h,)*3
pictscale=33
costpervolunit=1.0
class Graph:
    'A graph class'
    def __init__(s,E=[],n=1):
        s.d=None
        E=flat(E)
        assert False not in [type(i)==int for i in E]
        assert len(E)%2==0
        assert True not in [E[i]==E[i+1] for i in range(len(E))[::2]]
        if not E:
            s.V=range(n)
        else:
            s.V=range(max(E+[0])+1)
        s.E=(array('i',list(E[::2])),array('i',list(E[1::2])))
        s.m=len(s.E[0])
        s.n=len(s.V)
    def __neg__(s): # complement(graph)
        if not s.d:
            s.d=dict([(i,[]) for i in s.V])
            for i,j in zip(s.E[0],s.E[1]):
                s.d[i].append(j)
                s.d[j].append(i)
        c=Graph(n=s.n)
        l=(s.n*(s.n-1))/2-s.m
        a0=array('i',[0]*l)
        a1=array('i',[0]*l)
        count=0
        for i in s.V:
            for j in s.V[i+1:]: # i<j<n
                if not j in s.d[i]:
                    a0[count]=j
                    a1[count]=i
                    count+=1
        c.E=(a0,a1)
        c.m=l
        c.n=s.n
        c.d=None
        return c
    def __str__(s):
        r=''
        if s.V==[0]: return r
        for i in zip(s.E[0],s.E[1]):
            r+=repr(i[0])+'--'+repr(i[1])+' '
        for i in s.V:
            if i not in s.E[0] and i not in s.E[1]: r+=repr(i)+' '
        r+='\n'
        return r[:-1]
    def __repr__(s):
        r='Graph(['
        for k,l in zip(s.E[0],s.E[1]):
            r+='('+repr(k)+','+repr(l)+')'+','
        if r[-1]==',':
            r=r[:-1]
        r+='])\n'
        return r[:-1]
    def __add__(s,o):
        o=flat(o)
        assert False not in [type(i)==int for i in o]
        assert len(o)%2==0
        assert True not in [o[i]==o[i+1] for i in range(len(o))[::2]]
        s.m+=len(o)/2
        d=len(o)/2
        c=0
        for a in o:
            if a not in s.V: c+=1
        s.V+=[0]*c
        c=0
        for a in o:
            if a not in s.V:
                c+=1
                s.V[s.n+c-1]=a
        s.V.sort()
        s.n=len(s.V)
        for r in o[::2]:
            s.E[0].append(r)
        for r in o[1::2]:
            s.E[1].append(r)
        return s
    def __iter__(s):
        return iter(zip(s.E[0],s.E[1]))
    def __len__(s):
        return len(s.E[0])
    def add_node(s,b):
        s.n+=1
        s.V=range(s.n)
        s.d=None
        # b is redundant
    def add_nodes_from(s,nbunch):
        b=flat(nbunch)
        s.n+=len(b)-1
        s.V=range(s.n)
        s.d=None
    def add_edges_from(s,ebunch):
        b=flat(ebunch)
        s.__add__(b)
    def add_path(s,nlist):
        k=len(nlist)-1
        l=[(-1,-1)]*k
        for i in range(k): l[i]=(nlist[i],nlist[i+1])
        l.sort()
        s.add_edges_from(l)
    def delete_node(s,n):
        if n in s.V:
            s.n=s.n-1
            s.V=range(s.n)
    def order(s):
        return s.n
    def size(s):
        return s.m
    def edges(s):
        l=[(s.E[0][i],s.E[1][i]) for i in range(s.m)]
        return l
    def nodes(s):
        return s.V
### end of class Graph ###
class Truss(Graph):
    'A truss class'
    def __init__(s,youngs_modulus=2e6,Fd={},lnth=0,hght=0,dist=1,st=[],maxuv=0.8):
        Graph.__init__(s)
        s.dist=dist
        s.bunch=[] # bunch of bars with volumes with upper bounds
        s.bd=[]    # the upper bounds
        if lnth and hght: # make rectangular truss
            if not dist<=max(lnth,hght):
                print '\nWarning: dist too large.\n'
            def barlength(x,y):
                return hypot(x/hght-y/hght,x%hght-y%hght)
            def allow(x,y):
                xj,yj=x%hght,y%hght
                xi,yi=x/hght,y/hght
                a=(xj!=yj)
                b=(xi!=yi)
                c=((xj-yj)!=(xi-yi)) and ((xj-yj)!=(yi-xi))
                d=(min(lnth,hght)>2) or c
                e=(barlength(x,y)<=hypot(1,dist))
                f=abs(barlength(x,y)-hypot(1,1))<1e-14
                return a and b and c and e or f
            h=hght+1
            l=lnth+1
            s.st=st
            s.lnth=lnth
            s.hght=hght
            s.add_nodes_from(range(lnth*hght))
            nd=s.nodes()
            nd.sort()
            s.fr=[i for i in nd if i not in s.st]
            s.fr.sort()
            for k in range(lnth): s.add_path(range(hght*k,hght*(k+1)))
            for k in range(hght): s.add_path(nd[k::hght])
            s.add_edges_from([(x,y) for x in nd for y in nd[x:] if allow(x,y)])
            s.ls=s.edges()[:]
            s.ls.sort()
            s.L=dict([((i,j),barlength(i,j)) for i,j in s.ls])
            s.A=dict([(e,dict([((i,t),0.0) for i in s.fr for t in ('x','y')])) for e in s.ls])
            for i,j in s.A: # create the projection matrix
                sintheta=(j%hght-i%hght)/s.L[(i,j)]
                costheta=(j/hght-i/hght)/s.L[(i,j)]
                if i in s.fr:
                    s.A[(i,j)][i,'y']=sintheta
                    s.A[(i,j)][i,'x']=costheta
                if j in s.fr:
                    s.A[(i,j)][j,'y']=-sintheta
                    s.A[(i,j)][j,'x']=-costheta
            keys,keysdeep=s.A.keys(),s.A[(0,1)].keys()
            keys.sort()
            keysdeep.sort()
            s.A=[[s.A[e][i] for i in keysdeep] for e in keys]
        else:
            s.lnth=lnth
            s.hght=hght
            s.st=st # the static nodes of the truss
            s.fr=[] # the free nodes of the truss
            s.ls=[] # the list of bars lexicographically sorted
            s.A=matrix([])
            s.L={}
        s.Y=[youngs_modulus]*s.size() # stiffness of the bars
        s.maxunitvol=maxuv # maximum volume of a bar
        s.cost=maxuv*costpervolunit*s.size() # maximum cost
        s.Fd=dict([((i,t),0.0) for i in s.fr for t in ('x','y')])
        for i in Fd:
            s.Fd[i]=Fd[i]
        s.optd=[maxuv]*s.size() # optimal design for bar volumes
    def __repr__(s):
        r='Truss(youngs_modulus='
        if s.Y:
            r+=repr(s.Y[0])
        else:
            r+='2e6'
        r+=',Fd='+repr(s.Fd)+',lnth='+repr(s.lnth)+',hght='+repr(s.hght)+',dist='
        r+=repr(s.dist)+',st='+repr(s.st)+',maxuv='+repr(s.maxunitvol)+')\n'
        return r[:-1]
    def add_forces(s,Fd):
        for i in Fd:
            s.Fd[i]=Fd[i]
    def bars(s):
        l=s.edges()
        return l
    def add_overall_cost_constraint(s,cost):
        s.cost=cost
    def add_edge_constraints(s,ebunch,upbd):
        if not len(ebunch)==len(upbd):
            print 'Constraint vectors of different sizes ignored.'
        s.bunch+=ebunch
        s.bd+=upbd
    def clear_constraints(s):
        s.bunch=[]
        s.bd=[]
        s.cost=s.maxunitvol*costpervolunit*s.size()
    def solve(s):
        s.optd=minimum_compliance_sdp(s,s.Y,s.Fd,s.bunch,s.bd,s.cost)
    def draw(s,pdf_fn='truss.pdf'):
        bridgeview(s,s.optd,s.Fd,pdf_fn=pdf_fn)
### end of class Truss ###
class Easytruss(Truss):
    def __init__(s,youngs_modulus=2e6,Fd={},lnth=0,hght=0,dist=1,st=[],maxuv=0.8):
        s.cFd=Fd
        s.cst=st
        F,sta={},[]
        for k in st:
            if type(k)==tuple:
                sta.append(drooc(k[0],k[1],hght))
            else:
                sta.append(k)
        for k in Fd.keys():
            if type(k[0])==tuple:
                F[(drooc(k[0][0],k[0][1],hght),k[1])]=Fd[k]
            else:
                F[k]=Fd[k]
        Truss.__init__(s,youngs_modulus,F,lnth,hght,dist,sta,maxuv)
        if s.edges():
            s.cnodes=dict([(coord(a,s.hght),a) for a in s.nodes()])
            s.cedges=dict([((coord(a,s.hght),coord(b,s.hght)),(a,b)) for a,b in s.edges()])
    def add_forces(s,Fd):
        F={}
        for k in Fd.keys():
            if type(k[0])==tuple:
                F[(drooc(k[0][0],k[0][1],s.hght),k[1])]=Fd[k]
            else:
                F[k]=Fd[k]
        Truss.add_forces(s,F)
    def cbars(s):
        l=s.cedges
        return l
    def add_edge_constraints(s,cebunch,upbd):
        s.cbunch=cebunch
        ebunch=[(-1,-1)]*len(cebunch)
        for i in range(len(cebunch)):
            t=[-1,-1]
            for j in (0,1):
                if type(cebunch[i][j])==tuple:
                    t[j]=drooc(cebunch[i][j][0],cebunch[i][j][1],s.hght)
                else:
                    t[j]=cebunch[i][j]
            ebunch[i]=(t[0],t[1])
        Truss.add_edge_constraints(s,ebunch,upbd)
    def solve(s):
        s.optd=minimum_compliance_sdp(s,s.Y,s.Fd,s.bunch,s.bd,s.cost)
    def draw(s,pdf_fn='truss.pdf'):
        bridgeview(s,s.optd,s.Fd,pdf_fn=pdf_fn)
### end of class Easytruss ###
def flat(seq,first=1):
  ' Removes depth levels from a sequence '
  l=[]
  for t in seq:
    try:
      t[0]
      l+=flat(t,0)
    except:
      l.append(t)
  return l
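The flattening above relies on duck typing: anything that supports indexing is recursed into, anything else is kept as a leaf. A minimal, self-contained sketch of the same behaviour (the `first` parameter of the original is dropped here, since it is unused):

```python
# Minimal version of flat(): strips all nesting levels from a
# sequence of numbers, using indexability to detect sub-sequences.
def flat(seq):
    l = []
    for t in seq:
        try:
            t[0]          # sequences support indexing...
            l += flat(t)  # ...so recurse into them,
        except:           # ...while numbers do not.
            l.append(t)
    return l

# flat([1, [2, 3], [[4], 5]]) gives [1, 2, 3, 4, 5]
```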
def minimum_compliance_sdp(T,Y,F,bunch=[],bd=[],cost=0.8):
  '''
  Takes a truss object T, a material stiffness list Y, a force
  list F and solves the optimal truss design semidefinite
  problem. The maximum cost constraint is cost.
  Upper bounds on the volumes of certain bars listed in bunch
  can be added in bd.
  '''
  k=F.keys()
  k.sort()
  F=[F[i] for i in k]
  t0=time()
  m,n=T.size(),2*len(T.fr)
  print 'Number of variables: %g'%(m+1,)
  L=[T.L[i] for i in T.ls]
  pt,ro,co,pr,rw,cl=[],[],[],[],[],[]
  for i in range(2*len(T.fr)):
    for j in range(m):
      if T.A[j][i]:
        ro.append(i)
        co.append(j)
        pt.append(T.A[j][i])
  A=spmatrix(pt,ro,co)
  Q=[(Y[k]/L[k]**2)*(A[:,k]*A[:,k].trans()) for k in range(m)]
  c=matrix([1.0]+[0.0]*m)
  if bunch and len(bunch)==len(bd):
    pr=[1.0]*len(bunch)
    rw,cl=range(m+2,m+2+len(bunch)),[1+edgeno(T,i,j) for i,j in bunch]
  else:
    bunch,bd=[],[]
  Gl=spmatrix([-1.0]*m+[1.0]*m+pr,range(1,m+1)+[m+1]*m+rw,2*range(1,m+1)+cl,(m+2+len(rw),m+1))
  hl=matrix([0.0]+[0.0]*m+[cost]+bd)
  forth=spmatrix(-1.0,[0],[0],((n+1)*(n+1),1))
  fort=[-spdiag([0,q]) for q in Q]
  roc,cor=[0]*n,range(1,n+1)
  forF=-spmatrix(F+F,roc+cor,cor+roc)
  Gs=[sparse([[forth]]+[[f[::]] for f in fort])]
  hs=[matrix(forF)]
  print 'setup time: %f'%(time()-t0,)
  t0=time()
  try:
    sol=sdp(c=c,Gl=Gl,hl=hl,Gs=Gs,hs=hs,solver='dsdp')
  except:
    sol=sdp(c=c,Gl=Gl,hl=hl,Gs=Gs,hs=hs)
  print 'solve time: %f'%(time()-t0,)
  return sol['x'][1::]
def edgeno(B,i,j):
  '''
  Given a truss B and an edge (i,j), finds the number
  of the edge, or prints an error message.
  '''
  r,t=-1,-1
  for k in range(B.size()):
    if B.ls[k]==(i,j):
      t=k
      r+=1
  if t<0: print i,j,'not an edge'
  if r>0: print i,j,'multiple edge'
  return t
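Note that `edgeno` scans the whole edge list on every call, and `bridgeview` calls it once per bar, giving quadratic cost on large trusses. A hypothetical speed-up (not part of the code above) is to build the edge-to-index map once:

```python
# Hypothetical helper: map each edge to its position in the edge
# list once, instead of scanning the list on every lookup.
def make_edge_index(ls):
    return dict((e, k) for k, e in enumerate(ls))

# With a toy edge list, idx[(0, 2)] recovers the same index
# that edgeno's linear search would find:
ls = [(0, 1), (0, 2), (1, 2)]
idx = make_edge_index(ls)
```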
def bridgeview(B,v,Fd,eps=9e-4,save=True,filename='_truss_optimize_last.data',pdf_fn='truss.pdf'):
  '''
  Takes a bridge object and a vector of weights and creates
  a PDF picture of the bridge, in which the thickness of each
  line is proportional to the thickness of the bar it represents.
  The bars are coloured from blue to red, from the bar with the
  minimum volume up to the bar with the maximum volume.
  If save=True, the data from the last picture
  are saved in filename and the picture can be redrawn using
  drawlast(filename).
  '''
  if not got_reportlab: return
  psc=pictscale
  ht,ln=B.hght,B.lnth
  if save:
    f=open(filename,'w')
    cPickle.dump((B.bars(),v,Fd,B.L,ln,ht,B.st,B.fr),f)
    f.write('\n')
    f.close()
  u=v[:]
  v=[v[edgeno(B,i,j)]/B.L[(i,j)] for i,j in B.ls]
  re,gr,bl=h2tuple(0)
  re,gr,bl=h2tuple(1)
  c=canvas.Canvas(pdf_fn,(psc*ln,psc*ht))
  coo=[(i/ht,i%ht,j/ht,j%ht,v[edgeno(B,i,j)],u[edgeno(B,i,j)],i,j) for i,j in B.ls]
  m,M=min(v),max(v)
  mc,Mc=min(u),max(u)
  n,N=min([abs(Fd[i]) for i in Fd]),max([abs(Fd[i]) for i in Fd])
  scale=1.0/(M-m)
  if Mc==mc:
    scalec=1.0
  else:
    scalec=1.0/(Mc-mc)
  if N==n:
    elacs=1.0
  else:
    elacs=1.0/(N-n)
  def pict(c):
    c.setFont('Helvetica',10*psc/500.0)
    drawnode=[]
    for a,b,e,d,f,g,i,j in coo:
      if g<eps:
        continue
      drawnode+=[i,j]
      str=unicode('%.2f'%g)
      c.setLineWidth(30*scale*(f-m)*psc/500.0)
      re,gr,bl=h2tuple(1-scalec*(g-mc))
      c.setStrokeColorRGB(re,gr,bl)
      c.line(psc*a+(psc/2.0),psc*b+(psc/2.0),psc*e+(psc/2.0),psc*d+(psc/2.0))
      c.translate((psc/2.0)*(a+e)+(psc/2.0),(psc/2.0)*(b+d)+(psc/2.0))
      c.rotate(abs(atan2(a-e,b-d))*180.0/pi-90)
      c.drawString(0,psc/100.0,str)
      c.rotate(-abs(atan2(a-e,b-d))*180.0/pi+90)
      c.translate(-(psc/2.0)*(a+e)-(psc/2.0),-(psc/2.0)*(b+d)-(psc/2.0))
    c.setStrokeColorRGB(1,1,1)
    c.setLineWidth(2*psc/500.0)
    for i in B.st:
      c.circle(psc*(i/ht)+(psc/2.0),psc*(i%ht)+(psc/2.0),psc/12.0,1,1)
    c.setFillGray(0.7)
    for i in B.nodes():
      if i in drawnode:
        c.circle(psc*(i/ht)+(psc/2.0),psc*(i%ht)+(psc/2.0),psc/25.0,1,1)
    for n in B.fr:
      if not [Fd[(n,s)] for s in ('x','y')]==[0.0,0.0]:
        arrow(c,n,Fd[(n,'x')],Fd[(n,'y')])
  def arrow(c,x,xh,xv):
    c.setStrokeColor(black)
    c.setFillColor(black)
    h,v=elacs*(xh-n),elacs*(xv-n)
    c.setLineWidth(psc/100.0)
    sc=hypot(h,v)*psc/2.5
    i=x/ht
    j=x%ht
    c.translate(psc*i+(psc/2.0),psc*j+(psc/2.0))
    c.rotate(atan2(v,h)*180.0/pi)
    p=c.beginPath()
    p.moveTo(0,0)
    p.lineTo(sc,0)
    p.lineTo(sc-20*psc/500.0,20*psc/500.0)
    p.lineTo(sc-20*psc/500.0,-20*psc/500.0)
    p.lineTo(sc,0)
    c.drawPath(p,1,1)
    c.rotate(-atan2(v,h)*180.0/pi)
    c.translate(-psc*i-(psc/2.0),-psc*j-(psc/2.0))
  pict(c)
  c.showPage()
  c.save()
  print 'acroread %s'%pdf_fn
  print 'evince %s'%pdf_fn
def drawlast(filename='_truss_optimize_last.data'):
  ' Draws picture recovered from data file. '
  f=open(filename,'r')
  T=Truss()
  barlist,T.optd,T.Fd,T.L,T.lnth,T.hght,T.st,T.fr=cPickle.load(f)
  f.close()
  T.add_edges_from(barlist)
  barlist.sort()
  T.ls=barlist
  T.draw()
def coord(i,h):
  if (type(i),type(h))!=(int,int):
    print 'Taking integer part'
    i,h=int(i),int(h)
  return i/h,i%h
def drooc(x,y,h):
  if y>=h:
    print 'y must be less than the height'
    return -1
  slash=[i/h for i in range((x+1)*h+y*h)]
  modulus=[i%h for i in range((x+1)*h+y*h)]
  re=[i for i in range((x+1)*h+y*h) if slash[i]==x and modulus[i]==y]
  if len(re)!=1:
    print 'Error'
  return re[0]
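`coord` maps a node index i on a grid of height h to the pair (i/h, i%h), and `drooc` inverts it by exhaustive search. Since `coord` is just integer division and remainder, the inverse has the closed form x*h + y; a minimal sketch of the pair:

```python
# coord(i, h) = (i // h, i % h), so its inverse is simply x*h + y.
def coord(i, h):
    return i // h, i % h

def drooc(x, y, h):
    # valid only for 0 <= y < h, as in the original
    return x * h + y

# Round trip on a grid of height 5: coord(drooc(3, 2, 5), 5) == (3, 2)
```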
def example(no=0,fname='truss_example_'):
  numberofexamples=7
  if no>=numberofexamples:
    print 'List of examples available: ',range(numberofexamples)
  elif no==0: ## 3X2 ##
    st=[0,4]
    y=1e6
    Fd={}
    Fd[(3,'y')]=-1000.0
    T=Truss(youngs_modulus=y,Fd=Fd,lnth=3,hght=2,dist=3,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==1: ## 2X5 ##
    st=[0,4]
    y=1000.0
    Fd={}
    Fd[(2,'x')]=-100.0
    T=Truss(youngs_modulus=y,Fd=Fd,lnth=2,hght=5,dist=3,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==2: ## 5X2 ##
    st=[0,8]
    y=1000.0
    Fd={}
    Fd[(3,'y')]= 50.0
    Fd[(3,'x')]=-100.0
    Fd[(7,'y')]= 50.0
    Fd[(7,'x')]= 100.0
    T=Truss(youngs_modulus=y,Fd=Fd,lnth=5,hght=2,dist=3,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==3: ## 3X5 ##
    st=[0,10]
    y=1000.0
    Fd={}
    Fd[(4,'y')]=-1.0
    Fd[(4,'x')]= 1.3
    T=Truss(youngs_modulus=y,Fd=Fd,lnth=3,hght=5,dist=2,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==4: ## 9X4 ##
    st=[0,32]
    Fd={}
    for i in range(4,32,4): Fd[(i,'y')]=-1000.0
    T=Truss(Fd=Fd,lnth=9,hght=4,dist=3,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==5: ## 15X6 ##
    st=[12,18,66,72]
    Fd={}
    for i in range(90)[2::6]: Fd[(i,'y')]=-200.0
    T=Truss(Fd=Fd,lnth=15,hght=6,dist=4,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
  elif no==6: ## 30X10 ##
    print 'This may take a few minutes...'
    st=[20,30,260,270]
    Fd={}
    for i in range(300)[::10]:
      if i not in st: Fd[(i,'y')]=-1000.0
    T=Truss(Fd=Fd,lnth=30,hght=10,dist=4,st=st)
    T.solve()
    T.draw(pdf_fn=fname+'%i.pdf'%no)
if __name__=='__main__':
  if len(argv)==1:
    st=[(0,0),(2,0)]
    Fd={}
    Fd[((1,1),'y')]=-100
    B=Easytruss(Fd=Fd,lnth=3,hght=2,st=st)
    B.solve()
    B.draw()
  else:
    job=int(argv[1])
    example(job)
The Lovász number index of small graphs
[PDF pictures provided by Dr. Keith M. Briggs.]
[Drawings of graphs 1–207, each labelled with its ϑ number, omitted here.]
Figure 4.16: The ϑ number of all small graphs of up to 6 nodes.
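The non-integer entries in these figures come from graphs without a "nice" structure; for odd cycles they are exact, since Lovász (1979) gives the closed form ϑ(C_n) = n·cos(π/n)/(1+cos(π/n)) for odd n. In particular ϑ(C5) = √5 ≈ 2.236 and ϑ(C7) ≈ 3.3177, matching the values tabulated above. A quick check:

```python
# Lovász' closed form for the theta number of an odd cycle C_n:
#   theta(C_n) = n*cos(pi/n) / (1 + cos(pi/n))    (Lovász 1979)
from math import cos, pi

def theta_odd_cycle(n):
    return n * cos(pi / n) / (1.0 + cos(pi / n))

# theta(C_5) = sqrt(5) ~ 2.236 and theta(C_7) ~ 3.3177,
# the non-integer values appearing in the figures above.
```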
[Drawings of graphs 1–1044, each labelled with its ϑ number, omitted here.]
Figure 4.17: The ϑ number of all graphs of 7 nodes.
Bibliography
Steven J. Benson and Yinyu Ye. Algorithm 875: DSDP5: Software for semidefinite programming. ACM Transactions on Mathematical Software, 34(3), 2008. URL http://doi.acm.org/10.1145/1356052.1356057.
Steven J. Benson, Yinyu Ye, and Xiong Zhang. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM Journal on Optimization, 10(2):443–461, 2000.
Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
Robert M. Freund. Truss design and convex optimization. Technical report, Massachusetts Institute of Technology, April 2004. URL http://ocw.mit.edu/NR/rdonlyres/Sloan-School-of-Management/15-094JSpring-2004/2172F2C6-E4DB-45EB-AFD3-36C92E2AABB5/0/truss_design_art.pdf.
Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3. Technical report, Massachusetts Institute of Technology, 2003. URL http://www.fftw.org/fftw-paper-ieee.pdf.
M. Galassi, Jim Davies, and Brian Gough. GNU Scientific Library Reference Manual (2nd ed.). Technical Report ISBN 0954161734, 2003. URL http://www.gnu.org/software/gsl/.
Walter W. Garvin. Introduction to Linear Programming. McGraw-Hill Book Company, Inc., 1960.
Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 1985.
Donald E. Knuth. The sandwich theorem. The Electronic Journal of Combinatorics, 1(A1), 1994. URL http://www.emis.de/journals/EJC/Volume_1/volume1.html#A1.
L. Lovász. On the Shannon capacity of a graph. IEEE Transactions on Information Theory, 25(1):1–7, Jan 1979. ISSN 0018-9448.
Andrew Makhorin. Technical report, Department for Applied Informatics, Moscow Aviation Institute, Moscow, Russia, 2003. URL http://www.gnu.org/software/glpk/. The GLPK package is a part of the GNU project, released under the aegis of GNU. (C) Copyright 2000, 2001, 2002, 2003.
Lieven Vandenberghe, Stephen Boyd, and Shao-Po Wu. Determinant maximization with linear matrix inequality constraints. SIAM Journal on Matrix Analysis and Applications, 19(2):499–533, 1998. URL citeseer.ist.psu.edu/vandenberghe98determinant.html.
R. Clint Whaley and Antoine Petitet. Minimizing development and maintenance costs in supporting persistently optimized BLAS. Software: Practice and Experience, 35(2), 2005. URL http://math-atlas.sourceforge.net/.
Wikipedia. King post — Wikipedia, the free encyclopedia, 2008a. URL http://en.wikipedia.org/w/index.php?title=King_post&oldid=227514338. [Online; accessed 30-July-2008].
Wikipedia. Tyne bridge — Wikipedia, the free encyclopedia, 2008b. URL http://en.wikipedia.org/w/index.php?title=Tyne_Bridge&oldid=229811490. [Online; accessed 21-August-2008].
Wikipedia. Young's modulus — Wikipedia, the free encyclopedia, 2008c. URL http://en.wikipedia.org/w/index.php?title=Young%27s_modulus&oldid=232681599. [Online; accessed 21-August-2008].