Mathematical Methods of Operations Research manuscript No.
(will be inserted by the editor)
Improved box representations of Pareto sets and
application to bicriteria multicommodity network
flows
Horst W. Hamacher · Ky Vu
Received: date / Accepted: date
Abstract A successful way to deal with the, in general, prohibitively large
Pareto set of a given multicriteria problem is to find a good representative system of finitely many of these solutions. In this paper, several alternatives to the
area-based box algorithm of Hamacher et al. [8] for finding representative systems of a given bicriteria optimization problem are suggested. It is argued that
the distance approach, represented by the perimeter of the largest representing
box, is better suited for practical applications. The resulting peripheral box
algorithm is analyzed with respect to its worst-case complexity. Using branch-and-price methods, it is tested on instances of the bicriteria multicommodity
network flow problem. Compared with the original area-based algorithm it is
shown to be superior in its output quality and competitive with regard to
running time. A further refinement is proposed by using surrogate models in
which the Pareto curve is interpolated. The resulting (theoretical) speed-up
in computing a box representation with given accuracy indicates the potential
of this approach.
Keywords Representative system · Bicriteria optimization · Multicommodity
flow · Box algorithm
This research has been partially supported by the Federal Ministry of Education and Research Germany, grant DSS Evac Logistic, FKZ 13N12229, and by a travel grant, European
Union Seventh Framework Programme (FP7-PEOPLE-2009-IRSES), grant agreement No. 246647,
and by the New Zealand Government as part of the OptALI project.
Horst W. Hamacher
Department of Mathematics, University of Kaiserslautern, Germany.
E-mail: [email protected]
Ky Vu
Laboratoire d’Informatique de l’Ecole Polytechnique (LIX), France.
E-mail: [email protected]
1 Introduction
The multicommodity flow problem is well-known in network optimization and
was extensively studied in the last decades. The problem arises when several
commodities, vehicles or messages share the same network. These commodities
must not only satisfy their own constraints but they also interact with each
other (see [1], [5]).
In practice, optimization models often have to take two or more conflicting
objectives into consideration. In these cases, we cannot, in general, find solutions that simultaneously optimize all the objectives. Instead, we look for
Pareto solutions, i.e., solutions with the property that none of the objectives
can be improved without worsening one of the others. Finding all such solutions
is the subject of multicriteria (or multiobjective) optimization (see [4] or [14]).
Subsequently, we assume that the reader is familiar with the basic concepts of
this subject.
In this paper, we combine these two classes of problems into one and study
the bicriteria multicommodity flow problem. This problem is interesting both
from a practical and theoretical point of view but it has not been well-studied
(see [12]). We focus on the integer case of the problem. Since the single objective multicommodity flow problem is NP-hard, this obviously also holds
for the bicriteria version of the problem. Moreover, it is well-known (see [7]) that
the problem is intractable (i.e., the number of Pareto solutions is exponentially large with respect to the input). Therefore, instead of finding the entire
Pareto nondominated set, we concentrate on finding a representative system,
i.e. a finite subset of the Pareto nondominated set satisfying provable quality
measures (see [8]). Based on our past positive experience (e.g. in applications
of the health and management sector, see [6, 9]) we think that this approach
is well-suited for all applications in which the bicriteria multicommodity flow
problem can serve as model.
After formally introducing the bicriteria multicommodity network flow problem in Section 2, we will in the subsequent sections make the following contributions:
– Develop a new version of the box algorithm for finding representative systems of bicriteria optimization problems with respect to the distance accuracy criterion (Section 3).
– Combine the ideas of surrogate modeling and representative systems to
produce several algorithms with competitive performance (Section 4).
– Apply the branch-and-price method to solve the problem in an efficient
way by combining the resulting box methods and exploiting the structure
of the bicriteria multicommodity flow problem (Section 5).
2 The problem
Let G = (N, A) be a directed graph where N is a set of n nodes and A is a
set of m arcs. Assume that there are K ≥ 2 commodities transported in this
network. For each 1 ≤ k ≤ K, let $b_i^k$ be the supply/demand of commodity k
at node i ∈ N and let $u_{ij}^k$ be the maximum flow of commodity k through arc
(i, j) ∈ A. Moreover, for each (i, j) ∈ A, there is a capacity $u_{ij}$ which limits
the total flow of all commodities moving through that arc.
Let $x_{ij}^k$ be the flow of commodity k through (i, j) ∈ A. We need to find these
flows in order to minimize the network cost, which is a 2-dimensional linear
function. Formally, the bicriteria multicommodity flow problem can be stated
as
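$$
\text{Minimize} \quad
\begin{pmatrix}
\sum_{1 \le k \le K} \sum_{(i,j) \in A} c_{ij}^k x_{ij}^k \\
\sum_{1 \le k \le K} \sum_{(i,j) \in A} d_{ij}^k x_{ij}^k
\end{pmatrix}
$$
subject to
$$
\begin{aligned}
\sum_{j:(i,j) \in A} x_{ij}^k - \sum_{j:(j,i) \in A} x_{ji}^k &= b_i^k && \forall\, i \in N,\ 1 \le k \le K, \\
0 \le x_{ij}^k &\le u_{ij}^k && \forall\, (i,j) \in A,\ 1 \le k \le K, \\
\sum_{1 \le k \le K} x_{ij}^k &\le u_{ij} && \forall\, (i,j) \in A.
\end{aligned}
$$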
In the special case of two commodities, a modified version of this problem has
been solved by Sedeno-Noda et al. in 2005 [13]. In their paper, negative flows
are allowed and the last constraint is replaced by
$$
\sum_{1 \le k \le K} |x_{ij}^k| \le u_{ij} \quad \text{for all } (i,j) \in A.
$$
The authors extended the idea of "changing variables" in the classical paper
of T. C. Hu from 1963 [10] to apply to the multiobjective problem. In this way,
they reduced the multicommodity problem to several single commodity subproblems. While this method is very interesting, it cannot be generalized to
the case of more than two commodities.
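The constraints and the two cost functions of the formulation above can be made concrete with a small checker. The following is only a minimal sketch: the function name `evaluate_flow`, the dictionary layout (keys `(k, i, j)` and `(k, i)`) and the toy instance are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict


def evaluate_flow(arcs, K, x, b, u_k, u, c, d):
    """Check feasibility of a multicommodity flow and return its two costs.

    arcs: list of (i, j); x, u_k, c, d: dicts keyed by (k, i, j);
    b: dict keyed by (k, i); u: dict keyed by (i, j).
    Returns (feasible, (cost1, cost2)).
    """
    nodes = {i for arc in arcs for i in arc}
    feasible = True
    for k in range(K):
        # Flow conservation: outflow minus inflow equals b_i^k at every node.
        balance = defaultdict(int)
        for (i, j) in arcs:
            balance[i] += x[k, i, j]
            balance[j] -= x[k, i, j]
        if any(balance[i] != b.get((k, i), 0) for i in nodes):
            feasible = False
        # Individual bounds 0 <= x_ij^k <= u_ij^k.
        if any(not 0 <= x[k, i, j] <= u_k[k, i, j] for (i, j) in arcs):
            feasible = False
    # Bundle constraint: the total flow on each arc is at most u_ij.
    for (i, j) in arcs:
        if sum(x[k, i, j] for k in range(K)) > u[i, j]:
            feasible = False
    cost1 = sum(c[k, i, j] * x[k, i, j] for k in range(K) for (i, j) in arcs)
    cost2 = sum(d[k, i, j] * x[k, i, j] for k in range(K) for (i, j) in arcs)
    return feasible, (cost1, cost2)


# Hypothetical toy instance: 3 nodes, 3 arcs, K = 2 commodities (indexed 0, 1).
arcs = [(1, 2), (1, 3), (3, 2)]
b = {(0, 1): 2, (0, 2): -2, (1, 1): 1, (1, 2): -1}
x = {(0, 1, 2): 2, (0, 1, 3): 0, (0, 3, 2): 0,
     (1, 1, 2): 0, (1, 1, 3): 1, (1, 3, 2): 1}
u_k = {key: 2 for key in x}
u = {arc: 2 for arc in arcs}
c = {key: 1 for key in x}
d = {(k, i, j): (3 if (i, j) == (1, 2) else 1) for (k, i, j) in x}

result = evaluate_flow(arcs, 2, x, b, u_k, u, c, d)
```

In this toy instance commodity 0 uses the direct arc and commodity 1 is routed via node 3, so the flow is feasible and its outcome vector is (4, 8).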
3 Box methods for finding representative systems
Geometrically, the Pareto nondominated set YN of a bicriteria minimization
problem looks like a curve (however in the integer case, it is obviously not
a curve). Therefore, its representative system must be a set which is good
enough to approximate this curve, while it is computationally cheaper to get.
A general definition and discussion of representative systems can be found in
the paper of Hamacher et al. 2005 [8]. In that paper, the authors proposed
a method called the box method for finding representative systems of a large
class of all discrete bicriteria optimization problems.
In the box method, the representative system Rep is a set of nondominated
points. These points are found by sequentially solving lexicographic ε−constraint
subproblems. Each point is then associated with a rectangle (or box) which represents all nondominated points within it and conversely, every such rectangle
will contain a nondominated point. In the original version of the box algorithm, the predetermined accuracy α of the representative system is reached,
if the largest area of all the resulting boxes is at most α.
We initialize the box algorithm with the starting box $R(z^1, z^2)$, which is defined by the two lexicographically optimal solutions $z^1$ and $z^2$. With Area or
more specifically $\text{Area}(R(z^1, z^2))$ we denote its area. Then we iteratively discard unnecessary parts of the box by solving the lexicographic ε−constraint
problems $P_\varepsilon$ with appropriate values ε. In this way, we generate a collection
of rectangles with decreasing area. The algorithm stops when the accuracy
criterion of the largest area among all boxes is met.
There are two versions of the box method: the a posteriori algorithm and the a
priori algorithm. The main difference between the two algorithms is the order
in which we choose lexicographic ε−constraint subproblems to solve. While
the latter algorithm pre-computes a priori a number of equidistant values for
ε and solves subproblems associated with them, the a posteriori algorithm only
decides the next subproblem after solving the previous one.
Details of the two box algorithms and the following theorems can be found in
Hamacher et al. [8].
Theorem 3.1
The a posteriori algorithm yields an α−representation of $Y_N$ in which all
representing points are non-dominated after performing at most O(Area/α)
many iterations.
Theorem 3.2
The a priori algorithm yields an α−representation of $Y_N$ after computing at
most $k = \lceil A/\alpha \rceil - 1$ solutions of lexicographic ε−constraint problems, where A is the area of the starting box.
The previous two box algorithms use the box area to measure the accuracy.
However, in certain applications, we often need other criteria to assess the
accuracy of the representation. The most common measure is defined by the
maximum distance between all non-dominated points and the representation.
Formally, we have the following definition:
Definition 3.3 Rep ⊆ $Y_N$ is a representation of $Y_N$ with distance accuracy
α if for any non-dominated point y ∈ $Y_N$, there is some z ∈ Rep such that
$\|y - z\| \le \alpha$.
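For a finite nondominated set that is known explicitly, the definition above can be checked directly. The sketch below assumes the Euclidean norm (the definition leaves the norm unspecified) and uses a hypothetical function name and hypothetical data.

```python
import math


def has_distance_accuracy(rep, nondominated, alpha):
    """True if every nondominated point is within alpha of some point of rep.

    Assumption: distances are measured in the Euclidean norm.
    """
    return all(
        min(math.dist(y, z) for z in rep) <= alpha
        for y in nondominated
    )


# Hypothetical nondominated set and a representation that omits (2, 7).
Y_N = [(0, 10), (2, 7), (5, 4), (9, 1)]
Rep = [(0, 10), (5, 4), (9, 1)]
```

Here the omitted point (2, 7) has distance √13 ≈ 3.61 to its nearest representative (0, 10), so `Rep` has distance accuracy 4 but not distance accuracy 3.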
If we use the box area to measure the accuracy as in the previous two algorithms, we might end up with a box which has one very small and one very
large side. Even if the box has small area, the representation might not be useful,
since there may be several nondominated points inside that box with a large
difference in one of the objective function values to any of the representing
points. In order to improve this, we will in this section modify the a posteriori
algorithm to produce a representation with fixed distance accuracy.
In the modified algorithm we maintain - as before - a set of boxes. Instead
of looking at the area of boxes, we will, however, in each iteration choose
and update a box with largest side (unless the stopping criterion holds). If
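the largest side of that box is horizontal, then we solve the lexicographic
subproblem $(P_\varepsilon^1)$, where
$$
(P_\varepsilon^1) \qquad \operatorname{lexmin} \begin{pmatrix} f_2(x) \\ f_1(x) \end{pmatrix} \quad \text{s.t. } f_1(x) \le \varepsilon \text{ and } x \in X.
$$
Otherwise, we solve the lexicographic subproblem $(P_\varepsilon^2)$ with
$$
(P_\varepsilon^2) \qquad \operatorname{lexmin} \begin{pmatrix} f_1(x) \\ f_2(x) \end{pmatrix} \quad \text{s.t. } f_2(x) \le \varepsilon \text{ and } x \in X.
$$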
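When the outcome vectors of all feasible solutions can be enumerated, the two subproblems reduce to a filtered lexicographic minimum. The following sketch assumes such an explicit list of (f1, f2) pairs, which is not how the paper solves the subproblems (it uses integer programming); the function name and the data are hypothetical.

```python
def solve_lex_eps(points, eps, constrained=1):
    """Solve P^1_eps (constrained=1) or P^2_eps (constrained=2) by enumeration.

    points: iterable of (f1, f2) outcome vectors of all feasible solutions.
    Returns the lexicographically minimal outcome, or None if no point
    satisfies the eps-constraint.
    """
    c = constrained - 1   # index of the objective bounded by eps
    o = 1 - c             # index of the objective that is minimized first
    candidates = [p for p in points if p[c] <= eps]
    if not candidates:
        return None
    # Tuples compare lexicographically: minimize the other objective first,
    # then break ties with the constrained objective.
    return min(candidates, key=lambda p: (p[o], p[c]))


# Hypothetical outcome vectors.
outcomes = [(1, 9), (2, 9), (3, 6), (4, 6), (5, 5), (7, 2), (8, 2)]
```

For these data, `solve_lex_eps(outcomes, 4, constrained=1)` picks (3, 6): the smallest f2 among points with f1 ≤ 4 is 6, and the tie between (3, 6) and (4, 6) is broken by f1.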
The algorithm stops if we have a set of boxes whose perimeters are all smaller
than 2α.
Definition 3.4 If this is the case, we say that the resulting representation Rep
has perimeter accuracy α.
We use the perimeter in this setting because if the perimeter of a box is small
enough, then we can control “the error” of the box using other measures. For
example, if a box has perimeter less than α, then both its diagonal and its
larger side are less than α/2. Moreover, the area of the box is also less than
α²/16: if a and b are the two sides of the box and p is its perimeter, then
0 ≤ (a − b)² implies ab ≤ (a + b)²/4 = p²/16 ≤ α²/16.
Hence any version of the box algorithm using perimeter accuracy as
stopping criterion will include the area stopping criterion as a special case and
will be more powerful.
However, in this set of boxes, there might be some box without any representative point. As was pointed out to us by Kuhn [11], this is the case when
we solve two horizontal and vertical subproblems in sequence. The naive idea
to overcome this problem is to solve an additional lexicographic ε-constraint
subproblem for each box without representative. However, the box accuracy
is always underestimated in previous steps, and if some box R(a, b) has no
representative, then the two points a and b are close to two nondominated
solutions which are already included in the representative system. Thus, we
can try to expand each box without representative to include either of the two
neighbor nondominated solutions. The two new boxes are likely to be small
(with perimeters less than 2α). Only if both of the two boxes have a perimeter
greater than 2α, we must compute an additional lexicographic ε-constraint
problem associated with the box.
Figure 1 illustrates the idea of the algorithm. In Step 1, the algorithm divides
the starting box into two boxes $S_1$ and $S_2$. Although the box $S_2$ contains many
nondominated points, it has a small area due to its very small vertical side
(so the algorithm would keep this box, and thus a bad representation, if the area accuracy
criterion were used). The horizontal lexicographic $ε_1$−constraint is applied to the box $S_1$
and the vertical lexicographic $ε_2$−constraint to the box $S_2$. The
algorithm continues with the resulting boxes in Step 2 until the perimeters of
all boxes in the list are small enough.
Fig. 1 Idea of the a posteriori algorithm with perimeter accuracy
The details of the algorithm are presented in Algorithm 1.
Theorem 3.5 The a posteriori algorithm with perimeter accuracy produces
a representation for $Y_N$ with perimeter accuracy α after computing at most
$$
O\left( \left( \frac{P}{2\alpha} \right)^{\log_{4/3}(2)} \right)
$$
many lexicographic ε-constraint problems, where P is the perimeter of the starting box $R(z^1, z^2)$.
Proof The number of lexicographic ε-constraint problems is bounded by the
number of iterations in the While loop and For loop.
Note that after each iteration in the While loop, the perimeters of the resulting boxes are at most 3/4 the perimeter of the original box. Therefore, the
algorithm terminates after a finite number of iterations.
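The factor 3/4 can be checked with a short calculation. The derivation below is a sketch under the assumption that each iteration halves the larger side of the chosen box (ε is the midpoint of that side) and does not enlarge the other side; the symbols a ≥ b (side lengths), p = 2(a + b) (perimeter of the chosen box) and p′ (perimeter of either resulting box) are introduced here for illustration only.

```latex
\begin{align*}
p' &\le 2\left(\tfrac{a}{2} + b\right) = a + 2b \\
   &= \tfrac{3}{2}(a + b) - \tfrac{1}{2}(a - b) \\
   &\le \tfrac{3}{2}(a + b) = \tfrac{3}{4}\,p
   \qquad \text{since } a \ge b .
\end{align*}
```

Equality holds for a square box (a = b), so the factor 3/4 cannot be improved under these assumptions.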
Algorithm 1: The a posteriori algorithm with perimeter accuracy
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation Rep ⊆ Y_N with perimeter accuracy α.
Initialization
  S := ∅; Rep := ∅; CheckBox := ∅;
  Compute the lexicographic minima z^1 and z^2 and the perimeter of R(z^1, z^2);
  Set Rep := {z^1, z^2}; S := {R(z^1, z^2)};
while S ≠ ∅ do
  Choose the box R(y^1, y^2) ∈ S such that its larger side is maximal;
  Remove R(y^1, y^2) from S;
  if the larger side of R(y^1, y^2) is horizontal then
    Solve P^1_ε with ε = ⌊(y^1_1 + y^2_1)/2⌋ and obtain optimal solution z* ∈ Y_N;
    p := (ε + 1, z*_2 − 1);
    Insert z* into Rep;
    if the perimeter of R(y^1, z*) > 2α then
      Insert R(y^1, z*) into S;
    else
      Insert R(y^1, z*) into CheckBox;
    end
    if the perimeter of R(p, y^2) > 2α then
      Insert R(p, y^2) into S;
    else
      Insert R(p, y^2) into CheckBox;
    end
  else
    Solve P^2_ε with ε = ⌊(y^1_2 + y^2_2)/2⌋ and obtain optimal solution z* ∈ Y_N;
    p := (z*_1 − 1, ε + 1);
    Insert z* into Rep;
    if the perimeter of R(z*, y^2) > 2α then
      Insert R(z*, y^2) into S;
    else
      Insert R(z*, y^2) into CheckBox;
    end
    if the perimeter of R(y^1, p) > 2α then
      Insert R(y^1, p) into S;
    else
      Insert R(y^1, p) into CheckBox;
    end
  end
end
Remove all boxes with at least one representative from CheckBox;
for R(u, v) ∈ CheckBox do
  Remove R(u, v) from CheckBox;
  Find the two neighbor nondominated solutions w^1, w^2 of u, v in Rep;
  if both perimeters of R(w^1, v) and R(u, w^2) are greater than 2α then
    Solve P^1_ε with ε = v_1 − 1 and obtain optimal solution z* ∈ Y_N;
    Insert z* into Rep;
  end
end
All the boxes in the set CheckBox have a perimeter less than or equal to 2α.
Since any non-dominated point z is contained in some box R(a, b) and the
distance between z and a, b is at most one half of the perimeter of R(a, b), the
distance accuracy of the representation follows.
After each iteration in the While loop, the number of boxes in S increases by
at most 1. We claim that, after $2^k - 1$ iterations, the algorithm produces no
more than $2^k$ boxes, each of which has a perimeter of at most $(3/4)^k P$.
We prove this claim by induction. Obviously, the claim holds for k = 0, 1.
Assume that the claim holds for all k ≤ s. We prove that the claim also holds
for k = s + 1.
After the first iteration, we have 2 boxes $R^1$, $R^2$ with perimeters at most
$\frac{3}{4} P$. Applying the induction hypothesis to each box $R^i$, we get: after $2^s - 1$ further iterations
on $R^i$, we have no more than $2^s$ boxes inside $R^i$, each of which has a perimeter of at most
$(3/4)^s (\frac{3}{4} P) = (3/4)^{s+1} P$. In total, after $1 + 2(2^s - 1) = 2^{s+1} - 1$ iterations, we have at most $2^{s+1}$ boxes with the above
property, i.e., the claim holds for k = s + 1.
From the claim, it follows that the While loop terminates if $(3/4)^k P \le 2\alpha$. It
means that the While loop terminates at the latest for $k^* = \lceil \log_{4/3}(\frac{P}{2\alpha}) \rceil$.
Therefore, the maximum number of iterations in the While loop is
$$
2^{k^*} - 1 = 2^{\lceil \log_{4/3}(P/(2\alpha)) \rceil} - 1,
$$
which is $O\big((\frac{P}{2\alpha})^{\log_{4/3}(2)}\big)$. The number of iterations in the For
loop is at most the cardinality of the set CheckBox, which means that the
total number of iterations in both loops is at most $O\big((\frac{P}{2\alpha})^{\log_{4/3}(2)}\big)$.
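The While loop of Algorithm 1 can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it assumes integer objective values and an explicitly enumerated list of outcome vectors (so the subproblems are solved by enumeration rather than by integer programming), it drops empty boxes, and it omits the CheckBox post-processing. All function names and the data are hypothetical.

```python
import math


def lex_eps(points, eps, c):
    """Minimize objective 1 - c, then objective c, subject to p[c] <= eps."""
    o = 1 - c
    return min((p for p in points if p[c] <= eps), key=lambda p: (p[o], p[c]))


def sides(box):
    """Horizontal and vertical sides of R(y1, y2); y1 upper left, y2 lower right."""
    y1, y2 = box
    return y2[0] - y1[0], y1[1] - y2[1]


def perimeter(box):
    w, h = sides(box)
    return 2 * (w + h)


def box_algorithm_perimeter(points, alpha):
    """While loop of Algorithm 1 on integer outcome vectors (no CheckBox step)."""
    z1 = min(points, key=lambda p: (p[0], p[1]))  # lexmin (f1, f2)
    z2 = min(points, key=lambda p: (p[1], p[0]))  # lexmin (f2, f1)
    rep, todo, done, solved = {z1, z2}, [(z1, z2)], [], 0
    while todo:
        box = max(todo, key=lambda bx: max(sides(bx)))  # maximal larger side
        todo.remove(box)
        y1, y2 = box
        w, h = sides(box)
        if w >= h:  # larger side horizontal: solve P^1_eps at the midpoint
            eps = (y1[0] + y2[0]) // 2
            z = lex_eps(points, eps, 0)
            children = [(y1, z), ((eps + 1, z[1] - 1), y2)]
        else:       # larger side vertical: solve P^2_eps at the midpoint
            eps = (y1[1] + y2[1]) // 2
            z = lex_eps(points, eps, 1)
            children = [(z, y2), (y1, (z[0] - 1, eps + 1))]
        solved += 1
        rep.add(z)
        for child in children:
            if min(sides(child)) < 0:
                continue  # empty box: it contains no nondominated point
            (todo if perimeter(child) > 2 * alpha else done).append(child)
    return sorted(rep), done, solved


def iteration_bound(P, alpha):
    """Worst-case number of While-loop iterations, 2^{k*} - 1, from the proof."""
    k_star = math.ceil(math.log(P / (2 * alpha), 4 / 3))
    return 2 ** k_star - 1


# Hypothetical outcome vectors (f1, f2); some of them are dominated.
points = [(0, 12), (1, 9), (2, 9), (3, 6), (4, 7), (6, 5), (8, 2), (9, 3), (12, 0)]
rep, boxes, solved = box_algorithm_perimeter(points, alpha=7)
```

On this toy instance the starting box has perimeter P = 48. With α = 7 the loop solves three subproblems, far fewer than the worst-case value `iteration_bound(48, 7)` of 31, and the nondominated point (1, 9), which is not a representative, lies in the final box R((0, 12), (2, 9)).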
4 A posteriori surrogate-based algorithms
In the preceding section we have shown that the accuracy measured by the
area of the boxes can be replaced by a distance measure while maintaining
a worst-case complexity statement for the running time of the a posteriori
algorithm. In this section we will suggest numerical improvements for the area
and perimeter based class of box algorithms, since both the a priori and the a
posteriori algorithms have some drawbacks.
The main drawback of the a priori algorithm is obvious: it underestimates the
accuracy of the resulting representative system. Usually, we have to solve too
many lexicographic ε-constraint subproblems to obtain a representation which
is unnecessarily better than required. The reason is that, if the representation
consists of k boxes, then the total area of these boxes is less than $\text{Area}/k$, where
Area is the area of the starting box. It means that the average area of
these boxes is less than $\text{Area}/k^2$. However, we have no information about these
boxes, so we can only conclude that the accuracy of the representation is $\text{Area}/k$,
which is much larger than $\text{Area}/k^2$.
On the other hand, for the a posteriori algorithm, we can only assure the
accuracy of representative systems after solving $2^k$ subproblems, where k is
integral. Therefore, it lacks the flexibility to choose the cardinality of the
representative system in advance. And in fact, in order to find a representative
system with specified accuracy, we often need to find the system with better
accuracy (e.g., smaller box area).
When we face the situation where lexicographic ε-constraint subproblems are
difficult and time-consuming to solve, it is necessary to improve both algorithms. Notice that in these algorithms, representative systems are found by
considering the images of the two objectives independent of each other. But
if we look at the Pareto nondominated set, we can consider one objective as
a function of the other objective. Therefore, if we know some nondominated
solutions, we can build a response curve (or a surrogate model) that interpolates the Pareto set. We can use this approximation to identify the next
ε−subproblem to solve.
The advantage of this method is that we can estimate the locations of the next
nondominated solutions, such that we are able to divide a given box into k
sub-boxes which are almost of equal size. In this way, we can use the estimate
$\text{Area}/k^2$ instead of $\text{Area}/k$ for the area of each box, and the number of lexicographic
ε−constraint subproblems that need to be solved will be reduced significantly.
Our algorithms require several starting nondominated solutions, which can be
found using a very crude approximation accuracy by one of the box algorithms
in the previous sections.
The first modification of the a posteriori algorithm is based on the following
lemma.
Lemma 4.1 Let z be an arbitrary point inside the box R(x, y) such that Area[R(x, z)] =
Area[R(z, y)]. Then
$$
\text{Area}[R(x, z)] \le \frac{\text{Area}[R(x, y)]}{4}.
$$
Proof Assume that z divides the horizontal side of R(x, y) into two sub-sides
with corresponding lengths a, b. Then the area of R(x, y) is
$$
\begin{aligned}
\text{Area}[R(x, y)] &= (a + b)\left( \frac{\text{Area}[R(x, z)]}{a} + \frac{\text{Area}[R(x, z)]}{b} \right) \\
&= \left( 4 + \frac{(a - b)^2}{ab} \right) \text{Area}[R(x, z)] \\
&\ge 4\, \text{Area}[R(x, z)],
\end{aligned}
$$
which finishes the proof.
If the point z* found in Step 2 is not dominated by any nondominated solution,
then the boxes generated by solving the associated lexicographic ε-constraint
problem indeed have areas smaller than the area of the box $R(y^1, x)$. Even
if this is not the case, these areas are expected not to differ much from that
value. Therefore they are most likely smaller than 1/4 of the area of the initial box
Algorithm 2: A posteriori surrogate-based algorithm with area accuracy.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation Rep ⊆ Y_N with area accuracy α.
Initialization
Find a small set Rep of starting nondominated solutions and a set S of starting boxes.
while S is not empty do
– Step 1: Construct a surrogate model that interpolates the data
  {(ς_1, ς_2) | (ς_1, ς_2)^T ∈ Rep}.
  Denote the resulting response curve by C.
– Step 2: Choose the largest rectangle R(y^1, y^2) ∈ S.
  Search along the curve C for a point x between y^1 and y^2 such that
  Area(R(y^1, x)) = Area(R(x, y^2)).
  Remove R(y^1, y^2) from S.
– Step 3: Assume x = (ς_1, ς_2)^T. Solve P^1_ε with ε = ⌊ς_1⌋ to obtain optimal solution z*.
  Insert z* into Rep.
  Set p := (ε + 1, z*_2 − 1).
– Step 4:
  If Area(R(y^1, z*)) > α, insert R(y^1, z*) into S.
  If Area(R(p, y^2)) > α, insert R(p, y^2) into S.
end
$R(y^1, y^2)$, since the area of $R(y^1, x)$ is considerably smaller than
$\text{Area}(R(y^1, y^2))/4$ (see the difference value $\frac{(a-b)^2}{4ab}$ in Lemma 4.1).
Assume that S is the set of starting boxes. Using similar arguments as in
the proof of Theorem 3.5, we conclude for each starting box $R_i \in S$ that the
algorithm produces after $2^k - 1$ iterations no more than $2^k$ rectangles inside $R_i$,
each of which has an approximate area of at most $A_i/4^k$, where $A_i$ is the area of $R_i$. Since $k_i^* = \lceil \log_4(A_i/\alpha) \rceil$
is the smallest integer k satisfying $A_i/4^k \le \alpha$, the maximum number of iterations
is
$$
\sum_{R_i \in S} \left( 2^{k_i^*} - 1 \right)
< \sum_{R_i \in S} 2^{1 + \log_4(A_i/\alpha)}
= \sum_{R_i \in S} 2 \sqrt{\frac{A_i}{\alpha}}
\le 2 \sqrt{\frac{|S| \cdot A}{\alpha}},
$$
where A is the area of the first starting box as in Section 3. The last inequality
is due to the Cauchy-Schwarz inequality, that is
$$
\Big( \sum_{R_i \in S} \sqrt{A_i} \Big)^2 \le |S| \sum_{R_i \in S} A_i < |S| \cdot A.
$$
Compared to the $O(A/\alpha)$ iterations of the a posteriori algorithm with area accuracy
(see Theorem 3.1), the modified algorithm significantly reduces the number
of iterations.
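The estimate can be evaluated numerically for given starting boxes. The sketch below computes the sum of $2^{k_i^*} - 1$ with exact integer arithmetic and compares it with the Cauchy-Schwarz bound and with A/α; the function names and the numbers are hypothetical.

```python
import math


def surrogate_iteration_estimate(box_areas, alpha):
    """Sum over the starting boxes of 2^{k_i*} - 1, where k_i* is the
    smallest integer k with A_i / 4^k <= alpha."""
    total = 0
    for a_i in box_areas:
        k = 0
        while a_i > alpha * 4 ** k:
            k += 1
        total += 2 ** k - 1
    return total


def cauchy_schwarz_bound(box_areas, start_area, alpha):
    """Upper bound 2 * sqrt(|S| * A / alpha) from the estimate above."""
    return 2 * math.sqrt(len(box_areas) * start_area / alpha)


# Hypothetical data: starting box of area A = 10000, three starting boxes.
A = 10000
areas = [1600, 900, 400]
alpha = 30
estimate = surrogate_iteration_estimate(areas, alpha)
bound = cauchy_schwarz_bound(areas, A, alpha)
```

For these numbers the estimate is 7 + 7 + 3 = 17 iterations, the bound is 2√1000 ≈ 63.2, and A/α ≈ 333.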
The advantages of the perimeter accuracy discussed in Section 3 can be combined with the idea of the numerical speed-up of the preceding algorithm. (See
Algorithm 3)
Algorithm 3: A posteriori surrogate-based algorithm with perimeter accuracy.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation Rep ⊆ Y_N with perimeter accuracy α.
Initialization
Find a small set Rep of starting nondominated solutions and a set S of starting boxes.
while S is not empty do
– Step 1: Construct a surrogate model that interpolates the data
  {(ς_1, ς_2) | (ς_1, ς_2)^T ∈ Rep}.
  Denote the resulting response curve by C.
– Step 2: Choose a rectangle R(y^1, y^2) ∈ S with largest perimeter.
  Search along the curve C for a point x between y^1 and y^2 such that
  Perimeter(R(y^1, x)) = Perimeter(R(x, y^2)).
  Remove R(y^1, y^2) from S.
– Step 3: Assume x = (ς_1, ς_2)^T. Solve P^1_ε with ε = ⌊ς_1⌋ to obtain optimal solution z*.
  Insert z* into Rep.
  Set p := (ε + 1, z*_2 − 1).
– Step 4:
  If Perimeter(R(y^1, z*)) > α, insert R(y^1, z*) into S.
  If Perimeter(R(p, y^2)) > α, insert R(p, y^2) into S.
end
Similar to the a posteriori surrogate-based algorithm with area accuracy, the
estimated maximum number of iterations in the a posteriori surrogate-based
algorithm with perimeter accuracy is $\sum_{R_i \in S} 2 P_i/\alpha \le 2P/\alpha$, where $P_i$ is the perimeter
of the starting box $R_i$ and P is the perimeter of the starting box as in
Section 3.
In both of the preceding algorithms, the points x in Step 2 can be found directly
by solving an equation related to the surrogate model in Step 1 represented,
say, by a function f. For example, for the algorithm with area accuracy, $x = (x_1, f(x_1))$ can
be found by solving the equation
$$
(f(x_1) - y_2^2)(y_1^2 - x_1) = \frac{1}{4} (y_2^1 - y_2^2)(y_1^2 - y_1^1).
$$
This kind of equation is not difficult to solve, since the surrogate f is often
a simple function. Alternatively, binary search may be used to find x approximately.
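The binary search mentioned above can be sketched as follows for the condition of Step 2 in Algorithm 2, Area(R(y^1, x)) = Area(R(x, y^2)). The sketch assumes a nonincreasing piecewise linear surrogate through the current representatives (the experiments later use cubic splines); all names and the data are hypothetical.

```python
from bisect import bisect_right


def piecewise_linear(points):
    """Piecewise linear interpolant f with f(p[0]) == p[1] for all given points."""
    pts = sorted(points)
    xs = [p[0] for p in pts]

    def f(t):
        # Index of the segment containing t, clamped to the valid range.
        i = min(max(bisect_right(xs, t) - 1, 0), len(pts) - 2)
        (x0, v0), (x1, v1) = pts[i], pts[i + 1]
        return v0 + (v1 - v0) * (t - x0) / (x1 - x0)

    return f


def equal_area_split(f, y1, y2, tol=1e-9):
    """Find x = (t, f(t)) on the curve with Area(R(y1, x)) == Area(R(x, y2))."""

    def gap(t):
        left = (t - y1[0]) * (y1[1] - f(t))    # Area(R(y1, x))
        right = (y2[0] - t) * (f(t) - y2[1])   # Area(R(x, y2))
        return left - right

    # gap is negative at y1, positive at y2 and nondecreasing in between
    # when f is nonincreasing, so bisection applies.
    lo, hi = y1[0], y2[0]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    return t, f(t)


# Hypothetical representatives and the box spanned by the outer two.
rep_points = [(0, 12), (6, 5), (12, 0)]
f = piecewise_linear(rep_points)
t, v = equal_area_split(f, (0, 12), (12, 0))
```

For these data the equal-area condition reduces to t = f(t), which gives t = 72/13 ≈ 5.54, so Step 3 would solve the subproblem with ε = ⌊t⌋ = 5.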
For the rest of this section, we present another generic surrogate-based algorithm which is based on the concept of the a priori algorithm. The idea of the
method is to find a small box immediately at each iteration, using some trust
coefficients of the surrogate models. We denote by σ(R) the measure of the
box R. Here σ might be the area, perimeter or any other appropriate measure.
Algorithm 4: Trust coefficient surrogate-based algorithm.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation Rep ⊆ Y_N with σ-accuracy α.
Initialization
Find a small set Rep of starting nondominated solutions and a set S of starting boxes.
while S is not empty do
– Step 1: Construct a surrogate model that interpolates the data
  {(ς_1, ς_2) | (ς_1, ς_2) ∈ Rep}.
  Denote the resulting response curve by C.
– Step 2: Choose any rectangle R(y^1, y^2) ∈ S.
  Choose a trust coefficient ξ > 0.
  Search along the curve C for a point x between y^1 and y^2 such that
  σ(R(y^1, x)) = α − ξ.
  Remove R(y^1, y^2) from S.
– Step 3: Assume x = (ς_1, ς_2).
  Solve P^1_ε with ε = ⌊ς_1⌋ to obtain optimal solution z*.
  Insert z* into Rep.
  Set p := (ε + 1, z*_2 − 1).
– Step 4:
  If σ(R(y^1, z*)) > α, then insert R(y^1, z*) into S.
  If σ(R(p, y^2)) > α, then insert R(p, y^2) into S.
end
The trust coefficients ξ are chosen depending on the specified surrogate models
that we use. Here, the models should be able to generate confidence intervals
for each point ς1 along the horizontal axis (by using, for instance, kriging
techniques). The trust coefficients ξ must satisfy the property: if we have
$σ(R(y^1, x)) = α − ξ$, then it would be natural to predict that $σ(R(y^1, z^*)) ≤ α$.
Therefore in Step 4, the box $R(y^1, z^*)$ will not be inserted into S. Finding
suitable measures of error associated with each surrogate model is one of the
interesting future research topics.
The trust coefficient surrogate-based algorithm can be used to speed up the a
posteriori surrogate-based algorithms, particularly when the required accuracy
is relatively small. In this situation, we have a large set of nondominated
solutions that allows us to construct accurate surrogate models to approximate
the Pareto set (thus, trust coefficients are more reliable).
The box methods are based on the idea of iteratively splitting up a set of
boxes into smaller ones. However, when the boxes are small, the a posteriori
surrogate-based algorithms can be further improved. This can be seen by considering an example with area accuracy: Assume that we need to work with
a box of area 500, while the required area accuracy is 100. If we use the a
posteriori surrogate-based algorithm, we are likely to obtain two boxes with
area between 100 and 125, which are not good enough for the representation.
Thus we need to perform two additional iterations (the generated boxes will
be too small, which is unnecessary). However, if we use the trust coefficient
surrogate-based algorithm, we only need to perform at most 2 iterations.
Note that in the algorithms presented above, additional time for constructing
surrogate models needs to be considered. However, in the case when solving
lexicographic ε−constraint problems is difficult and time-consuming, this additional time may be negligible.
5 Application to the bicriteria multicommodity network flow
problem
Due to the structure of the bicriteria multicommodity flow problem, we can
find its representative systems quite fast and efficiently.
Integer programming (IP) subproblems are the main building blocks of the
box methods. They occur very often, in particular in the solution of the lexicographic ε-constraint problems $P_\varepsilon^1$ or $P_\varepsilon^2$. To solve the problem $P_\varepsilon^1$, for example,
we have to optimize:
$$
(P_\varepsilon^1[1]) \qquad \text{Minimize } f_2(x) \quad \text{s.t. } f_1(x) \le \varepsilon \text{ and } x \in X.
$$
Assume that the minimal objective value of the problem $P_\varepsilon^1[1]$ is p. Then we
continue to solve the problem:
$$
(P_\varepsilon^1[2]) \qquad \text{Minimize } f_1(x) \quad \text{s.t. } f_1(x) \le \varepsilon,\ f_2(x) = p \text{ and } x \in X.
$$
This argument shows that solutions of the problem $P_\varepsilon^1$ can be found by successively solving the two integer programming problems $P_\varepsilon^1[1]$ and $P_\varepsilon^1[2]$. We can
solve the problems $P_\varepsilon^2$ and the lexicographic problems $P^1$, $P^2$ in the same
way.
The bicriteria multicommodity flow problem has the nice property that all
the resulting IP subproblems have a similar block structure (as illustrated
in Figure 2), where each of the independent blocks corresponds to a
(single-commodity) network flow problem. The IPs with this structure can be
solved efficiently by a hybrid of the branch-and-bound and column-generation
methods, the branch-and-price algorithm (see [2]).
Fig. 2 The block structure of IP subproblems
Next, we report on our first experiences with regard to the implementation
and performance of the different versions of box algorithms presented in this
paper applied to the bicriteria, multicommodity network flow problem. The
algorithms have been implemented in Python, using CPLEX as solver.
To compare and evaluate the quality of the algorithms, we first fix the required
area accuracy (the maximal area of resulting boxes) and then compute the
representations given by those algorithms based on area accuracy. Since each
representative point is found by solving two integer programs with almost the
same size and structure, the running time of an algorithm depends significantly
on the number of representative points.
For the two surrogate-based algorithms, we use a cubic spline interpolation.
However, for generating a set of starting representative points, we use piecewise
linear functions instead. Hence the algorithms behave exactly the same as the
a posteriori algorithm (with area accuracy) in their first steps.
In Table 1, we present implementation results on different network instances
using the following five versions of box algorithms:
ALG 1: the a priori algorithm with area accuracy
ALG 2: the a posteriori algorithm with area accuracy
ALG 3: the a posteriori algorithm with perimeter accuracy
ALG 4: the surrogate-based algorithm with area accuracy
ALG 5: the surrogate-based algorithm with perimeter accuracy
For each algorithm, we use as evaluation criteria of the representation in addition to the number of representative points three other criteria: the maximum
area of boxes, the maximum perimeter of boxes and the running time. The
size of the network is in increasing order, and we choose the area accuracy
in such a way that the maximum number of boxes (precomputation) in the a
priori algorithm is at most 50, 100, 200 and 500.
As can be seen from the table, ALG 1 runs much slower than the four other
algorithms. For example, in the instance where the network consists of 50
nodes and 1185 arcs with 5 commodities, the run time of ALG 1 is more than
10 times the run time of any other algorithm. This confirms our prediction
from the previous sections with regard to the underestimation of
representation accuracy. We therefore conclude that ALG 1 is inferior to the
others and do not consider its performance any further.

n     m     K   Boxes  Area required  Representation  Alg 1     Alg 2       Alg 3       Alg 4       Alg 5
50    462   5   49     840685         Card of REP     51        11          12          9           13
                                      Run time        19.43     4.58        3.42        2.33        3.91
                                      Max Area        165612.0  548244.0    619542.0    822432.0    546780.0
                                      Max Perimeter   4624.0    5972.0      3666.0      8328.0      3278.0
50    1187  10  100    5674347.0      Card of REP     102       12          19          13          19
                                      Run time        241.55    27.87       46.33       29.18       39.82
                                      Max Area        796770.0  5493305.0   4868469.0   3985650.0   1738143.0
                                      Max Perimeter   12074.0   24836.0     8852.0      21528.0     9516.0
50    1185  5   500    244426.0       Card of REP     496       26          37          27          37
                                      Run time        638.31    19.40       31.97       24.07       46.46
                                      Max Area        12948.0   284029.0    252668.0    283920.0    234895.0
                                      Max Perimeter   2182.0    5920.0      2104.0      7222.0      2106.0
1000  3000  10  500    16172191.0     Card of REP     N/A       27          38          26          43
                                      Run time        N/A       358.96      502.07      333.47      604.49
                                      Max Area        N/A       14667100.0  15122688.0  15886680.0  6018600.0
                                      Max Perimeter   N/A       43160.0     15808.0     53412.0     15648.0
Table 1 Implementation results
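The factor quoted for the instance with 50 nodes, 1185 arcs and 5 commodities can be checked directly against the run times reported in Table 1. The short Python check below uses those values; the variable names are ours and serve only as illustration.

```python
# Run times from Table 1 for the instance n = 50, m = 1185, K = 5.
run_time = {"ALG 1": 638.31, "ALG 2": 19.40, "ALG 3": 31.97,
            "ALG 4": 24.07, "ALG 5": 46.46}

# Ratio of the ALG 1 run time to the run time of every other algorithm.
ratios = {name: run_time["ALG 1"] / t
          for name, t in run_time.items() if name != "ALG 1"}

for name, ratio in ratios.items():
    print(f"ALG 1 / {name}: {ratio:.1f}")

# The smallest ratio decides whether "more than 10 times" holds for all.
smallest_ratio = min(ratios.values())
```

The smallest ratio is attained against ALG 5 and is still above 10.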
The a posteriori algorithm with area accuracy (ALG 2) runs faster than the
a posteriori algorithm with perimeter accuracy (ALG 3) on all instances in
Table 1 except the smallest one. The reason is that we have to convert the
perimeter accuracy to the area accuracy, so the area accuracy is
underestimated. In return, the boxes produced by ALG 3 have a much smaller
maximum perimeter than those generated by ALG 2 in every instance, and a
smaller maximum area in two of the four instances. In particular, the largest
side length is reduced significantly.
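The relation between the two accuracy measures can be made explicit by an elementary estimate. The following is a sketch under the assumption that a box has side lengths a, b >= 0, area A = ab and perimeter P = 2(a + b); the symbols δ and ε for a perimeter bound and an area bound are introduced here for illustration, and the estimate is not necessarily the exact conversion used in the implementation.

```latex
% Assumption: box with side lengths a, b >= 0, area A = ab, perimeter P = 2(a+b).
\[
  A = ab \le \left(\frac{a+b}{2}\right)^{2} = \frac{P^{2}}{16},
  \qquad\text{hence}\qquad
  P \le \delta \;\Longrightarrow\; A \le \frac{\delta^{2}}{16}.
\]
% The converse fails: for fixed area \varepsilon, the choice a = \varepsilon / b gives
\[
  A = \varepsilon, \qquad P = 2\left(\frac{\varepsilon}{b} + b\right) \to \infty
  \quad\text{as } b \to \infty .
\]
```

A perimeter bound therefore controls both side lengths and the area of a box, whereas an area bound alone still admits long, thin boxes.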
Regarding the number of representative points, the a posteriori
surrogate-based algorithm with perimeter accuracy (ALG 5) requires at least as
many representative points as any of ALG 2 to ALG 4 in every instance, and it
has the largest running time among them in the last two instances.
The representations generated by the a posteriori algorithm (with area/perimeter
accuracy) and the a posteriori surrogate-based algorithm (with area/perimeter
accuracy) are quite similar. We think that the two surrogate-based algorithms
did not perform better because of the structure of the multicommodity flow
problem: the solutions of the problem are quite "dense" along the Pareto curve,
so it is easy to find a nondominated point in a relatively small box.
6 Conclusion and future research
In this paper, several alternatives to the area-based box algorithm of Hamacher
et al. [8] for finding representative systems of bicriteria optimization problems
have been suggested. It has been argued that the distance approach, represented by the perimeter of the largest representing box, is better suited for
practical applications. The resulting peripheral box algorithm has been
analyzed with respect to its worst-case complexity. Using branch-and-price
methods, it has been tested on instances of the bicriteria multicommodity
network flow problem and has been shown to be competitive with the area-based
algorithm with regard to running time.
The theoretical discussion of surrogate models, in which the Pareto curve
is interpolated and this information is used to (theoretically) speed up the
computation of a box representation with given accuracy, indicates the
potential of this approach. The preliminary numerical tests show, however, that
several questions need to be addressed in order to make this approach useful
in practice. These questions include the following:
– Which surrogate models should be used?
– How to find starting nondominated points?
– How many points are enough to begin the surrogate-based algorithms?
– How to choose the trust coefficients ξ in the trust coefficient
  surrogate-based algorithm?
Answers to these interesting questions, which are on the borderline of numerical
analysis and optimization, are the subject of current research.
Acknowledgements We would like to thank Tobias Kuhn for pointing out an error in
the computation of the perimeter representation and Marc Goerigk for numerous, helpful
comments on first drafts of our paper.
References
1. Ravindra K. Ahuja, Thomas L. Magnanti & James B. Orlin: Network flows: Theory,
algorithms, and applications, Prentice Hall, Inc., Englewood Cliffs, NJ, xvi+846 pp (1993).
2. Cynthia Barnhart, Ellis L. Johnson, George L. Nemhauser, Martin W. P. Savelsbergh
& Pamela H. Vance: Branch-and-Price: Column Generation for Solving Huge Integer
Programs, Operations Research, Vol 46, 316-329 (1998).
3. Altannar Chinchuluun & Panos M. Pardalos: A survey of recent developments in
multiobjective optimization, Annals of Operations Research, Vol 154, 29-50 (2007).
4. Matthias Ehrgott: Multicriteria Optimization. Second edition, Springer-Verlag, Berlin,
xiv+323 pp (2005).
5. D. R. Fulkerson: Flows in networks, Recent Advances in Mathematical Programming,
edited by R. L. Graves and P. Wolfe, McGraw-Hill, New York, 319-332 (1963).
6. Horst W. Hamacher & Karl-Heinz Kuefer: Inverse radiation therapy planning - a
multicriteria optimization problem, Discrete Applied Mathematics, Vol 118, issue 1-2,
145-161 (2002).
7. Horst W. Hamacher, Christian R. Pedersen & Stefan Ruzika: Multiple Objective
Minimum Cost Flow Problems: A Review, European Journal of Operational Research, Vol
176, 1404-1422 (2007).
8. Horst W. Hamacher, Christian R. Pedersen & Stefan Ruzika: Finding representative
systems for discrete bicriterion optimization problems, Operations Research Letters, Vol
35, issue 3, 336-344 (2007).
9. Horst W. Hamacher, S. Ruzika & A. Tanatmis: PROSEL: A Decision Support System for
Projekt Portfolio Selection Based on Multi-Objective Programming, in: Multiple Criteria
Decision Aiding, C. Zopounidis, M. Doumpos (eds.), ISBN 978-1-61668-231-6 (2010).
10. T. C. Hu: Multicommodity network flows, Operations Research, Vol 11, 344-360 (1963).
11. Tobias Kuhn: personal communication (2013).
12. Siamak Moradi: The Bi-objective Multi-Commodity Minimum Cost Flow Problem,
Proceedings of the 45th Annual Conference of the ORSNZ, November 2010.
13. A. Sedeno-Noda, C. Gonzalez-Martin & J. Gutierrez: The biobjective undirected
two-commodity minimum cost flow problem, European Journal of Operational Research, Vol
164, 89-103 (2005).
14. R. E. Steuer: Multiple Criteria Optimization: Theory, Computation and Application,
John Wiley, New York, 546 pp (1986).
15. Laurence A. Wolsey: Integer Programming, John Wiley & Sons, Inc., New York,
xx+264 pp (1998).
16. Ky Vu: Change of Variable Methods and Representative Systems for Multiobjective
Multicommodity Network Flows, Master Thesis (2012).