Mathematical Methods of Operations Research manuscript No. (will be inserted by the editor)

Improved box representations of Pareto sets and application to bicriteria multicommodity network flows

Horst W. Hamacher · Ky Vu

Received: date / Accepted: date

Abstract A successful way to deal with the, in general, prohibitively large Pareto set of a given multicriteria problem is to find a good representative system consisting of finitely many of these solutions. In this paper, several alternatives to the area-based box algorithm of Hamacher et al. [8] for finding representative systems of a given bicriteria optimization problem are suggested. It is argued that the distance approach, represented by the perimeter of the largest representing box, is better suited for practical applications. The resulting peripheral box algorithm is analyzed with respect to its worst-case complexity. Using branch-and-price methods, it is tested on instances of the bicriteria multicommodity network flow problem. Compared with the original area-based algorithm, it is shown to be superior in its output quality and competitive with regard to running time. A further refinement is proposed by using surrogate models in which the Pareto curve is interpolated. The resulting (theoretical) speed-up in computing a box representation with given accuracy indicates the potential of this approach.

Keywords Representative system · Bicriteria optimization · Multicommodity flow · Box algorithm

This research has been partially supported by the Federal Ministry of Education and Research Germany, grant DSS Evac Logistic, FKZ 13N12229, by a travel grant of the European Union Seventh Framework Programme (FP7-PEOPLE-2009-IRSES), grant agreement No. 246647, and by the New Zealand Government as part of the OptALI project.

Horst W. Hamacher
Department of Mathematics, University of Kaiserslautern, Germany.
E-mail: [email protected]

Ky Vu
Laboratoire d’Informatique de l’Ecole Polytechnique (LIX), France.
E-mail: [email protected]

1 Introduction

The multicommodity flow problem is well known in network optimization and has been studied extensively over the last decades. The problem arises when several commodities, vehicles or messages share the same network. These commodities must not only satisfy their own constraints, they also interact with each other (see [1], [5]).

In practice, optimization models often have to take two or more conflicting objectives into consideration. In these cases we cannot, in general, find solutions that simultaneously optimize all objectives. Instead, we look for Pareto solutions, i.e., solutions with the property that none of the objectives can be improved without worsening at least one of the others. Finding all such solutions is the subject of multicriteria (or multiobjective) optimization (see [4] or [14]). In the following, we assume that the reader is familiar with the basic concepts of this subject.

In this paper, we combine these two classes of problems and study the bicriteria multicommodity flow problem. This problem is interesting from both a practical and a theoretical point of view, but it has not been well studied (see [12]). We focus on the integer case of the problem. Since the single-objective multicommodity flow problem is NP-hard, this also holds for the bicriteria version of the problem. Moreover, it is well known (see [7]) that the problem is intractable, i.e., the number of Pareto solutions can be exponentially large with respect to the input size. Therefore, instead of finding the entire Pareto nondominated set, we concentrate on finding a representative system, i.e., a finite subset of the Pareto nondominated set satisfying provable quality measures (see [8]). Based on our past positive experience (e.g., in applications of the health and management sector, see [6, 9]), we think that this approach is well suited for all applications in which the bicriteria multicommodity flow problem can serve as a model.
After formally introducing the bicriteria multicommodity network flow problem in Section 2, we make the following contributions in the subsequent sections:

– Develop a new version of the box algorithm for finding representative systems of bicriteria optimization problems with respect to the distance accuracy criterion (Section 3).
– Combine the ideas of surrogate modeling and representative systems to produce several algorithms with competitive performance (Section 4).
– Apply the branch-and-price method to solve the problem efficiently by combining the resulting box methods and exploiting the structure of the bicriteria multicommodity flow problem (Section 5).

2 The problem

Let $G = (N, A)$ be a directed graph where $N$ is a set of $n$ nodes and $A$ is a set of $m$ arcs. Assume that there are $K \ge 2$ commodities transported in this network. For each $1 \le k \le K$, let $b^k_i$ be the supply/demand of commodity $k$ at node $i \in N$ and let $u^k_{ij}$ be the maximum flow of commodity $k$ through arc $(i, j) \in A$. Moreover, for each $(i, j) \in A$, there is a capacity $u_{ij}$ which limits the total flow of all commodities moving through that arc. Let $x^k_{ij}$ be the flow of commodity $k$ through $(i, j) \in A$. We need to find these flows in order to minimize the network cost, which is a 2-dimensional linear function. Formally, the bicriteria multicommodity flow problem can be stated as

\[
\text{Minimize} \quad
\begin{pmatrix}
\sum_{1 \le k \le K} \sum_{(i,j) \in A} c^k_{ij} x^k_{ij} \\
\sum_{1 \le k \le K} \sum_{(i,j) \in A} d^k_{ij} x^k_{ij}
\end{pmatrix}
\]

subject to

\[
\begin{aligned}
\sum_{j:(i,j) \in A} x^k_{ij} - \sum_{j:(j,i) \in A} x^k_{ji} &= b^k_i && \forall\, i \in N,\ 1 \le k \le K, \\
0 \le x^k_{ij} &\le u^k_{ij} && \forall\, (i,j) \in A,\ 1 \le k \le K, \\
\sum_{1 \le k \le K} x^k_{ij} &\le u_{ij} && \forall\, (i,j) \in A.
\end{aligned}
\]

Here $c^k_{ij}$ and $d^k_{ij}$ denote the cost coefficients of commodity $k$ on arc $(i, j)$ in the first and second objective, respectively.

In the special case of two commodities, a modified version of this problem was solved by Sedeno-Noda et al. in 2005 [13]. In their paper, negative flows are allowed and the last constraint is replaced by

\[
\sum_{1 \le k \le K} |x^k_{ij}| \le u_{ij} \quad \text{for all } (i, j) \in A.
\]

The authors extended the idea of "changing variables" from the classical paper of T. C. Hu from 1963 [10] to the multiobjective problem. In this way, they reduced the multicommodity problem to several single-commodity subproblems. While this method is very interesting, it cannot be generalized to the case of more than two commodities.

3 Box methods for finding representative systems

Geometrically, the Pareto nondominated set $Y_N$ of a bicriteria minimization problem looks like a curve (although in the integer case it is a set of isolated points rather than a curve). Therefore, its representative system must be a set which approximates this curve well enough while being computationally cheaper to obtain.

A general definition and discussion of representative systems can be found in the paper of Hamacher et al. [8]. In that paper, the authors proposed a method called the box method for finding representative systems of a large class of discrete bicriteria optimization problems. In the box method, the representative system Rep is a set of nondominated points. These points are found by sequentially solving lexicographic ε-constraint subproblems. Each point is then associated with a rectangle (or box) which represents all nondominated points within it and, conversely, every such rectangle contains a nondominated point. In the original version of the box algorithm, the predetermined accuracy α of the representative system is reached if the largest area of all the resulting boxes is at most α.

We initialize the box algorithm with the starting box $R(z^1, z^2)$, which is defined by the two lexicographically optimal solutions $z^1$ and $z^2$. We denote its area by Area or, more specifically, by $\mathrm{Area}(R(z^1, z^2))$. Then we iteratively discard unnecessary parts of the box by solving the lexicographic ε-constraint problems $P_\varepsilon$ with appropriate values of ε. In this way, we generate a collection of rectangles with decreasing area.
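As a small illustration of the objects introduced above, the following Python sketch computes the nondominated set $Y_N$ of an explicitly given finite set of outcome vectors, the two lexicographic optima and the area of the starting box. The function names and the toy data are illustrative assumptions; for a real problem the outcome set is not available explicitly, which is why the box method works with ε-constraint subproblems instead.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (f1, f2); both objectives are minimized


def nondominated(points: List[Point]) -> List[Point]:
    """Return the nondominated points, sorted by increasing f1."""
    result: List[Point] = []
    best_f2 = None
    # After sorting by (f1, f2), a point is nondominated exactly when its
    # f2 value is strictly smaller than every f2 value seen so far.
    for f1, f2 in sorted(set(points)):
        if best_f2 is None or f2 < best_f2:
            result.append((f1, f2))
            best_f2 = f2
    return result


def starting_box(points: List[Point]) -> Tuple[Point, Point, int]:
    """Return the lexicographic optima z1, z2 and the area of R(z1, z2)."""
    z1 = min(points)                              # lexmin (f1, f2)
    z2 = min(points, key=lambda y: (y[1], y[0]))  # lexmin (f2, f1)
    width = z2[0] - z1[0]    # horizontal side of the box
    height = z1[1] - z2[1]   # vertical side of the box
    return z1, z2, width * height


# Hypothetical outcome vectors of a small discrete bicriteria problem.
outcomes = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 2), (9, 1), (9, 3)]
print(nondominated(outcomes))
print(starting_box(outcomes))
```

In this toy instance all nondominated points lie inside the starting box spanned by the two lexicographic optima, which is the property the box method relies on.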
The algorithm stops when the accuracy criterion on the largest area among all boxes is met. There are two versions of the box method: the a posteriori algorithm and the a priori algorithm. The main difference between the two algorithms is the order in which the lexicographic ε-constraint subproblems are chosen. While the a priori algorithm precomputes a number of equidistant values for ε and solves the subproblems associated with them, the a posteriori algorithm only decides on the next subproblem after solving the previous one. Details of the two box algorithms and the following theorems can be found in Hamacher et al. [8].

Theorem 31 The a posteriori algorithm yields an α-representation of $Y_N$ in which all representing points are nondominated after performing at most $O(\mathrm{Area}/\alpha)$ iterations.

Theorem 32 The a priori algorithm yields an α-representation of $Y_N$ after computing at most $k = \lceil A/\alpha \rceil - 1$ solutions of lexicographic ε-constraint problems, where $A$ denotes the area of the starting box.

The previous two box algorithms use the box area to measure the accuracy. In certain applications, however, other criteria are needed to assess the accuracy of the representation. The most common measure is the maximum distance between the nondominated points and the representation. Formally, we have the following definition:

Definition 33 $\mathrm{Rep} \subseteq Y_N$ is a representation of $Y_N$ with distance accuracy α if for any nondominated point $y \in Y_N$ there is some $z \in \mathrm{Rep}$ such that $\|y - z\| \le \alpha$.

If we use the box area to measure the accuracy, as in the previous two algorithms, we might end up with a box which has one very small and one very large side. Even if the box has a small area, the representation might not be useful, since there may be several nondominated points inside that box whose value in one of the objective functions differs considerably from that of any representing point.
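The effect described above can be checked numerically. The following Python sketch, with hypothetical helper names and made-up points, computes the side lengths, area and perimeter of a box and the distance accuracy of Definition 33 for a finite point set. The Euclidean norm is an assumption made here; the text does not fix a particular norm.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (f1, f2)


def box_measures(a: Point, b: Point) -> Dict[str, float]:
    """Measures of the box R(a, b); a is the upper-left, b the lower-right corner."""
    width = b[0] - a[0]
    height = a[1] - b[1]
    return {
        "width": width,
        "height": height,
        "area": width * height,
        "perimeter": 2 * (width + height),
    }


def distance_accuracy(rep: List[Point], nondom: List[Point]) -> float:
    """Largest distance from a nondominated point to its closest representative."""
    return max(min(math.dist(y, z) for z in rep) for y in nondom)


# A thin box: small area, but one very long side (made-up numbers).
a, b = (0, 2), (100, 0)
inside = [(0, 2), (50, 1), (100, 0)]   # assumed nondominated points in R(a, b)
print(box_measures(a, b))              # area 200, perimeter 204
print(distance_accuracy([a, b], inside))
```

With an area threshold of, say, 200 this box would be accepted, although the point (50, 1) is roughly 50 units away from both representatives. The perimeter of 204 exposes the problem directly.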
In order to improve this, we modify the a posteriori algorithm in this section to produce a representation with fixed distance accuracy. In the modified algorithm we maintain, as before, a set of boxes. Instead of looking at the area of the boxes, however, in each iteration we choose and update a box with largest side (unless the stopping criterion holds). If the largest side of that box is horizontal, then we solve the lexicographic subproblem $P^1_\varepsilon$, where

\[
(P^1_\varepsilon) \qquad \operatorname{lexmin}\ \begin{pmatrix} f_2(x) \\ f_1(x) \end{pmatrix} \quad \text{s.t. } f_1(x) \le \varepsilon \text{ and } x \in X.
\]

Otherwise, we solve the lexicographic subproblem $P^2_\varepsilon$ with

\[
(P^2_\varepsilon) \qquad \operatorname{lexmin}\ \begin{pmatrix} f_1(x) \\ f_2(x) \end{pmatrix} \quad \text{s.t. } f_2(x) \le \varepsilon \text{ and } x \in X.
\]

The algorithm stops if we have a set of boxes whose perimeters are all smaller than 2α.

Definition 34 If this is the case, we say that the resulting representation Rep has perimeter accuracy α.

We use the perimeter in this setting because, if the perimeter of a box is small enough, we can control "the error" of the box with respect to other measures. For example, if a box has a perimeter of less than α, then both its diagonal and its larger side are less than α/2. Moreover, the area of the box is also less than $\alpha^2/16$ (see footnote 1). Hence any version of the box algorithm using perimeter accuracy as stopping criterion includes the area stopping criterion as a special case and is more powerful.

Footnote 1: Assume that $a$ and $b$ are the two sides of the box and let $p$ be its perimeter. Since $0 \le (a - b)^2$, it follows that $ab \le \frac{(a+b)^2}{4} = \frac{p^2}{16} \le \frac{\alpha^2}{16}$.

However, in this set of boxes there might be some box without any representative point. As was pointed out to us by Kuhn [11], this is the case when we solve a horizontal and a vertical subproblem in sequence. The naive idea to overcome this problem is to solve an additional lexicographic ε-constraint subproblem for each box without representative. However, the box accuracy is always underestimated in previous steps, and if some box $R(a, b)$ has no representative, then the two points $a$ and $b$ are close to two nondominated solutions which are already included in the representative system. Thus, we can try to expand each box without representative to include either of the two neighboring nondominated solutions. The two new boxes are likely to be small (with perimeters less than 2α). Only if both of these boxes have a perimeter greater than 2α do we need to solve an additional lexicographic ε-constraint problem associated with the box.

Figure 1 illustrates the idea of the algorithm. In Step 1, the algorithm divides the starting box into two boxes $S_1$ and $S_2$. Although the box $S_2$ contains many nondominated points, it has a small area due to its very small vertical side (so the algorithm would keep this box, and thus a bad representation, if the area accuracy criterion were used). The horizontal lexicographic $\varepsilon_1$-constraint is applied to the box $S_1$ and the vertical lexicographic $\varepsilon_2$-constraint to the box $S_2$. The algorithm continues with the resulting boxes in Step 2 until the perimeters of all boxes in the list are small enough.

Fig. 1 Idea of the a posteriori algorithm with perimeter accuracy

The details of the algorithm are presented in Algorithm 1.

Algorithm 1: The a posteriori algorithm with perimeter accuracy
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation $\mathrm{Rep} \subseteq Y_N$ with perimeter accuracy α.
Initialization: $S := \emptyset$; $\mathrm{Rep} := \emptyset$; $\mathrm{CheckBox} := \emptyset$;
Compute the lexicographic minima $z^1$ and $z^2$ and the perimeter of $R(z^1, z^2)$;
Set $\mathrm{Rep} := \{z^1, z^2\}$; $S := \{R(z^1, z^2)\}$;
while $S \ne \emptyset$ do
    Choose the box $R(y^1, y^2) \in S$ such that its larger side is maximal;
    Remove $R(y^1, y^2)$ from $S$;
    if the larger side of $R(y^1, y^2)$ is horizontal then
        Solve $P^1_\varepsilon$ with $\varepsilon = \lfloor (y^1_1 + y^2_1)/2 \rfloor$ and obtain optimal solution $z^* \in Y_N$;
        $p := (\varepsilon + 1, z^*_2 - 1)$;
        Insert $z^*$ into Rep;
        if the perimeter of $R(y^1, z^*) > 2\alpha$ then
            Insert $R(y^1, z^*)$ into $S$;
        else
            Insert $R(y^1, z^*)$ into CheckBox;
        end
        if the perimeter of $R(p, y^2) > 2\alpha$ then
            Insert $R(p, y^2)$ into $S$;
        else
            Insert $R(p, y^2)$ into CheckBox;
        end
    else
        Solve $P^2_\varepsilon$ with $\varepsilon = \lfloor (y^1_2 + y^2_2)/2 \rfloor$ and obtain optimal solution $z^* \in Y_N$;
        $p := (z^*_1 - 1, \varepsilon + 1)$;
        Insert $z^*$ into Rep;
        if the perimeter of $R(z^*, y^2) > 2\alpha$ then
            Insert $R(z^*, y^2)$ into $S$;
        else
            Insert $R(z^*, y^2)$ into CheckBox;
        end
        if the perimeter of $R(y^1, p) > 2\alpha$ then
            Insert $R(y^1, p)$ into $S$;
        else
            Insert $R(y^1, p)$ into CheckBox;
        end
    end
end
Remove all boxes with at least one representative from CheckBox;
for $R(u, v) \in \mathrm{CheckBox}$ do
    Remove $R(u, v)$ from CheckBox;
    Find the two neighboring nondominated solutions $w^1, w^2$ of $u, v$ in Rep;
    if both perimeters of $R(w^1, v)$ and $R(u, w^2)$ are greater than 2α then
        Solve $P^1_\varepsilon$ with $\varepsilon = v_1 - 1$ and obtain optimal solution $z^* \in Y_N$;
        Insert $z^*$ into Rep;
    end
end

Theorem 35 The a posteriori algorithm with perimeter accuracy produces a representation for $Y_N$ with perimeter accuracy α after solving at most $O\big((P/(2\alpha))^{\log_{4/3} 2}\big)$ lexicographic ε-constraint problems, where $P$ is the perimeter of the starting box $R(z^1, z^2)$.

Proof The number of lexicographic ε-constraint problems is bounded by the number of iterations in the While loop and the For loop. Note that after each iteration of the While loop, the perimeters of the resulting boxes are at most 3/4 of the perimeter of the original box. Therefore, the algorithm terminates after a finite number of iterations.

All the boxes in the set CheckBox have a perimeter less than or equal to 2α. Since any nondominated point $z$ is contained in some box $R(a, b)$ and the distance between $z$ and $a$, $b$ is at most one half of the perimeter of $R(a, b)$, the distance accuracy of the representation follows.

After each iteration of the While loop, the number of boxes in $S$ increases by at most 1. We claim that, after $2^k - 1$ iterations, the algorithm produces no more than $2^k$ boxes, each of which has a perimeter of at most $(\frac{3}{4})^k P$.
We prove this claim by induction. Obviously, the claim holds for $k = 0, 1$. Assume that the claim holds for all $k \le s$. We prove that it also holds for $k = s + 1$. After the first iteration, we have two boxes $R^1, R^2$ with perimeters at most $\frac{3}{4} P$. Applying the induction hypothesis to each box $R^i$, we get: after $2^s - 1$ iterations on $R^i$, we have no more than $2^s$ boxes, each of which has a perimeter of at most $(\frac{3}{4})^s (\frac{3}{4} P) = (\frac{3}{4})^{s+1} P$. In total, after $1 + 2(2^s - 1) = 2^{s+1} - 1$ iterations we have at most $2^{s+1}$ boxes with the above property, i.e., the claim holds for $k = s + 1$.

From the claim, it follows that the While loop terminates if $(\frac{3}{4})^k P \le 2\alpha$. This means that the While loop terminates at the latest for $k^* = \lceil \log_{4/3}(\frac{P}{2\alpha}) \rceil$. Therefore, the maximum number of iterations in the While loop is $2^{k^*} - 1 = 2^{\lceil \log_{4/3}(P/(2\alpha)) \rceil} - 1$, which is $O\big((\frac{P}{2\alpha})^{\log_{4/3} 2}\big)$. The number of iterations in the For loop is at most the cardinality of the set CheckBox, which means that the total number of iterations in both loops is at most $O\big((\frac{P}{2\alpha})^{\log_{4/3} 2}\big)$.

4 A posteriori surrogate-based algorithms

In the preceding section we have shown that the accuracy measured by the area of the boxes can be replaced by a distance measure while maintaining a worst-case complexity statement for the running time of the a posteriori algorithm. In this section we suggest numerical improvements for the area-based and perimeter-based classes of box algorithms, since both the a priori and the a posteriori algorithms have some drawbacks.

The main drawback of the a priori algorithm is obvious: it underestimates the accuracy of the resulting representative system. Usually, we have to solve too many lexicographic ε-constraint subproblems and obtain a representation which is unnecessarily better than required. The reason is that, if the representation consists of $k$ boxes, then the total area of these boxes is less than $\frac{\mathrm{Area}}{k}$, where Area is the area of the starting box. This means that the average area of these boxes is less than $\frac{\mathrm{Area}}{k^2}$.
However, we have no information about the individual boxes, so we can only conclude that the accuracy of the representation is $\frac{\mathrm{Area}}{k}$, which is much larger than $\frac{\mathrm{Area}}{k^2}$.

On the other hand, for the a posteriori algorithm, we can only assure the accuracy of representative systems after solving $2^k$ subproblems, where $k$ is integral. Therefore, it lacks the flexibility to choose the cardinality of the representative system in advance. In fact, in order to find a representative system with specified accuracy, we often need to find a system with better accuracy (e.g., smaller box area).

When lexicographic ε-constraint subproblems are difficult and time-consuming to solve, it is necessary to improve both algorithms. Notice that in these algorithms, representative systems are found by considering the images of the two objectives independently of each other. But if we look at the Pareto nondominated set, we can consider one objective as a function of the other objective. Therefore, if we know some nondominated solutions, we can build a response curve (or a surrogate model) that interpolates the Pareto set. We can use this approximation to identify the next ε-constraint subproblem to solve. The advantage of this method is that we can estimate the locations of the next nondominated solutions, so that we are able to divide a given box into $k$ sub-boxes of almost equal size. In this way, we can use the estimate $\frac{\mathrm{Area}}{k^2}$ instead of $\frac{\mathrm{Area}}{k}$ for the area of each box, and the number of lexicographic ε-constraint subproblems that need to be solved is reduced significantly.

Our algorithms require several starting nondominated solutions, which can be found by running one of the box algorithms of the previous sections with a very crude approximation accuracy. The first modification of the a posteriori algorithm is based on the following lemma.
Lemma 41 Let $z$ be an arbitrary point inside the box $R(x, y)$ such that $\mathrm{Area}[R(x, z)] = \mathrm{Area}[R(z, y)]$. Then

\[
\mathrm{Area}[R(x, z)] \le \frac{\mathrm{Area}[R(x, y)]}{4}.
\]

Proof Assume that $z$ divides the horizontal side of $R(x, y)$ into two parts with corresponding lengths $a$ and $b$. Then the area of $R(x, y)$ is

\[
\mathrm{Area}[R(x, y)] = (a + b)\left(\frac{\mathrm{Area}[R(x, z)]}{a} + \frac{\mathrm{Area}[R(x, z)]}{b}\right)
= \left(4 + \frac{(a - b)^2}{ab}\right) \mathrm{Area}[R(x, z)] \ge 4\, \mathrm{Area}[R(x, z)],
\]

which finishes the proof.

Algorithm 2: A posteriori surrogate-based algorithm with area accuracy.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation $\mathrm{Rep} \subseteq Y_N$ with area accuracy α.
Initialization: Find a small set Rep of starting nondominated solutions and a set $S$ of starting boxes.
while $S$ is not empty do
    – Step 1: Construct a surrogate model that interpolates the data $\{(\varsigma_1, \varsigma_2) \mid (\varsigma_1, \varsigma_2)^T \in \mathrm{Rep}\}$. Denote the resulting response curve by $C$.
    – Step 2: Choose the largest rectangle $R(y^1, y^2) \in S$. Search along the curve $C$ for a point $x$ between $y^1$ and $y^2$ such that $\mathrm{Area}(R(y^1, x)) = \mathrm{Area}(R(x, y^2))$. Remove $R(y^1, y^2)$ from $S$.
    – Step 3: Assume $x = (\varsigma_1, \varsigma_2)^T$. Solve $P^1_\varepsilon$ with $\varepsilon = \lfloor \varsigma_1 \rfloor$ to obtain an optimal solution $z^*$. Insert $z^*$ into Rep. Set $p := (\varepsilon + 1, z^*_2 - 1)$.
    – Step 4: If $\mathrm{Area}(R(y^1, z^*)) > \alpha$, insert $R(y^1, z^*)$ into $S$. If $\mathrm{Area}(R(p, y^2)) > \alpha$, insert $R(p, y^2)$ into $S$.
end

If the point $x$ found in Step 2 is not dominated by any nondominated solution, then the boxes generated by solving the associated lexicographic ε-constraint problem indeed have areas smaller than the area of the box $R(y^1, x)$. Even if this is not the case, these areas are not expected to differ much from that value. Therefore they are most likely smaller than $\frac{1}{4}$ of the area of the initial box $R(y^1, y^2)$, since the area of $R(y^1, x)$ is considerably smaller than $\frac{\mathrm{Area}(R(y^1, y^2))}{4}$ (see the difference value $\frac{(a-b)^2}{4ab}$ in Lemma 41).

Assume that $S$ is the set of starting boxes.
Using similar arguments as in the proof of Theorem 35, we conclude for each starting box $R^i \in S$ with area $A_i$ that the algorithm produces after $2^k - 1$ iterations no more than $2^k$ rectangles inside $R^i$, each of which has an approximate area of at most $\frac{A_i}{4^k}$. Since $k^*_i = \lceil \log_4(\frac{A_i}{\alpha}) \rceil$ is the smallest integer $k$ satisfying $\frac{A_i}{4^k} \le \alpha$, the maximum number of iterations is

\[
\sum_{R^i \in S} \left(2^{k^*_i} - 1\right) < \sum_{R^i \in S} 2^{1 + \log_4(A_i/\alpha)} = \sum_{R^i \in S} 2\sqrt{\frac{A_i}{\alpha}} \le 2\sqrt{\frac{|S| \cdot A}{\alpha}},
\]

where $A$ is the area of the first starting box as in Section 3. The last inequality is due to the Cauchy-Schwarz inequality, that is

\[
\Big(\sum_{R^i \in S} \sqrt{A_i}\Big)^2 \le |S| \sum_{R^i \in S} A_i < |S| \cdot A.
\]

Compared to the $O(\frac{A}{\alpha})$ iterations of the a posteriori algorithm with area accuracy (see Theorem 2.1), the modified algorithm significantly reduces the number of iterations.

The advantages of the perimeter accuracy discussed in Section 3 can be combined with the numerical speed-up of the preceding algorithm (see Algorithm 3).

Algorithm 3: A posteriori surrogate-based algorithm with perimeter accuracy.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation $\mathrm{Rep} \subseteq Y_N$ with perimeter accuracy α.
Initialization: Find a small set Rep of starting nondominated solutions and a set $S$ of starting boxes.
while $S$ is not empty do
    – Step 1: Construct a surrogate model that interpolates the data $\{(\varsigma_1, \varsigma_2) \mid (\varsigma_1, \varsigma_2)^T \in \mathrm{Rep}\}$. Denote the resulting response curve by $C$.
    – Step 2: Choose a rectangle $R(y^1, y^2) \in S$ with largest perimeter. Search along the curve $C$ for a point $x$ between $y^1$ and $y^2$ such that $\mathrm{Perimeter}(R(y^1, x)) = \mathrm{Perimeter}(R(x, y^2))$. Remove $R(y^1, y^2)$ from $S$.
    – Step 3: Assume $x = (\varsigma_1, \varsigma_2)^T$. Solve $P^1_\varepsilon$ with $\varepsilon = \lfloor \varsigma_1 \rfloor$ to obtain an optimal solution $z^*$. Insert $z^*$ into Rep. Set $p := (\varepsilon + 1, z^*_2 - 1)$.
    – Step 4: If $\mathrm{Perimeter}(R(y^1, z^*)) > \alpha$, insert $R(y^1, z^*)$ into $S$. If $\mathrm{Perimeter}(R(p, y^2)) > \alpha$, insert $R(p, y^2)$ into $S$.
end

Similar to the a posteriori surrogate-based algorithm with area accuracy, the estimated maximum number of iterations of the a posteriori surrogate-based algorithm with perimeter accuracy is $\sum_{R^i \in S} \frac{2 P_i}{\alpha} \le \frac{2P}{\alpha}$, where $P_i$ is the perimeter of the starting box $R^i$ and $P$ is the perimeter of the starting box as in Section 3.

In both of the preceding algorithms, the point $x$ in Step 2 can be found directly by solving an equation related to the surrogate model of Step 1, represented, say, by a function $f$. For example, for the algorithm with area accuracy, $x = (x_1, f(x_1))$ can be found by solving the equation

\[
(f(x_1) - y^2_2)(y^2_1 - x_1) = \frac{1}{4}\,(y^1_2 - y^2_2)(y^2_1 - y^1_1).
\]

This kind of equation is not difficult to solve, since the surrogate $f$ is often a simple function. Alternatively, binary search may be used to find $x$ approximately.

For the rest of this section, we present another generic surrogate-based algorithm, which is based on the concept of the a priori algorithm. The idea of the method is to find a small box immediately in each iteration, using trust coefficients of the surrogate models. We denote by $\sigma(R)$ the measure of the box $R$. Here σ might be the area, the perimeter or any other appropriate measure.

Algorithm 4: Trust coefficient surrogate-based algorithm.
Data: A discrete bicriteria optimization problem, α > 0.
Result: A representation $\mathrm{Rep} \subseteq Y_N$ with σ-accuracy α.
Initialization: Find a small set Rep of starting nondominated solutions and a set $S$ of starting boxes.
while $S$ is not empty do
    – Step 1: Construct a surrogate model that interpolates the data $\{(\varsigma_1, \varsigma_2) \mid (\varsigma_1, \varsigma_2) \in \mathrm{Rep}\}$. Denote the resulting response curve by $C$.
    – Step 2: Choose any rectangle $R(y^1, y^2) \in S$. Choose a trust coefficient ξ > 0. Search along the curve $C$ for a point $x$ between $y^1$ and $y^2$ such that $\sigma(R(y^1, x)) = \alpha - \xi$. Remove $R(y^1, y^2)$ from $S$.
    – Step 3: Assume $x = (\varsigma_1, \varsigma_2)$. Solve $P^1_\varepsilon$ with $\varepsilon = \lfloor \varsigma_1 \rfloor$ to obtain an optimal solution $z^*$. Insert $z^*$ into Rep. Set $p := (\varepsilon + 1, z^*_2 - 1)$.
    – Step 4: If $\sigma(R(y^1, z^*)) > \alpha$, then insert $R(y^1, z^*)$ into $S$. If $\sigma(R(p, y^2)) > \alpha$, then insert $R(p, y^2)$ into $S$.
end

The trust coefficients ξ are chosen depending on the surrogate models that are used. Here, the models should be able to generate confidence intervals for each point $\varsigma_1$ along the horizontal axis (by using, for instance, kriging techniques). The trust coefficients ξ must satisfy the following property: if $\sigma(R(y^1, x)) = \alpha - \xi$, then it is natural to predict that $\sigma(R(y^1, z^*)) \le \alpha$. Therefore, in Step 4, the box $R(y^1, z^*)$ will not be inserted into $S$. Finding suitable measures of error associated with each surrogate model is one of the interesting topics for future research.

The trust coefficient surrogate-based algorithm can be used to speed up the a posteriori surrogate-based algorithms, particularly when the required accuracy α is relatively small. In this situation, we have a large set of nondominated solutions that allows us to construct accurate surrogate models to approximate the Pareto set (thus, trust coefficients are more reliable).

The box methods are based on the idea of iteratively splitting up a set of boxes into smaller ones. However, when the boxes are small, the a posteriori surrogate-based algorithms can be further improved. This can be seen by considering an example with area accuracy: assume that we need to work with a box of area 500, while the required area accuracy is 100. If we use the a posteriori surrogate-based algorithm, we are likely to obtain two boxes with area between 100 and 125, which are not good enough for the representation. Thus we need to perform two additional iterations (and the generated boxes will be unnecessarily small). However, if we use the trust coefficient surrogate-based algorithm, we only need to perform at most 2 iterations.
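Step 2 of Algorithm 4 asks for a point on the response curve at which a box measure attains a prescribed value. A minimal Python sketch using bisection is given below. It assumes that the surrogate is a decreasing function f of the first objective, so that the measure of $R(y^1, x)$ grows monotonically as $x$ moves away from $y^1$. The function names, the linear surrogate and the trust coefficient ξ = 10 in the example are illustrative assumptions only; the box of area 500 and the accuracy α = 100 are taken from the example above.

```python
import math
from typing import Callable


def area(width: float, height: float) -> float:
    return width * height


def perimeter(width: float, height: float) -> float:
    return 2 * (width + height)


def point_on_curve(f: Callable[[float], float], x_left: float, x_right: float,
                   target: float,
                   sigma: Callable[[float, float], float] = area,
                   tol: float = 1e-9) -> float:
    """Find t in [x_left, x_right] with sigma(R(y1, (t, f(t)))) = target,
    where y1 = (x_left, f(x_left)) and f is assumed to be decreasing."""
    top = f(x_left)

    def measure(t: float) -> float:
        return sigma(t - x_left, top - f(t))

    if measure(x_right) < target:
        return x_right  # the whole box is already below the target
    lo, hi = x_left, x_right
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical linear surrogate over a box of width 25 and height 20 (area 500).
surrogate = lambda t: 20.0 - 0.8 * t
alpha, xi = 100.0, 10.0          # xi is an arbitrary illustrative choice
t = point_on_curve(surrogate, 0.0, 25.0, alpha - xi)
eps = math.floor(t)              # epsilon used in Step 3
print(t, eps)
```

Passing `sigma=perimeter` gives the corresponding search for the perimeter measure. Replacing the target by the equal-split conditions of Algorithms 2 and 3 requires a different monotone function inside the bisection, but the search itself is the same.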
Note that in the algorithms presented above, the additional time for constructing surrogate models needs to be considered. However, when solving the lexicographic ε-constraint problems is difficult and time-consuming, this additional time may be negligible.

5 Application to the bicriteria multicommodity network flow problem

Due to the structure of the bicriteria multicommodity flow problem, we can find its representative systems quickly and efficiently. Integer programming (IP) subproblems are the main building blocks of the box methods. They occur very often, in particular in the solution of the lexicographic ε-constraint problems $P^1_\varepsilon$ or $P^2_\varepsilon$. To solve the problem $P^1_\varepsilon$, for example, we first have to solve

\[
(P^1_\varepsilon[1]) \qquad \text{Minimize } f_2(x) \quad \text{s.t. } f_1(x) \le \varepsilon \text{ and } x \in X.
\]

Assume that the minimal objective value of the problem $P^1_\varepsilon[1]$ is $p$. Then we continue by solving the problem

\[
(P^1_\varepsilon[2]) \qquad \text{Minimize } f_1(x) \quad \text{s.t. } f_1(x) \le \varepsilon,\ f_2(x) = p \text{ and } x \in X.
\]

This shows that solutions of the problem $P^1_\varepsilon$ can be found by successively solving the two integer programming problems $P^1_\varepsilon[1]$ and $P^1_\varepsilon[2]$. We can solve the problems $P^2_\varepsilon$ and the lexicographic problems $P^1$, $P^2$ in the same way.

The bicriteria multicommodity flow problem has the nice property that all the resulting IP subproblems have a similar block structure (as illustrated in Figure 2), where each of the independent blocks corresponds to a (single-commodity) network flow problem. IPs with this structure can be solved efficiently by the branch-and-price algorithm, a hybrid of the branch-and-bound and column-generation methods (see [2]).

Fig. 2 The block structure of IP subproblems

Next, we report on our first experiences with the implementation and performance of the different versions of the box algorithm presented in this paper, applied to the bicriteria multicommodity network flow problem. The algorithms have been implemented in Python, using CPLEX as solver.
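The two-stage scheme can be mimicked on a tiny, explicitly enumerated feasible set. The following Python sketch replaces each integer program by a plain minimum over a list of outcome vectors $(f_1(x), f_2(x))$. It does not use CPLEX or branch-and-price and is not the implementation used for the experiments; all names and data are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (f1(x), f2(x)) of a feasible solution x


def solve_p1_eps(outcomes: List[Point], eps: int) -> Optional[Point]:
    """Solve the lexicographic eps-constraint problem in two stages."""
    feasible = [y for y in outcomes if y[0] <= eps]
    if not feasible:
        return None  # no solution with f1 <= eps
    # Stage 1: minimize f2 subject to f1 <= eps.
    p = min(y[1] for y in feasible)
    # Stage 2: minimize f1 subject to f1 <= eps and f2 = p.
    f1_star = min(y[0] for y in feasible if y[1] == p)
    return (f1_star, p)


# Hypothetical outcome vectors; (5, 4) is weakly dominated by (4, 4).
outcomes = [(1, 9), (2, 7), (3, 8), (4, 4), (5, 4), (6, 5), (7, 2), (9, 1)]
z_star = solve_p1_eps(outcomes, eps=5)
print(z_star)
```

In this example the first stage alone could return either (4, 4) or (5, 4), since both attain the minimal value of $f_2$. The second stage is what guarantees that the returned point is nondominated.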
To compare and evaluate the quality of the algorithms, we first fix the required area accuracy (the maximal area of the resulting boxes) and then compute the representations produced by the algorithms based on area accuracy. Since each representative point is found by solving two integer programs of almost the same size and structure, the running time of an algorithm depends significantly on the number of representative points.

For the two surrogate-based algorithms, we use cubic spline interpolation. However, for generating a set of starting representative points, we use piecewise linear functions instead. Hence, in their first steps, the algorithms behave exactly like the a posteriori algorithm (with area accuracy).

In Table 1, we present implementation results on different network instances using the following five versions of box algorithms:

ALG 1: the a priori algorithm with area accuracy
ALG 2: the a posteriori algorithm with area accuracy
ALG 3: the a posteriori algorithm with perimeter accuracy
ALG 4: the surrogate-based algorithm with area accuracy
ALG 5: the surrogate-based algorithm with perimeter accuracy

For each algorithm, we use three evaluation criteria for the representation in addition to the number of representative points: the maximum area of the boxes, the maximum perimeter of the boxes and the running time. The instances are listed in order of increasing network size, and we choose the area accuracy in such a way that the maximum number of boxes (precomputation) in the a priori algorithm is at most 50, 100, 200 and 500.

Table 1 The implementation result

n     m     K   Boxes  Area required  Representation  Alg 1     Alg 2       Alg 3       Alg 4       Alg 5
50    462   5   49     840685         Card of REP     51        11          12          9           13
                                      Run time        19.43     4.58        3.42        2.33        3.91
                                      Max Area        165612.0  548244.0    619542.0    822432.0    546780.0
                                      Max Perimeter   4624.0    5972.0      3666.0      8328.0      3278.0
50    1187  10  100    5674347.0      Card of REP     102       12          19          13          19
                                      Run time        241.55    27.87       46.33       29.18       39.82
                                      Max Area        796770.0  5493305.0   4868469.0   3985650.0   1738143.0
                                      Max Perimeter   12074.0   24836.0     8852.0      21528.0     9516.0
50    1185  5   500    244426.0       Card of REP     496       26          37          27          37
                                      Run time        638.31    19.40       31.97       24.07       46.46
                                      Max Area        12948.0   284029.0    252668.0    283920.0    234895.0
                                      Max Perimeter   2182.0    5920.0      2104.0      7222.0      2106.0
1000  3000  10  500    16172191.0     Card of REP     N/A       27          38          26          43
                                      Run time        N/A       358.96      502.07      333.47      604.49
                                      Max Area        N/A       14667100.0  15122688.0  15886680.0  6018600.0
                                      Max Perimeter   N/A       43160.0     15808.0     53412.0     15648.0

As can be seen from the table, ALG 1 runs much slower than the four other algorithms. For example, in the instance where the network consists of 50 nodes and 1185 arcs with 5 commodities, the run time of ALG 1 is more than 10 times the run time of any other algorithm. This confirms our prediction from the previous sections with regard to the underestimation of the representation accuracy. We therefore conclude that this algorithm is inferior to the others and do not consider its performance any further.

The a posteriori algorithm with area accuracy (ALG 2) runs faster than the a posteriori algorithm with perimeter accuracy (ALG 3) on three of the four instances. The reason is that we have to convert the perimeter accuracy to the area accuracy, so the area accuracy is underestimated. Indeed, the maximum area of the boxes produced by the a posteriori algorithm with perimeter accuracy is better than that of the boxes generated by the a posteriori algorithm with area accuracy. In particular, the maximal length of the larger side is reduced significantly.

Regarding the number of representative points, we can see that the a posteriori surrogate-based algorithm with perimeter accuracy (ALG 5) is the one which requires the largest number of representative points (and also the largest running time).
The representations generated by the a posteriori algorithm (with area/perimeter accuracy) and the a posteriori surrogate-based algorithm (with area/perimeter accuracy) are quite similar. We think that the two surrogate-based algorithms did not perform better because of the structure of the multicommodity flow problem: the solutions of the problem are quite "dense" along the Pareto curve, so it is easy to find a nondominated point in a relatively small box.

6 Conclusion and future research

In this paper, several alternatives to the area-based box algorithm of Hamacher et al. [8] for finding representative systems of bicriteria optimization problems have been suggested. It has been argued that the distance approach, represented by the perimeter of the largest representing box, is better suited for practical applications. The resulting peripheral box algorithm has been analyzed with respect to its worst-case complexity. Using branch-and-price methods, it has been tested on instances of the bicriteria multicommodity network flow problem, and it has been shown to be competitive with the area-based algorithm with regard to running time.

The theoretical discussion of surrogate models, in which the Pareto curve is interpolated and this information is used to (theoretically) speed up the computation of a box representation with given accuracy, indicates the potential of this approach. The preliminary numerical tests show, however, that several questions need to be addressed in order to make this approach useful in practice. These questions include the following:

– Which surrogate models should be used?
– How to find starting nondominated points?
– How many points are enough to begin the surrogate-based algorithms?
– How to choose the trust coefficients ξ in the trust coefficient surrogate-based algorithm?
Answers to these interesting questions, which are on the borderline of numerical analysis and optimization, are the subject of current research.

Acknowledgements We would like to thank Tobias Kuhn for pointing out an error in the computation of the perimeter representation and Marc Goerigk for numerous helpful comments on first drafts of our paper.

References

1. Ravindra K. Ahuja, Thomas L. Magnanti & James B. Orlin: Network flows: Theory, algorithms, and applications, Prentice Hall, Inc., Englewood Cliffs, NJ, xvi+846 pp (1993).
2. Cynthia Barnhart, Ellis L. Johnson, George L. Nemhauser, Martin W. P. Savelsbergh & Pamela H. Vance: Branch-and-Price: Column Generation for Solving Huge Integer Programs, Operations Research, Vol 46, 316-329 (1998).
3. Altannar Chinchuluun & Panos M. Pardalos: A survey of recent developments in multiobjective optimization, Annals of Operations Research, Vol 154, 29-50 (2007).
4. Matthias Ehrgott: Multicriteria Optimization, Second edition, Springer-Verlag, Berlin, xiv+323 pp (2005).
5. D. R. Fulkerson: Flows in networks, in: Recent Advances in Mathematical Programming, R. L. Graves and P. Wolfe (eds.), McGraw-Hill, New York, 319-332 (1963).
6. Horst W. Hamacher & Karl-Heinz Kuefer: Inverse radiation therapy planning - a multicriteria optimization problem, Discrete Applied Mathematics, Vol 118, issue 1-2, 145-161 (2002).
7. Horst W. Hamacher, Christian R. Pedersen & Stefan Ruzika: Multiple Objective Minimum Cost Flow Problems: A Review, European Journal of Operational Research, Vol 176, 1404-1422 (2007).
8. Horst W. Hamacher, Christian R. Pedersen & Stefan Ruzika: Finding representative systems for discrete bicriterion optimization problems, Operations Research Letters, Vol 35, issue 3, 336-344 (2007).
9. Horst W. Hamacher, S. Ruzika & A. Tanatmis: PROSEL: A Decision Support System for Projekt Portfolio Selection Based on Multi-Objective Programming, in: Multiple Criteria Decision Aiding, C. Zopounidis, M. Doumpos (eds.), ISBN 978-1-61668-231-6 (2010).
10. T. C. Hu: Multicommodity network flows, Operations Research, Vol 11, 344-360 (1963).
11. Tobias Kuhn: personal communication (2013).
12. Siamak Moradi: The Bi-objective Multi-Commodity Minimum Cost Flow Problem, Proceedings of the 45th Annual Conference of the ORSNZ, November 2010.
13. A. Sedeno-Noda, C. Gonzalez-Martin & J. Gutierrez: The biobjective undirected two-commodity minimum cost flow problem, European Journal of Operational Research, Vol 164, 89-103 (2005).
14. R. E. Steuer: Multiple Criteria Optimization: Theory, Computation and Application, John Wiley, New York, 546 pp (1986).
15. Laurence A. Wolsey: Integer Programming, John Wiley & Sons, Inc., New York, xx+264 pp (1998).
16. Ky Vu: Change of Variable Methods and Representative Systems for Multiobjective Multicommodity Network Flows, Master Thesis (2012).