On Using the Hypervolume Indicator to Compare Pareto Fronts: Applications to Multi-Criteria Optimal Experimental Design

Yongtao Cao^a, Byran J. Smucker^b, Timothy J. Robinson^{c,1}

^a Department of Mathematics, Indiana University of Pennsylvania, Indiana, PA USA
^b Department of Statistics, Miami University, Oxford, OH USA
^c Department of Statistics, University of Wyoming, Laramie, WY USA

Abstract

The Pareto approach to optimal experimental design simultaneously considers multiple objectives by constructing a set of Pareto optimal designs while explicitly considering trade-offs between opposing criteria. Various algorithms have been proposed to populate Pareto fronts of designs, and evaluating and comparing these fronts (and, by extension, the algorithms that produce them) is crucial. In this paper, we first propose a framework for comparing algorithm-generated Pareto fronts based on a refined hypervolume indicator. We then theoretically address how the choice of the reference point affects comparisons of Pareto fronts, and demonstrate that our approach is Pareto compliant. Based on our theoretical investigation, we provide rules for choosing reference points when two-dimensional Pareto fronts are compared. Because theoretical results for three-dimensional fronts are difficult to obtain, we propose an empirical rule for the three-dimensional case by making an analogy to the rules for two dimensions. We also consider the use of our procedure in evaluating the progress of a front-constructing algorithm, and illustrate our work with two examples from the literature.

Keywords: Pareto front, multi-objective optimization, design of experiments, point exchange

^1 Corresponding author: Department of Statistics, University of Wyoming, Laramie, WY 82071; e-mail address: [email protected]

1. Introduction

Most experiments are conducted with multiple, competing objectives in mind. Therefore, designing under a single criterion may be inadequate.
For instance, Gilmour and Trinca (2012) show via examples that the traditional D-optimal designs allow no ability to estimate pure error; on the other hand, the best design for estimating pure error performs very poorly in terms of the D-criterion. In situations like this, the final choice of an experimental design should reflect an appropriate compromise across the criteria of interest. But choosing a design based upon the simultaneous optimization of multiple design criteria is often a difficult problem. Without a priori knowledge about the interdependencies between the criteria, the conventional compound design and constrained design approaches (e.g. Cook and Wong, 1994) for solving multiple-criteria optimal design problems could lead to relatively poor solutions (Coello Coello et al., 2007; Das and Dennis, 1997). Furthermore, the tradeoff between the objectives cannot be fully understood without simultaneously considering all criteria.

The Pareto front approach (Park, 2009; Lu et al., 2011; Sambo et al., 2013) not only accounts for the varying interest and importance of the various objectives simultaneously but also provides the most insight about the tradeoffs between the alternative choices, which in turn enables better decision making. This procedure involves finding a set of Pareto optimal designs and then using the experimenter’s evaluation of the existing trade-offs between the designs to ultimately choose between them. The set of criterion vectors associated with the Pareto optimal designs is known as the Pareto front. The shape of the Pareto front provides useful information about the amount of tradeoff between the different criteria and how much compromise is needed from some criteria to improve others.

Critical to this approach is the assumption that the Pareto front has been sufficiently populated. The true Pareto front, however, is rarely known and hence any algorithm used to generate a set of designs (e.g.
the exchange algorithms of Lu et al., 2011 and Sambo et al., 2014, or the multi-objective evolutionary algorithm of Park, 2009) merely results in an approximation of the true Pareto front. The quality of this approximation depends upon (1) the proximity of the points on the approximated front to the points on the true Pareto front; and (2) the diversity of the points on the approximated front, where more diversity is typically better. These characteristics are important in both offline settings, in which one compares multiple fronts produced by competing algorithms, and online settings, in which a front is evaluated as it evolves, with the rate of this evolution potentially used as a termination criterion. In this article, we are concerned with how Pareto fronts are evaluated and compared, rather than with algorithm development.

A popular measure of the quality of an approximated Pareto front is the front’s hypervolume (Zitzler and Thiele, 1998), which measures the size of the space enclosed by all solutions on the Pareto front and a user-defined reference point (for a formal definition see Section 2). This measure of Pareto front quality has gained increasing interest in recent years and has become the standard offline indicator to evaluate the performance of multi-objective optimization algorithms (Zitzler et al., 2008). It has also been used as an online indicator to guide the optimization process (Knowles et al., 2003; Zitzler and Künzli, 2004; Emmerich et al., 2005; Beume et al., 2007; Bader and Zitzler, 2011). Its success and popularity are due to the fact that it simultaneously accounts for proximity and diversity and is strictly Pareto compliant. This means that whenever one Pareto front approximation dominates another, the hypervolume of the former is greater than that of the latter. One significant drawback to this measure is that its magnitude is dependent upon an arbitrarily chosen reference point.
We will return to this point, in detail, in Sections 3 and 4.

Though the hypervolume measure is a well-established indicator of a front’s quality, it has only recently been introduced to the statistics literature. Lu and Anderson-Cook (2012) develop a hypervolume-like indicator within the context of optimal experimental design, but there are several issues with their proposed measure: (1) they use different reference points for different approximate Pareto fronts, which leads to unfair comparisons; (2) they choose the reference point in a way that does not permit a contribution to the hypervolume by all points; (3) Pareto compliance is not maintained because dominated points are used to compare an approximate front to the reference front; and (4) when used in an online setting, their proposed procedure can suggest decreases in Pareto front quality even as a front evolves in the context of a front construction algorithm. These issues are explained in more detail in Sections 2.2 and 4.1.

In this paper we address the aforementioned issues and propose an improved hypervolume-based measure for use in Pareto optimal design. In Section 2, we review the notion of Pareto optimal design, describe the computation of the hypervolume indicator, and explain in more detail the problems with the existing versions of the measure as well as our solutions to those problems. We also propose an interpretable scalar metric for describing how well a Pareto front is approximated and illustrate how the proposed indicator can be used in comparing competing Pareto fronts. In Section 3, we develop theoretical properties regarding the influence of the reference point on the calculation of the hypervolume indicator. Guidance is provided for choosing the reference point, in the presence of two criteria, based on our theoretical investigations, and we suggest a similar approach for the three-dimensional case.
In Section 4, we illustrate our proposed procedure in both an offline and online setting in the context of multi-objective optimal experimental design, consider the influence of the reference point in a three-dimensional example, and explore the reasons for unintuitive decreases in the online uses of the hypervolume measure. In Section 5 we provide a recap and discussion. Though we proceed with experimental design as the context, we note that our results and conclusions are more generally applicable to any multi-objective optimization setting in which the Pareto approach is employed and similar algorithms are used.

2. The hypervolume indicator and procedure for comparing discrete Pareto fronts

Without loss of generality, assume that the goal of a general multiple-criteria design optimization problem is to simultaneously maximize $C \ge 2$ design criteria. Let $f(\xi) = (f_1(\xi), f_2(\xi), \ldots, f_C(\xi))'$ denote the $C \times 1$ vector of criterion values for design $\xi$. Let $\Omega$ denote the search space of all feasible designs. A design $\xi_1 \in \Omega$ is said to dominate $\xi_2 \in \Omega$ if $f_j(\xi_1) \ge f_j(\xi_2)$ for all $j \in \{1, \ldots, C\}$ and there exists at least one $j \in \{1, \ldots, C\}$ for which $f_j(\xi_1) > f_j(\xi_2)$. In this case, the criterion vector $f(\xi_1)$ is said to dominate the criterion vector $f(\xi_2)$ and we write $\xi_1 \succ \xi_2$. If $f_j(\xi_1) \ge f_j(\xi_2)$ for all $1 \le j \le C$, we say $\xi_1$ weakly dominates $\xi_2$ and we write $\xi_1 \succeq \xi_2$. Henceforth, the criterion vector corresponding to a particular design is referred to as a point in the criterion space. A design is Pareto optimal if and only if no other design dominates it; its corresponding criterion vector is then a non-dominated vector. The set of Pareto optimal designs constitutes the Pareto optimal set and the corresponding criterion vectors are said to be on the Pareto front or frontier. A good overview of the Pareto-related concepts is available in Coello Coello et al. (2007).

Given the experimental design setting, we treat every Pareto front as finite and discrete.
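Because every front is treated as a finite set of criterion vectors, the dominance relations defined above can be checked by direct comparison. The following Python sketch is illustrative only: the names `dominates`, `weakly_dominates` and `non_dominated` are hypothetical helpers (they do not come from the paper), and all criteria are assumed to be maximized.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def weakly_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a weakly dominates b: a is at least as good as b on every criterion."""
    return all(x >= y for x, y in zip(a, b))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: weak dominance plus strict improvement on some criterion."""
    return weakly_dominates(a, b) and any(x > y for x, y in zip(a, b))


def non_dominated(points: Iterable[Sequence[float]]) -> List[Point]:
    """Keep the criterion vectors that no other vector dominates.

    Duplicates are dropped and the input order is preserved. The pairwise
    scan is quadratic, which is adequate for small discrete fronts.
    """
    unique = list(dict.fromkeys(tuple(p) for p in points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


if __name__ == "__main__":
    candidates = [(0, 1), (0.5, 0.5), (0.4, 0.4), (1, 0), (0.5, 0.5)]
    print(non_dominated(candidates))
```

In this toy example the vector (0.4, 0.4) is removed because (0.5, 0.5) dominates it, and the repeated vector is kept only once.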
We then assume that a given point on the Pareto front can be written as the ordered pair $(f_1, f_2) \in \mathbb{R}^2$ in two dimensions or the ordered triplet $(f_1, f_2, f_3) \in \mathbb{R}^3$ in three dimensions, where $f_1$, $f_2$ and $f_3$ correspond to design criterion values 1, 2 and 3 respectively. In the remainder of this paper we operate in two or three dimensions, and a Pareto front with cardinality $p$ (the number of points in a Pareto front) is written as

$$PF = \{(f_1(\xi_1), f_2(\xi_1)), \ldots, (f_1(\xi_p), f_2(\xi_p))\}$$

in two dimensions or

$$PF = \{(f_1(\xi_1), f_2(\xi_1), f_3(\xi_1)), \ldots, (f_1(\xi_p), f_2(\xi_p), f_3(\xi_p))\}$$

in three. Moreover, $\succ$ and $\succeq$ are also defined on criterion vectors; e.g. $PF_1 \succ PF_2$ if every point in $PF_2$ is dominated by at least one point in $PF_1$.

To simplify our theoretical investigation, in the remainder of this paper we will standardize the criterion vector to $[0, 1]^C$, rather than using the original scale. Specifically, for $PF \subset \mathbb{R}^C$, we standardize by scaling every criterion $f_j$ to $f_j^* \in [0, 1]$ with

$$f_j^* = \frac{f_j - f_j^{worst}}{f_j^{best} - f_j^{worst}}, \quad j = 1, 2, \ldots, C, \qquad (1)$$

where $f_j^{worst}$ corresponds to the minimum (or maximum, if the criterion is to be minimized) observed value of criterion $j$ within $PF$ and $f_j^{best}$ denotes the maximum (minimum) observed value of criterion $j$ within $PF$. Note that this scaling maps the original criterion space to $[0, 1]^C$.

2.1 The hypervolume indicator

As its name suggests, the hypervolume of a given Pareto front measures the volume of the criterion space that is weakly dominated by the points on the Pareto front. In order to define the hypervolume indicator, a bounded space has to be made by the Pareto front and a user-defined reference point.
The hypervolume indicator (Zitzler and Thiele, 1998) for $PF \subset \mathbb{R}^C$, denoted as $I_H(PF, r)$, is dependent upon the reference point $r = (r_1, r_2, \ldots, r_C)' \in \mathbb{R}^C$ and is formalized as

$$I_H(PF, r) = \lambda\Big(\bigcup_{s \in PF} space(s, r)\Big),$$

with $space(s, r) = \{v \in \mathbb{R}^C : s \succeq v \succeq r\}$ being the criterion space (rectangles in 2 dimensions and hyper-rectangles in dimensions > 2) containing all criterion vectors $v \in \mathbb{R}^C$ that are weakly dominated by the elements $s \in PF$ and themselves dominate $r = (r_1, r_2, \ldots, r_C)'$, where $r_i$ is the $i$th coordinate of the reference point, and $\lambda$ is the usual Lebesgue measure. Note that, strictly speaking, the reference point could have positive elements and the individual components of $r = (r_1, r_2, \ldots, r_C)'$ need not be the same. However, such a reference point would lose practical meaning, and so we restrict the elements to be the same and non-positive.

2.2 Issues when comparing discrete Pareto fronts

When several competing algorithms exist for populating the Pareto front, an important question is how to make comparisons among the algorithms. This is especially difficult when, as is typically the case, the shape and cardinality of the true Pareto front are unknown. In Section 2.3 we propose a framework for algorithm comparison that avoids the problems described in the Introduction. First, however, we present these problems in more detail.

The first two problems raised in the Introduction concern the reference point. Indeed, the primary and overarching issue is that the hypervolume of a Pareto front is dependent upon a user-defined reference point, and the subsequent ranking of competing Pareto fronts is dependent upon the location of this point. When the true Pareto front is unknown, the reference point is usually chosen as either the nadir point of the investigated Pareto front (Zitzler et al., 2007; Lu and Anderson-Cook, 2012) or a point that is slightly worse than the nadir (Zitzler et al., 2008; Sambo et al., 2013).
(Note that the nadir point is defined as the vector with the worst values of each criterion as its elements.) Auger et al. (2009, 2010, 2012), Brockhoff (2010), and Friedrich et al. (2013) consider reference point selection but the assumptions made are unrealistic for design optimization. Specifically, these works assume that (1) the user is only interested in a prespecified number of solutions and (2) one objective can be explicitly formulated as a continuous and differentiable function of the remaining objectives. While these works provide insight into the general problem by superimposing a pre-knowledge of the theoretical form of the true, unknown Pareto front, they are of little practical use to solve the design selection problem.

Another issue regarding Pareto front comparison is accounting for point dominance in the formulation of a reference front. Knowles et al. (2006) suggest that such a reference front can be obtained by combining all the competing fronts based upon Pareto dominance. However, when comparing the individual Pareto fronts to the reference front or to each other, neither the multi-objective optimization community (Coello Coello et al., 2007) nor the statistics community (Lu and Anderson-Cook, 2012; Sambo et al., 2013) has noticed that if points on the individual Pareto fronts that are dominated by the reference front are not removed, the comparison will not be Pareto compliant. In other words, if this adjustment is not made, the comparison procedure cannot reliably distinguish between Pareto fronts in terms of quality as measured by the hypervolume indicator. This issue is explored more concretely in Section 4.1.

The aforementioned problems deal fundamentally with making fair comparisons among two or more front approximations, which is of interest in offline settings in which the fronts have been fully constructed and their quality is awaiting adjudication. Evaluation and comparison in online settings (i.e.
situations where an algorithm is evaluated as it produces an increasingly dense approximation of the true front) is also of interest to indicate the extent to which the front’s quality is increasing. This allows the algorithm’s progress to be tracked and the algorithm to be terminated when substantial improvements cease. Conceptually, any measure reflecting front quality should increase in magnitude as the front evolves. However, recent empirical studies (Judt et al., 2012, 2013) along with our empirical results using the method of Lu and Anderson-Cook (2012) suggest that existing hypervolume measures can exhibit decreases in magnitude while the Pareto front quality itself is increasing. We will take this issue up further in Section 4.4.

2.3 Comparing Pareto Fronts Using the Contribution Rate Indicator

In this section, we propose a method for comparing competing Pareto fronts utilizing the contribution rate indicator. For a given multiple criteria optimal experimental design problem, we assume that the true Pareto front exists but is unknown, and that $n$ approximation Pareto fronts, $PF_1, PF_2, \ldots, PF_n$ with $n \ge 2$, are obtained by either executing $n$ competing algorithms or the same algorithm with different input settings. We propose comparing Pareto fronts via the following procedure:

(1) Combine all the Pareto fronts, $PF_1, PF_2, \ldots, PF_n$, according to the definition of Pareto domination into a single set of non-dominated criterion vectors, denoted by $PF_S$. Since the true Pareto front is generally unknown and $PF_S$ is the best available approximation, $PF_S$ is regarded as a surrogate for the true Pareto front. Obtain $PF_1', PF_2', \ldots, PF_n'$ such that $PF_i' = PF_i \cap PF_S$. That is, $PF_i'$ only contains those criterion vectors from $PF_i$ which are not collectively dominated by $PF_S$.
(2) Standardize $PF_S$ by employing equation (1). Note in this step each $PF_i'$ is also standardized.
(3) Calculate the hypervolume indicator for each of the approximation Pareto fronts, denoted by $I_H(PF_i', r)$, using the standardized $PF_i'$, along with the hypervolume for the standardized $PF_S$, denoted by $I_H(PF_S, r)$.
(4) Compute the contribution rate associated with each of the approximations as:

$$CR(PF_i', PF_S) = \frac{I_H(PF_i', r)}{I_H(PF_S, r)}. \qquad (2)$$

Note that values of $CR(PF_i', PF_S)$ close to 1 suggest that the $i$th approximation front is very close to the surrogate Pareto front. The contribution rate for Pareto front $PF_i$ can be thought of as the ‘approximating efficiency’ associated with $PF_i$. If $CR(PF_i', PF_S) > CR(PF_j', PF_S)$ with $1 \le i \ne j \le n$, $PF_i$ is a better approximation than $PF_j$, and the comparison of the contribution rates gives a sense of the magnitude of the difference. This contribution rate indicator, of course, is itself only an approximation since $PF_S$ is not necessarily the true Pareto front. We present an application of the proposed procedure in Section 4.1.

3. The choice of the reference point

As a starting point for the investigation of the reference point’s influence on the comparison of Pareto fronts in higher dimensions, we first consider how the choice of the reference point will change the hypervolume indicator when there are just two criteria.

3.1 Choice of reference point in two dimensions

The first result shows how the value of the hypervolume indicator for a 2-dimensional Pareto front changes as the reference point changes.

Lemma 1. For a 2-dimensional discrete Pareto front in $[0, 1]^2$ of size $p \ge 2$, with respect to a reference point $r = (r_1 \le 0, r_2 \le 0)$, the hypervolume indicator can be written as

$$I_H(PF, r) = \sum_{i=1}^{p} f_1(\xi_i) f_2(\xi_i) - \sum_{i=1}^{p-1} f_1(\xi_i) f_2(\xi_{i+1}) - r_1 f_2(\xi_1) - r_2 f_1(\xi_p) + r_1 r_2. \qquad (3)$$

Proof. Let $PF = \{(f_1(\xi_1), f_2(\xi_1)), \ldots, (f_1(\xi_p), f_2(\xi_p))\} \subset [0, 1]^2$ be a 2-dimensional Pareto front with $p \ge 2$ solutions. Without loss of generality, assume that the solutions are sorted by criterion 1 in ascending order, i.e., $f_1(\xi_i) < f_1(\xi_{i+1})$ for $i = 1, \ldots, p-1$.
Then for a given reference point $r = (r_1 \le 0, r_2 \le 0)$, the definition of the hypervolume indicator in Section 2.1 can be written as

$$I_H(PF, r) = (f_1(\xi_1) - r_1)(f_2(\xi_1) - r_2) + (f_1(\xi_2) - f_1(\xi_1))(f_2(\xi_2) - r_2) + (f_1(\xi_3) - f_1(\xi_2))(f_2(\xi_3) - r_2) + \cdots + (f_1(\xi_p) - f_1(\xi_{p-1}))(f_2(\xi_p) - r_2)$$

$$= \sum_{i=1}^{p} f_1(\xi_i) f_2(\xi_i) - \sum_{i=1}^{p-1} f_1(\xi_i) f_2(\xi_{i+1}) - r_1 f_2(\xi_1) - r_2 f_1(\xi_p) + r_1 r_2. \quad □$$

Two implications are clear from Lemma 1. First, since the solutions are ordered by criterion 1, $f_1(\xi_p)$ is the largest value of criterion 1 and $f_2(\xi_1)$ is the largest value of criterion 2. Therefore, when $r$ is changed, the change in the value of $I_H(PF, r)$ will depend upon the reference point and the largest value of each criterion. This is illustrated in Figure 1, where a five point Pareto front is displayed for a hypothetical two-criterion problem. If we define the boundary points as the optimal solutions with respect to a single criterion, then in Figure 1 the blue area represents the contribution of the left boundary point with respect to the reference point $r = (r_1, r_2)$, the yellow area represents the contribution of the right boundary point, and the red area the contribution of the reference point alone. The second implication is that when $r = (0, 0)$ the contributions of the two boundary points as well as the reference point itself will be eliminated entirely.

Figure 1. Illustration of the hypervolume for a hypothesized 2-dimensional Pareto front consisting of 5 points with nadir point (0,0) and reference point $r = (r_1, r_2)$. The blue area is the hypervolume contributed by the left extreme point; the red area is the hypervolume contributed by the reference point; the yellow area is the hypervolume contributed by the right extreme point; and the green area is the hypervolume contributed by the interior of the Pareto front.

Lemma 2.
The difference in hypervolume between two 2-dimensional Pareto fronts $PF_1 \subset [0, 1]^2$ and $PF_2 \subset [0, 1]^2$, with respect to a common reference point $r = (r_1 \le 0, r_2 \le 0)$, is given by

$$I_H(PF_1, r) - I_H(PF_2, r) = \Big[\sum_{i=1}^{p_1} f_1(\xi_i) f_2(\xi_i) - \sum_{i=1}^{p_1 - 1} f_1(\xi_i) f_2(\xi_{i+1})\Big] - \Big[\sum_{i=1}^{p_2} f_1(\xi_i') f_2(\xi_i') - \sum_{i=1}^{p_2 - 1} f_1(\xi_i') f_2(\xi_{i+1}')\Big] - r_1 \big(f_2(\xi_1) - f_2(\xi_1')\big) - r_2 \big(f_1(\xi_{p_1}) - f_1(\xi_{p_2}')\big), \qquad (4)$$

where $\xi_i$, $i = 1, \ldots, p_1$, are designs in $PF_1$ and $\xi_i'$, $i = 1, \ldots, p_2$, are designs in $PF_2$.

Proof. Equation (4) is established directly by writing $I_H(PF_1, r)$ and $I_H(PF_2, r)$ using Equation (3). □

Lemma 2 implies that when using the hypervolume indicator to compare two Pareto fronts, the difference will not only depend on the fronts but also upon the reference point. Now we turn to several results relevant to the online and offline use of hypervolume to compare Pareto fronts.

Theorem 1. If two 2-dimensional Pareto fronts $PF_1 \subset [0, 1]^2$ and $PF_2 \subset [0, 1]^2$ have the same maximum criterion 1 value and the same maximum criterion 2 value, then $I_H(PF_1, r) - I_H(PF_2, r)$ is independent of $r$.

Proof. As in Lemma 1 we assume that $f_1(\xi_p)$ is the maximum criterion 1 value and $f_2(\xi_1)$ is the maximum criterion 2 value. By Lemma 2, if $f_1(\xi_{p_1}) = f_1(\xi_{p_2}')$ and $f_2(\xi_1) = f_2(\xi_1')$, then by equation (4), $I_H(PF_1, r) - I_H(PF_2, r)$ does not depend on $r$. □

Theorem 1 implies that if two Pareto fronts have the same boundary points (e.g. both find the same optimal design for each of the criteria individually), the reference point plays no role in the comparison of the fronts.

Theorem 2. Partition $PF_1 \subset [0, 1]^2$ as $PF_1 = A_1 \cup B_1$ and $PF_2 \subset [0, 1]^2$ as $PF_2 = A_2 \cup B_2$. If $A_1 = A_2$ and $B_1 \succ B_2$ then $I_H(PF_1, r) > I_H(PF_2, r)$ holds for any reference point $r = (r_1 < 0, r_2 < 0)$.

Proof. The proof can be broken into three cases:

(i) $A_1 = A_2 = \emptyset$ and $B_1 \succ B_2$. In this case, $PF_1 \succ PF_2$ and by the definition of Pareto dominance, we have that the set bounded by $r$ and $PF_2$ is a proper subset of the set bounded by $r$ and $PF_1$.
Taking the Lebesgue measure (i.e., the hypervolume) of the two sets, we obtain $I_H(PF_1, r) > I_H(PF_2, r)$.

(ii) $A_1 = A_2 \ne \emptyset$ and $B_1 \succ B_2$, where neither $B_1$ nor $B_2$ are empty. In this case $PF_1$ and $PF_2$ share a certain number of solutions, but of the remaining solutions in each front, each solution in $PF_2$ is dominated by at least one solution in $PF_1$. The proof of this case is very similar to the arguments given in case (i), therefore we have $I_H(PF_1, r) > I_H(PF_2, r)$.

(iii) $A_1 = A_2 \ne \emptyset$ and $B_2 = \emptyset$, where $B_1$ is not empty. In this case, $PF_2 = A_1 = A_2$ is only a portion of $PF_1$. Since $I_H(B_2, r) = 0$, $I_H(B_1, r) > 0$ and $I_H(A_1, r) = I_H(A_2, r)$, we have that $I_H(PF_1, r) > I_H(PF_2, r)$. □

For case (iii), $r = (r_1 < 0, r_2 < 0)$ is required. Otherwise, if $r = (0, 0)$ and $B_1$ contains only points for which one criterion value is 0, such as $(0, f_2)$ and/or $(f_1, 0)$, then $I_H(B_1, r) = 0 = I_H(B_2, r)$, which yields $I_H(PF_1, r) = I_H(PF_2, r)$. This clearly violates Pareto compliance.

The three cases are illustrated graphically in Figure 2. With respect to a common reference point, we see that $PF_1$ has the maximum hypervolume indicator because it dominates $PF_2$, $PF_3$ and $PF_4$. We also have that $I_H(PF_2, r) > I_H(PF_3, r)$ because solutions 4, 5, and 6 in $PF_3$ are collectively dominated by the solutions 4 and 5 in $PF_2$, though the two fronts share the first three solutions. Furthermore, $I_H(PF_2, r) > I_H(PF_4, r)$ because $PF_2$ has two more solutions than $PF_4$ does.

Theorems 1 and 2 focus on situations in which the comparison of Pareto fronts by using the hypervolume indicator is independent of the choice of the reference point. There remains an exceedingly important practical question to be answered. How should we select the reference point when none of the conditions in Theorems 1 and 2 hold? The answer to this question has implications for the offline setting in which a researcher wishes to compare multiple Pareto fronts and also in the online setting where the hypervolume indicator may be used as a stopping criterion.
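Lemma 1 and Theorem 1 can be checked numerically on small fronts. The Python sketch below is a minimal illustration under stated assumptions: both fronts are hypothetical (they are not taken from the paper's figures), both criteria are maximized, each front is non-dominated, and the function names `hypervolume_strips` and `hypervolume_lemma1` are ours. The first function sums the vertical strips used in the proof of Lemma 1; the second evaluates the closed form of the lemma.

```python
from typing import List, Tuple

Point2 = Tuple[float, float]


def hypervolume_strips(front: List[Point2], r: Point2) -> float:
    """Sum vertical strips over a non-dominated 2-D front (maximization)."""
    total, previous_f1 = 0.0, r[0]
    for f1, f2 in sorted(front):  # ascending criterion 1
        total += (f1 - previous_f1) * (f2 - r[1])
        previous_f1 = f1
    return total


def hypervolume_lemma1(front: List[Point2], r: Point2) -> float:
    """Closed form: sum f1*f2 - sum f1(i)*f2(i+1) - r1*f2(1) - r2*f1(p) + r1*r2."""
    pts = sorted(front)
    r1, r2 = r
    own = sum(f1 * f2 for f1, f2 in pts)
    cross = sum(pts[i][0] * pts[i + 1][1] for i in range(len(pts) - 1))
    return own - cross - r1 * pts[0][1] - r2 * pts[-1][0] + r1 * r2


# Hypothetical fronts sharing the boundary points (0, 1) and (1, 0).
uniform = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
sparse = [(0.0, 1.0), (0.3, 0.4), (1.0, 0.0)]

if __name__ == "__main__":
    for r in [(-0.35, -0.35), (-1.41, -1.41)]:
        gap = hypervolume_lemma1(uniform, r) - hypervolume_lemma1(sparse, r)
        print(r, round(gap, 6))  # the gap is the same for both reference points
```

Because the two fronts share their boundary points, the reference-point terms cancel in the difference, which is the content of Theorem 1; moving one boundary point of either front would make the gap depend on $r$.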
To demonstrate the confusion that might result if the reference point is not chosen carefully, consider $PF_1$ and $PF_2$ shown in the upper half of Figure 3, where both of the fronts are made up of 5 points in $[0, 1]^2$. If the two fronts are compared with respect to a common reference point $r = (r, r)$ with $r \le 0$, we will have $I_H(PF_1, r) - I_H(PF_2, r) = -0.2r - 0.035$ according to Lemma 2. Thus, (a) $PF_1$ is superior to $PF_2$ in terms of hypervolume if $r < -0.175$, (b) $PF_1$ is as good as $PF_2$ if $r = -0.175$, and (c) $PF_1$ is inferior to $PF_2$ if $-0.175 < r \le 0$. Clearly, the ranks differ depending on the position of $r$. Consequently, consider Theorem 3 along with an illustrative example.

Figure 2. Three hypothetical Pareto fronts: $PF_1$ dominates $PF_2$, $PF_3$ and $PF_4$. $PF_2$ and $PF_3$ have common solutions 1, 2, and 3, but solution 4 in $PF_3$ is dominated by solution 4 in $PF_2$; solutions 5 and 6 in $PF_3$ are dominated by solution 5 in $PF_2$. $PF_2$ and $PF_4$ have common solutions 1, 2, and 3, but $PF_2$ has two more solutions, i.e., solutions 4 and 5.

Theorem 3. For a 2-dimensional maximization problem with the criterion space $[0, k]^2$, let $f(\xi_1)$ and $f(\xi_2)$ be the two extreme solutions, i.e., $f(\xi_1) = (0, k)$ and $f(\xi_2) = (k, 0)$. Let there exist two other solutions, $f(\xi_1')$ and $f(\xi_2')$, such that $0 < f_1(\xi_1'), f_1(\xi_2'), f_2(\xi_1'), f_2(\xi_2') < k$. Then $|r|$ must be greater than $k$ for $I_H(\{f(\xi_1), f(\xi_2)\}, r = (r, r)) > I_H(\{f(\xi_1'), f(\xi_2')\}, r = (r, r))$ to hold.

Proof. By Lemma 1,

$$I_H(\{f(\xi_1), f(\xi_2)\}, r = (r, r)) = f_1(\xi_1) f_2(\xi_1) + f_1(\xi_2) f_2(\xi_2) - f_1(\xi_1) f_2(\xi_2) - r f_2(\xi_1) - r f_1(\xi_2) + r^2 = -2kr + r^2,$$

since $f_2(\xi_1) = f_1(\xi_2) = k$ and $f_1(\xi_1) = f_2(\xi_2) = 0$. Similarly,

$$I_H(\{f(\xi_1'), f(\xi_2')\}, r = (r, r)) = f_1(\xi_1') f_2(\xi_1') + f_1(\xi_2') f_2(\xi_2') - f_1(\xi_1') f_2(\xi_2') - r f_2(\xi_1') - r f_1(\xi_2') + r^2.$$

Therefore, $I_H(\{f(\xi_1), f(\xi_2)\}, r = (r, r)) > I_H(\{f(\xi_1'), f(\xi_2')\}, r = (r, r))$ if and only if

$$-2kr > f_1(\xi_1') f_2(\xi_1') + f_1(\xi_2') f_2(\xi_2') - f_1(\xi_1') f_2(\xi_2') - r f_2(\xi_1') - r f_1(\xi_2'). \qquad (6)$$

Note that $0 < f_1(\xi_1'), f_1(\xi_2'), f_2(\xi_1'), f_2(\xi_2') < k$ implies $0 < f_1(\xi_1') f_2(\xi_1'),\ f_1(\xi_2') f_2(\xi_2'),\ f_1(\xi_1') f_2(\xi_2') < k^2$, so that

$$2k^2 - xr > f_1(\xi_1') f_2(\xi_1') + f_1(\xi_2') f_2(\xi_2') - f_1(\xi_1') f_2(\xi_2') - r f_2(\xi_1') - r f_1(\xi_2'),$$

where $0 < x = f_2(\xi_1') + f_1(\xi_2') < 2k$. Then, (6) holds if $-2kr > 2k^2 - xr$,
which implies $-r > \frac{2k^2}{2k - x} > k$, i.e., $|r| > k$. □

This result, though a simplification that considers only fronts of size two, is suggestive regarding the choice of the reference point. We outline some guidelines, then illustrate with an example. Intuitively, we prefer a set of representative solutions which are as close as possible to the true Pareto front while being uniformly distributed along the whole front. However, if a judgment between several fronts is desired such that the two extreme solutions $(0, 1)$ and $(1, 0)$ in $[0, 1]^2$ are most emphasized, then Theorem 3 suggests that one should choose $|r| > k = 1$.

On the other hand, if points are desired to be uniformly distributed along the front, consider the following. Suppose the true Pareto front in $[0, 1]^2$ is the line connecting the two extremes (certainly a simplifying assumption) or, alternatively, assume an arbitrary true Pareto front and consider the projection of this front onto the line, which we call the Projected Uniform Pareto front (PUPf). Consider a surrogate for the true front that is uniformly distributed along the PUPf, and further consider any pair of adjacent points in this approximation along with the isosceles right triangle formed by taking the line connecting the pair of points as the hypotenuse. Then for the coordinate system implied by these two points (with the origin at the intersection opposite the hypotenuse), $k = 1/(|PF_S| - 1)$ and the distance between the points is $\sqrt{2}/(|PF_S| - 1)$, where $|PF_S|$ denotes the number of elements in the surrogate front. For these two points in isolation, then, Theorem 3 suggests $|r| > k = 1/(|PF_S| - 1)$. Since this applies to any two adjacent points, we broaden this suggestion as a tentative guideline for the reference point in the original coordinate system. Therefore, if extreme points are favored, we suggest $|r| > 1$; if, as is more likely, fronts with uniformly distributed points are preferred, the suggestion is that $|r| > 1/(|PF_S| - 1)$.
Since the length of the line connecting the extremes is $\sqrt{2}$ and the distance between each adjacent point on the PUPf is $\sqrt{2}/(|PF_S| - 1)$, we recommend choosing $|r_1| = |r_2|$ to be somewhat larger than $1/(|PF_S| - 1)$, namely $\sqrt{2}/(|PF_S| - 1)$, where $|PF_S|$ is the number of designs in the surrogate Pareto front. This strategy provides a way for the decision maker to explicitly incorporate preferences regarding the desired distributions of points in the Pareto front and also account for the number of points in the Pareto front.

As an illustration of our proposed strategy, we compare the four Pareto fronts in Figure 3. Since there are 5 points in each, we choose $r = -\sqrt{2}/(5 - 1) \approx -0.35$ if we wish to emphasize uniformly distributed points, but $r = -\sqrt{2} \approx -1.41$ if we prefer the two extreme solutions. The hypervolume indicators for the four fronts with respect to the two reference points are presented in Table 1. Since $PF_1$ includes the two extreme solutions and is uniformly distributed along the PUPf, it has the highest hypervolume regardless of the reference point. $PF_2$ is better than $PF_3$ when $r = -0.35$, because this reference point favors uniformly distributed points. When $r = -1.41$, $PF_3$ is better since this reference point prefers the extreme points. However, even though $PF_4$ has one extreme point, it is inferior to $PF_2$ when $r = -1.41$ because its other end, $(0.2, 0.8)$, is quite far from $(1, 0)$. In contrast, the two ends of $PF_2$ are very close to the extreme points.

Figure 3. Comparing four 2-dimensional Pareto fronts to demonstrate how the reference point can affect the ranks of the fronts.

Table 1. The hypervolume indicators for the four 2-dimensional Pareto fronts in Figure 3 with respect to two different reference points.
Pareto front    $r = -0.35$    $r = -1.41$
PF1             1.1975         5.1831
PF2             1.1625         4.9361
PF3             0.9950         4.9806
PF4             0.7175         3.8551

3.2 Three dimensions and the choice of reference point

A generalization of Theorems 1 and 3 from two dimensions to three is not straightforward, since a point’s hypervolume contribution no longer possesses a simple geometric shape, as opposed to the two-criterion case where it is always rectangular. As such, the hypervolume indicator for a higher dimensional Pareto front is more intricately dependent upon the choice of the reference point. In order to investigate how to choose a reference point, we consider a straightforward method for computing the hypervolume of a Pareto front in $d > 2$ dimensions: the Hypervolume by Slicing Objectives (HSO) algorithm (While et al. 2006; Fonseca et al. 2006). This procedure assumes a non-dominated set, and consists of the following steps:

(1) Sort the points in decreasing order of the coordinate values of a chosen dividing criterion.
(2) Sweep the set by a $(d-1)$-dimensional hyperplane along the dividing criterion, defining $d$-dimensional slices between consecutive points.
(3) Calculate the hypervolume of each single slice by multiplying its height (measured along the dividing criterion) by the hypervolume of its next lower dimensional base.
(4) Steps 1-3 are recursively repeated until in Step 3 the hypervolume can be calculated in two dimensions.
(5) Add up the hypervolumes of each individual slice to produce the total hypervolume.

It may be helpful to visualize the above procedure in the 3-dimensional case. As shown in Figure 4, the seven points in a hypothetical 3-dimensional Pareto front are sorted in descending order along the z-coordinate, i.e., criterion 3. Since these seven points can be classified into four distinct groups, the Pareto front is divided into four 3-dimensional slices.
It can be seen that points 5, 6 and 7 form the bottom slice, point 1 forms the top slice, and point 4 and points 2 and 3 form the middle two slices. The base of each slice is a 2-dimensional Pareto front and the height of each slice is the distance between the two consecutive z-values. The hypervolume of a slice is then the hypervolume of the base 2-dimensional Pareto front multiplied by the height of the slice. Finally, as illustrated in Figure 5, the hypervolume of a 3-dimensional Pareto front is the sum of the hypervolumes of all the slices.

The relationship between the reference point and the hypervolume indicator is much more complicated in three or more dimensions than in two dimensions. For instance, in Supplementary Material A, a result is given which shows the relationship between the reference point, the points on the front, and the hypervolume for the situation given in Figure 4. In this case, the hypervolume depends on the reference point as well as the two boundary points (those points that are on the edge of the Pareto front) of the 2-dimensional Pareto fronts defined by each of the slices; i.e. the ones which have criterion values $f_1^{ip_i}$ and $f_2^{i1}$ for $i = 1, \ldots, s$, where $s$ is the number of slices and assuming that criterion 3 is the dividing criterion. For an example in which $s = 4$, see Figure 4.

Figure 4. A hypothetical 3-dimensional Pareto front consisting of 7 points. The coordinates x, y, z correspond to criterion 1, criterion 2, and criterion 3, respectively.

Figure 5. An illustration of the HSO algorithm. The hypervolume of a 3-dimensional Pareto front breaks into four 3-dimensional slices. The base of each slice is a 2-dimensional Pareto front and the height of each slice is obtained along the third criterion.

Establishing theoretical results regarding the reference point in three dimensions is difficult; to date, only one paper in the literature has attempted it (Auger et al., 2010).
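As a rough illustration of the slicing procedure of Section 3.2, the following Python sketch computes the hypervolume of a 3-dimensional front under the conventions used in this paper: all criteria are maximized and the reference point is dominated by every point of the front. The function names `hypervolume_2d` and `hypervolume_3d_hso` and the three-point toy front are ours, introduced only for illustration. This is a plain, unoptimized rendering of the slicing idea and should not be read as the implementation of While et al. (2006) or Fonseca et al. (2006).

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (maximization).

    Assumes every point is at least as large as `ref` in both coordinates.
    """
    pts = sorted(points, key=lambda p: p[0], reverse=True)  # decreasing criterion 1
    area = 0.0
    best_y = ref[1]
    for k, (x, y) in enumerate(pts):
        best_y = max(best_y, y)  # tallest point seen so far covers this strip
        next_x = pts[k + 1][0] if k + 1 < len(pts) else ref[0]
        area += (x - next_x) * (best_y - ref[1])
    return area


def hypervolume_3d_hso(points, ref):
    """Hypervolume of a 3-D front by slicing along criterion 3."""
    levels = sorted({p[2] for p in points}, reverse=True)  # distinct z-values, top down
    total = 0.0
    for k, z in enumerate(levels):
        # Height of the slice: distance to the next lower level, or to the
        # reference point for the bottom slice.
        z_next = levels[k + 1] if k + 1 < len(levels) else ref[2]
        # Base of the slice: 2-D front formed by all points at or above this level.
        base = [(p[0], p[1]) for p in points if p[2] >= z]
        total += (z - z_next) * hypervolume_2d(base, ref[:2])
    return total


# Toy front (hypothetical): the three individually optimal points on [0, 1]^3.
toy_front = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(hypervolume_3d_hso(toy_front, (-1.0, -1.0, -1.0)))
```

For the toy front the bottom slice has height 1 and a base area of 3, and the top slice has height 1 and a base area of 1, so the two slices add to a hypervolume of 4, which agrees with an inclusion-exclusion count of the three boxes.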
Perhaps even harder is to develop a theory that would clearly guide experimenters in making this important selection. Instead, we suggest some strategies analogous to the two-dimensional case. If uniformly distributed solutions along the whole Pareto front are preferred, then we would expect each slice to have the same height. If there are $s$ different levels of the dividing criterion, there are $s - 1$ such slices, so that each has a height of $1/(s-1)$. If we wish to give the bottom layer as much weight as the others, we would set $r_3 = -1/(s-1)$, which would give this slice the same height as the others. More generally, if we assume the typical scenario, that the three criteria are equally important and a uniform distribution of solutions is preferred, we suggest $r_1 = r_2 = r_3 = -1/(s-1)$. On the other hand, if the individually optimal designs are to be emphasized, choose $r_1 = r_2 = r_3 = -1$. We note again that these guidelines are not theoretically supported, but are instead suggestions extrapolated from the two-dimensional case. In Section 4.3 we will provide some results to illustrate the above choices.

4. Applications with numerical evaluations

In this section, we apply the comparison procedure proposed in Section 2 and the theoretical results and empirical rules developed in Section 3 to two published examples. First, we review the Pareto front comparison procedure given by Lu and Anderson-Cook (2012) in Section 4.1 and compare it with our approach in the case of a simple hypothetical 2-dimensional Pareto front. Then, we apply our method to a 3-criterion design problem which is solved using the Pareto Aggregate Point Exchange (PAPE) algorithm of Lu et al. (2011). This problem is used to demonstrate our procedure in both an offline and an online setting because of its well-developed baseline Pareto front.
4.1 Comparing 2-dimensional Pareto fronts

As mentioned earlier, Lu and Anderson-Cook (2012) (henceforth referred to as LA) also proposed the use of a hypervolume-like indicator for comparing Pareto fronts. LA's approach is similar to what we have proposed in Section 2.3, though there are two important differences: (1) LA does not compute $PF_i^* = PF_i \cap PF_S$ as in our first step; (2) in our third step LA computes what they call the "Hypervolume Under the Pareto Front" (HVUPF) for $PF_S, PF_1, PF_2, \ldots, PF_n$, which always uses the front's nadir point as the reference point. To illustrate these differences, we present an example taken from LA. Consider the seven-solution Pareto front in Figure 6, scaled as usual to $[0, 1]$, where the goal is to maximize criteria 1 and 2. Since we are comparing two fronts, this exemplifies the offline usage of the hypervolume procedure. The solutions in $PF_1$ are denoted by the six triangles and the solutions in $PF_2$ are denoted by the five diamonds. The combined front, $PF_S$, is composed of seven solutions, each represented by a red dot.

Figure 6. A Pareto front involving two criteria where the objective is to simultaneously maximize both criteria. $PF_S$ has 7 solutions (red dots); $PF_1$ has 6 solutions (open triangles); $PF_1^*$ has 4 solutions (triangles with red dots); and $PF_2 = PF_2^*$ has 5 solutions (diamonds with red dots).

The computation of $HVUPF_{PF_S}$ proceeds by taking the sum of the five rectangle areas R1-R5 as illustrated in Figure 7(a). The two boundary points in this Pareto front are $(0, 1)$ and $(1, 0)$, so that the nadir point is located at $(0, 0)$. Since HVUPF uses the nadir as the reference point, the two boundary points contribute nothing to the calculation.
More explicitly, using criterion 1 as the dividing criterion the computation proceeds as follows:

Rectangle 1 (R1), Criterion 1 $\in [0, 0.4]$: $0.9 \times (0.4 - 0) = 0.36$,
Rectangle 2 (R2), Criterion 1 $\in [0.4, 0.5]$: $0.75 \times (0.5 - 0.4) = 0.075$,
Rectangle 3 (R3), Criterion 1 $\in [0.5, 0.7]$: $0.7 \times (0.7 - 0.5) = 0.14$,
Rectangle 4 (R4), Criterion 1 $\in [0.7, 0.8]$: $0.6 \times (0.8 - 0.7) = 0.06$,
Rectangle 5 (R5), Criterion 1 $\in [0.8, 0.9]$: $0.4 \times (0.9 - 0.8) = 0.04$.

Therefore, $HVUPF_{PF_S} = 0.675$.

Figure 7. The areas associated with (a) $HVUPF_{PF_S}$ and (b) $I_H(PF_S, r.CSR)$. The reference point for the LA approach is the nadir point $(0, 0)$ and is denoted by r.LA. The reference point for computing $I_H(PF_S, r)$ is $(-0.24, -0.24)$ and is denoted by r.CSR.

The computations of $HVUPF_{PF_1}$ and $HVUPF_{PF_2}$ proceed in a similar fashion, though it is important to note that the reference points for these two fronts are their individual nadir points. For $HVUPF_{PF_1}$ we have:

$HVUPF_{PF_1} = (0.5 - 0.1) \times 0.75 + (0.7 - 0.5) \times 0.7 + (0.75 - 0.7) \times 0.5 + (0.9 - 0.75) \times 0.4 = 0.525.$

Similarly, $HVUPF_{PF_2} = 0.31$. The areas associated with $HVUPF_{PF_1}$ and $HVUPF_{PF_2}$ are shown in Figure 8(a) and Figure 8(c), respectively.

We outlined four drawbacks of the LA procedure in the Introduction, and expanded upon them in Section 2.2. The first three are apparent in an offline setting such as this, and we illustrate them here. The first problem is that different reference points are used for different surrogate Pareto fronts. Consider $HVUPF_{PF_1}$ and $HVUPF_{PF_2}$ pictured in Figure 8(a) and Figure 8(c). Note that the reference point for $HVUPF_{PF_1}$ is $(0.1, 0)$, the nadir point of $PF_1$, while the reference point for $HVUPF_{PF_2}$ is $(0, 0.4)$, the nadir point of $PF_2$. Calculating the hypervolume and making comparisons in this case cannot be expected to give a fair and meaningful comparison between the two Pareto fronts.
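The dependence of HVUPF on each front's own nadir point can be checked numerically. The sketch below is a minimal illustration, assuming maximization of both criteria; the helper names `nadir` and `area_above` are ours. The coordinates of the combined front and of the first front are read off the rectangle calculations above, together with the boundary points $(0.1, 0.8)$ and $(1, 0)$ mentioned in connection with Figure 8(a), so they should be treated as a reconstruction rather than as the original data.

```python
def nadir(front):
    """Componentwise minimum of a 2-D front: the worst observed value of each criterion."""
    return (min(p[0] for p in front), min(p[1] for p in front))


def area_above(front, ref):
    """Area dominated by `front` and bounded below by `ref` (maximization)."""
    pts = sorted(front, key=lambda p: p[0], reverse=True)  # decreasing criterion 1
    area, best_y = 0.0, ref[1]
    for k, (x, y) in enumerate(pts):
        best_y = max(best_y, y)
        next_x = pts[k + 1][0] if k + 1 < len(pts) else ref[0]
        area += (x - next_x) * (best_y - ref[1])
    return area


# Reconstructed coordinates (an assumption, see the lead-in).
PF_S = [(0.0, 1.0), (0.4, 0.9), (0.5, 0.75), (0.7, 0.7),
        (0.8, 0.6), (0.9, 0.4), (1.0, 0.0)]
PF_1 = [(0.1, 0.8), (0.5, 0.75), (0.7, 0.7), (0.75, 0.5),
        (0.9, 0.4), (1.0, 0.0)]

# Each front is measured against its own nadir, as in the LA procedure.
for name, front in (("PF_S", PF_S), ("PF_1", PF_1)):
    ref = nadir(front)
    print(name, "nadir:", ref, "HVUPF:", round(area_above(front, ref), 4))
```

With these coordinates the two nadir points are $(0, 0)$ and $(0.1, 0)$, and the rounded areas agree with the values 0.675 and 0.525 reported above. The two fronts are therefore measured from different origins, which is the first drawback discussed in the text.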
The second issue also relates to the reference point: choosing the nadir as the reference point does not permit a contribution from the extreme points in the front. Consider the computation of $HVUPF_{PF_1}$ displayed in Figure 8(a). Since the nadir point is $(0.1, 0)$, the points $(0.1, 0.8)$ and $(1, 0)$ are excluded in the computation of $HVUPF_{PF_1}$. The third issue arises because dominated points are used when a surrogate front is compared to the reference front. Consider the first rectangle R1 in Figure 8(a). LA limits the width of this rectangle by the dominated point $(0.1, 0.8)$, despite the fact that it is collectively dominated by $PF_S$, and thus not a Pareto optimal solution.

We now illustrate the computation of $I_H(PF^*)$ and how it rectifies the aforementioned problems. Recall the seven-solution Pareto front in Figure 6 where the goal is to maximize criteria 1 and 2. Also recall that there are two competing Pareto fronts, $PF_1$ and $PF_2$, and note that while $PF_1$ contains six solutions, only four of these are within $PF_S$, and $PF_1^* = PF_1 \cap PF_S$ denotes these 4 solutions (triangles with red circles). For $PF_2$, all five solutions exist within $PF_S$ and hence $PF_2^* = PF_2 \cap PF_S = PF_2$ (diamonds with red circles). Also note that the reference point used for the computation of $I_H(PF_1^*)$ and $I_H(PF_2^*)$ is r.CSR $= (-0.24, -0.24)$, the same as for $I_H(PF_S, r.CSR)$. This is chosen because $PF_S$ has 7 points and we prefer to prioritize uniformly distributed points along the front. The computation of $I_H(PF_S, r.CSR)$ proceeds by taking the sum of the 7 rectangle areas (height×width) using criterion 1 as the dividing criterion and r.CSR $= (-0.24, -0.24)$ as the reference point.
Rectangle 1 (R1), Criterion 1 $\in [-0.24, 0]$: $(1 - (-0.24)) \times (0 - (-0.24)) = 0.2976$,
Rectangle 2 (R2), Criterion 1 $\in [0, 0.4]$: $(0.9 - (-0.24)) \times (0.4 - 0) = 0.456$,
Rectangle 3 (R3), Criterion 1 $\in [0.4, 0.5]$: $(0.75 - (-0.24)) \times (0.5 - 0.4) = 0.099$,
Rectangle 4 (R4), Criterion 1 $\in [0.5, 0.7]$: $(0.7 - (-0.24)) \times (0.7 - 0.5) = 0.188$,
Rectangle 5 (R5), Criterion 1 $\in [0.7, 0.8]$: $(0.6 - (-0.24)) \times (0.8 - 0.7) = 0.084$,
Rectangle 6 (R6), Criterion 1 $\in [0.8, 0.9]$: $(0.4 - (-0.24)) \times (0.9 - 0.8) = 0.064$,
Rectangle 7 (R7), Criterion 1 $\in [0.9, 1]$: $(0 - (-0.24)) \times (1 - 0.9) = 0.024$.

Figure 8. (a) $HVUPF_{PF_1}$; (b) $I_H(PF_1^*, r.CSR)$; (c) $HVUPF_{PF_2}$; and (d) $I_H(PF_2^*, r.CSR)$ areas for the two competing Pareto fronts $PF_1$ and $PF_2$.

We then have $I_H(PF_S, r.CSR) = 1.2126$, with this area pictured in Figure 7(b). In comparing Figure 7(a) and Figure 7(b) and observing the calculations above, note that r.CSR allows the extreme solutions to contribute area to the computation of $I_H(PF_S, r.CSR)$. The computations of $I_H(PF_1^*, r.CSR) = 1.0086$ and $I_H(PF_2^*, r.CSR) = 1.1836$ proceed in a similar fashion. Note that the areas for the rectangles in Figure 8(b) and Figure 8(d) contain all points that are dominated by the associated point in the front. For instance, in Figure 8(d) the rectangle determined by the point $(0.9, 0.4)$ contains all points that are dominated by this point, whereas the HVUPF area associated with this point, shown in Figure 8(c), does not include any of the dominated points below the horizontal line running through $(0.9, 0.4)$.

Using the LA procedure, one would conclude that $PF_1$ is better than $PF_2$ since $HVUPF_{PF_1} = 0.525 > HVUPF_{PF_2} = 0.31$. This is unintuitive and misleading, because $PF_2$ contributes five well distributed solutions to the surrogate of the true Pareto front whereas $PF_1$ only contributes four well distributed solutions.
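The common-reference computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both criteria are maximized, the helper names `dominates`, `pareto_filter` and `hypervolume_2d` are ours, the coordinates of the first front are reconstructed from the HVUPF calculation, and the coordinates of the second front are not listed in the text at all. The five points used for it below are an assumption, chosen from the combined front so that they reproduce the reported nadir $(0, 0.4)$ and the reported hypervolume 1.1836.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_filter(points):
    """Non-dominated subset of a collection of criterion vectors (maximization)."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))


def hypervolume_2d(front, ref):
    """Sum of the rectangle areas between `front` and the reference point `ref`."""
    pts = sorted(front, key=lambda p: p[0], reverse=True)  # decreasing criterion 1
    area, best_y = 0.0, ref[1]
    for k, (x, y) in enumerate(pts):
        best_y = max(best_y, y)
        next_x = pts[k + 1][0] if k + 1 < len(pts) else ref[0]
        area += (x - next_x) * (best_y - ref[1])
    return area


# Reconstructed / assumed coordinates (see the lead-in).
PF_1 = [(0.1, 0.8), (0.5, 0.75), (0.7, 0.7), (0.75, 0.5), (0.9, 0.4), (1.0, 0.0)]
PF_2 = [(0.0, 1.0), (0.4, 0.9), (0.7, 0.7), (0.8, 0.6), (0.9, 0.4)]
R_CSR = (-0.24, -0.24)  # common reference point for every front

# Combine the fronts, keep the non-dominated points, then intersect each
# competing front with the combined front.
PF_S = pareto_filter(PF_1 + PF_2)
PF_1_star = [p for p in PF_1 if p in PF_S]
PF_2_star = [p for p in PF_2 if p in PF_S]

print(len(PF_S), len(PF_1_star), len(PF_2_star))
print(round(hypervolume_2d(PF_S, R_CSR), 4))
print(round(hypervolume_2d(PF_2_star, R_CSR), 4))
```

Under these assumptions the combined front has 7 solutions, of which the first front contributes 4 and the second 5, and the rounded hypervolumes of the combined front and of the second front agree with the values 1.2126 and 1.1836 given above. Because the same reference point is used throughout, the areas are directly comparable.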
Using our procedure, we observe $I_H(PF_2^*, r.CSR) = 1.1836 > I_H(PF_1^*, r.CSR) = 1.0086$. We are also able to calculate the contribution rate of the first Pareto front as

$CR(PF_1, PF_S) = \dfrac{I_H(PF_1^*, r.CSR)}{I_H(PF_S, r.CSR)} = \dfrac{1.0086}{1.2126} = 0.8318,$

and the contribution rate of the second as 0.9761. Specifically, then, Pareto front 2 is about 15% more efficient in approximating the true Pareto front than Pareto front 1.

4.2 Comparing 3-dimensional Pareto fronts

A 3-criterion design problem is presented in Lu et al. (2011) where the experimenter wishes to obtain a 14-run screening design for 5 factors, $X_1, X_2, \ldots, X_5$, each at 2 levels ($-1$ and $1$). The user-defined model is $Y = \mathbf{X}_1 \beta_1 + \varepsilon$, where $\varepsilon$ has mean vector $0$ and variance-covariance $\sigma^2 I$, and $\mathbf{X}_1$ contains the intercept, all main effects and the particular two-factor interactions $X_1X_2$, $X_1X_3$, $X_2X_4$, and $X_3X_5$. Though the experimenter wishes to have an efficient design to estimate the parameters in $\beta_1$, there is also the desire to protect against the possibility that the true model is $Y = \mathbf{X}_1 \beta_1 + \mathbf{X}_2 \beta_2 + \varepsilon$, where $\mathbf{X}_2$ is the $14 \times 6$ matrix containing the remaining six two-factor interactions. Consequently, the experimenter wishes to find a design that is efficient in terms of the D-, $tr(AA')$-, and $tr(R'R)$-criteria. Note that the D-criterion focuses upon the precision of the model coefficient estimates, the $tr(AA')$-criterion seeks to minimize the effect of model mis-specification upon the coefficient estimates, and the $tr(R'R)$-criterion seeks to minimize the effect of model mis-specification upon the error variance estimate. For more details on these criteria, see Lu et al. (2011).

This example serves as a nice benchmark for comparing algorithms in three dimensions because previous work has established that the true Pareto front contains 351 designs, and this Pareto front is denoted as $PF_{351}$. In what follows, we use the Pareto Aggregate Point Exchange (PAPE) algorithm of Lu et al.
(2011) to construct several Pareto front approximations, and make online comparisons using the methods developed in this paper. Note that we discuss the PAPE algorithm because we are comparing Pareto fronts for multiple-criteria optimal experiment design problems, though the methods proposed herein could be used to compare Pareto fronts in other domains, produced by other methods, as well.

The PAPE algorithm (Lu et al., 2011) is an elaboration of a classic point exchange algorithm (e.g. Fedorov, 1972; Cook and Nachtsheim, 1980) that builds up a Pareto front by considering exchanges between current design points and candidate points. It requires the number of random starts to be specified, so we specify 10000 random starts and observe the Pareto fronts at 50, 100, 500, 1000, 5000 and 10000 random starts. We then assess the results using both our procedure and that proposed by LA. We take r.CSR $= (-0.0038, -0.0038, -0.0038)$ based on the fact that if we use $tr(R'R)$ as the dividing criterion, $PF_{351}$ is divided into 267 slices and our empirical rule in Section 3.2 suggests $r_1 = r_2 = r_3 = -1/(267 - 1) \approx -0.0038$. The hypervolume indicator for $PF_{351}$ with respect to r.CSR $= (-0.0038, -0.0038, -0.0038)$ is $I_H(PF_{351}, r.CSR) = 0.6157$.

Results presented in Table 2 demonstrate drawbacks for the LA procedure when used online in this way: the HVUPF obtained with 50 random starts (0.4828) is larger than that for 100 random starts (0.4267) and even bigger than that for 500 random starts (0.4784). This is problematic since in this online setting additional random starts should never make the Pareto front worse. In contrast, the CSR procedure gives monotonic results in keeping with our intuition. Additionally, Table 2 gives the following information: (1) $|PF_i^*|$, the number of true Pareto-optimal solutions found; (2) $I_H(PF_i^*, r.CSR)$, the hypervolume measure; and (3) $CR(PF_i, PF_S)$, the proportion of the true Pareto front that has been found.

Table 2. Performance assessment of PAPE with different input settings.

Number of random starts   |PF_i|   |PF_i*|   HVUPF    I_H(PF_i*, r.CSR)   CR(PF_i, PF_351)
50                        118      61        0.4828   0.3274              53.17%
100                       166      78        0.4267   0.3327              54.04%
500                       264      215       0.4784   0.5774              93.78%
1000                      290      252       0.5245   0.5802              94.23%
5000                      315      299       0.5867   0.5974              97.03%
10000                     322      309       0.5874   0.5978              97.09%

4.3 Empirical assessment of reference point recommendations for three dimensions

Continuing with the three-dimensional example from the preceding section, we perform a brief empirical study of our reference point recommendations from Section 3.2. Specifically, we compare three surrogates of the ostensibly true Pareto front, PF351:

1. PF61, the Pareto front generated by PAPE with 50 random starts;
2. PF61R, a Pareto front generated by randomly selecting 61 solutions from PF351;
3. PF61E, an "extreme" Pareto front generated by choosing, from PF351, the 20 solutions with the largest values of criterion 1 (D-optimality), the 20 solutions with the largest values of criterion 2 ($tr(AA')$-optimality), and the 21 solutions with the largest values of criterion 3 ($tr(R'R)$-optimality).

While PF61 is constructed directly via a front-populating algorithm, PF61R represents a relatively uniform front and PF61E represents one in which the extremes of the front are most thoroughly explored. Based on Section 3.2, if we favor uniformly distributed fronts, we would choose the reference point to be $|r| = 0.018$, $|r| = 0.017$, and $|r| = 0.018$, respectively, based on the number of slices for each of the fronts (58, 61, and 56, respectively, using criterion 3 as the dividing criterion). Here we abuse notation slightly and use $|r|$ as shorthand for the entire reference point $r = (r_1, r_2, r_3)$. If we wish to compare the three fronts, we must choose a single reference point and, since a uniform front is favored here, we might choose the smallest of the three, $|r| = 0.017$.
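The arithmetic behind these reference points is easy to reproduce. The short Python sketch below applies the empirical rule of Section 3.2, $r_1 = r_2 = r_3 = -1/(s-1)$, to the slice counts quoted in the text. The function names `count_slices` and `uniform_reference_3d`, the rounding argument, and the three-point toy front are our own illustrative choices, not part of the original procedure.

```python
def count_slices(front, dividing_index=2, decimals=3):
    """Number of distinct levels of the dividing criterion after rounding."""
    return len({round(p[dividing_index], decimals) for p in front})


def uniform_reference_3d(n_slices):
    """Common coordinate r1 = r2 = r3 = -1/(s - 1) suggested by the empirical rule."""
    if n_slices < 2:
        raise ValueError("the rule needs at least two levels of the dividing criterion")
    return -1.0 / (n_slices - 1)


# Slice counts quoted in the text for PF61, PF61R, PF61E and PF351.
for s in (58, 61, 56, 267):
    print(s, round(abs(uniform_reference_3d(s)), 4))

# Toy front (hypothetical) showing that the slice count depends on the rounding level.
toy = [(0.9, 0.1, 0.1231), (0.5, 0.5, 0.1234), (0.1, 0.9, 0.8000)]
print(count_slices(toy, decimals=3), count_slices(toy, decimals=4))
```

To three decimal places the first three magnitudes round to 0.018, 0.017 and 0.018, and the last rounds to 0.0038 at four decimal places, matching the values used in Sections 4.2 and 4.3. The toy front has two levels of criterion 3 when rounded to three decimals and three levels when rounded to four, which is why the rounding level has to be fixed before the rule is applied.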
We would expect, in this case, that PF61R is judged to be the best and indeed, Table 3 indicates that it is. Alternatively, if we choose $|r| = 1$ we would expect the more extreme front, PF61E, to be superior and Table 3 demonstrates that this is the case. Indeed, somewhere between $|r| = 0.17$ and $|r| = 0.34$, the more uniform front becomes inferior to the more extreme one. Though we must be careful not to make sweeping conclusions based upon a single instance, this example along with the two-dimensional example in Section 3.1 lends some evidence to the general conclusion that as $|r|$ increases, the hypervolume indicator increasingly favors extreme points, while smaller $|r|$ prefers solutions that are more spread out.

Table 3. For the three-dimensional example from Section 4.2, an evaluation of several 61-solution surrogate fronts for various reference points.

PFs      |r| = 0.017   |r| = 0.17   |r| = 0.34   |r| = 1   |r| = 1.7
PF61     0.3486        0.6579       1.1559       5.1681    14.4925
PF61R    0.5787        1.0231       1.7048       6.7641    17.7797
PF61E    0.5656        1.0187       1.7157       6.8925    18.1332

4.4 Decreases and fluctuations in the hypervolume measure

As has already been noted in Section 4.2, when the LA procedure is used online, decreases in HVUPF can occur even while the number of random starts increases. However, a similar problem can occur when naively comparing $PF_i$ and $PF_j$ instead of $PF_i^*$ and $PF_j^*$, a common problem with the hypervolume indicator in the online setting due to the lack of a consistent surrogate for the true Pareto front. The consequence of doing so is that the reference points will shift from comparison to comparison, and this violates a condition of Theorem 2, which requires a common reference point to ensure Pareto compliance. Figure 9 plots the hypervolume growth of the Pareto fronts produced sequentially for the first 20 random starts using the PAPE algorithm for the 3-criterion design problem given in Section 4.2, using the procedure proposed in this paper.
Although the general trend is increasing, the hypervolume measure is nonmonotonic. In the online usage of the hypervolume indicator, the reference point depends upon the current nadir point, which changes as the Pareto fronts are updated during the optimization process. Though there are several possible ways to avoid this problem, we suggest making pairwise comparisons of fronts as the front evolves. For instance, one might compare the front after the first random start with that of the second; the front after the second random start with that of the third; etc. Or, the comparison might be made after a particular increment (e.g. comparing the front after 50 random starts with that after 100; 100 with 150; etc.). Instead of using the hypervolume measure directly to make the comparisons, they are made by plotting the percentage improvement, i.e.,

$\dfrac{I_H(PF_j, r) - I_H(PF_i \cap PF_j, r)}{I_H(PF_i \cap PF_j, r)} \times 100\%$ for $j > i$.

In this way, Pareto compliance is maintained because a common reference point can be used for each comparison. In Figure 10, we give an illustration. Note that any improvement shows up as positive on the graph, and a cessation of improvement is indicated by a flat line at 0. For applications such as this, where thousands of random starts can be executed, we suggest making comparisons every 50 or 100 random starts to smooth the measure of improvement. If only hundreds or dozens of random starts are feasible, then perhaps comparisons can be made every 5 or 10 random starts. Either way, this can be used as part of an algorithm termination strategy. For instance, the algorithm could be terminated after the first batch of 50 random starts for which no improvement in hypervolume is made.

Figure 9. The hypervolume growth plot for the PAPE algorithm to solve the 3-criterion design problem given in Section 4.2.

Figure 10. Example of suggested online measure of algorithm progress.

5. Discussion

In this paper we have proposed an improved version of the hypervolume indicator and applied it in the context of multiobjective optimal experiment design. We give a procedure to compare competing Pareto fronts that largely avoids the pitfalls of recent work by Lu and Anderson-Cook (2012). In particular, we have studied the relationship between the hypervolume indicator and its reference point, given conditions for two-criteria problems that ensure that the reference point will not affect comparisons between Pareto fronts, suggested rules to guide the selection of the reference point for two dimensions when these conditions are not satisfied, and given guidance for the selection of the reference point in three dimensions. We also ensure that our procedure is Pareto compliant by removing criterion vectors that are dominated by points in the reference Pareto front, and illustrate our methods in both an offline and an online setting.

For online applications, we show how the improved hypervolume procedure can be used to evaluate the progress of an algorithm that is generating a Pareto front of designs. Because the front is evolving in this case, reference points will shift and the hypervolume indicator may not increase monotonically. We propose an approach that avoids this problem by making pairwise comparisons of Pareto fronts and measuring the percent improvement for each comparison. This allows the hypervolume indicator to be used as an algorithm termination criterion.

Though our goal is to present methods of Pareto front comparison that are practically useful, we have made a number of assumptions that may limit this work's applicability. For instance, Theorem 3 is used to motivate our recommendations for the selection of the reference point in two dimensions but is based upon a special case in which there are only two elements in the Pareto front. We also assume a projection of the true Pareto front onto the line connecting the extremes.
These assumptions are, at this point, necessary simplifications that allow concrete guidance to the end-user. Further work might be undertaken to weaken or eliminate these assumptions. In addition, our tentative recommendations in three dimensions are not based upon any theoretical result, but instead are an analogy to the guidelines for two dimensions.

Furthermore, in the hypervolume calculation in three dimensions, there is an implicit selection of an initial dividing criterion. The number of slices associated with this criterion then drives our recommendation. There are at least two potential issues with this. First, the number of slices is dependent upon the level of rounding. For instance, in the example of Sections 4.2 and 4.3, $PF_{351}$ has 267 slices with respect to criterion 3 if rounding to three decimal places, and 321 slices if rounding to four. We have used three decimal places in this work. Second, the dividing criterion chosen may change the number of slices used to calculate the reference point. For instance, PF61E of Section 4.3 has 56 slices if the third criterion is used to initially divide, but only 40 if the first is used. We have not found these issues to make a difference in the ultimate ordering of fronts, though it is likely that pathological cases could be constructed for which Pareto front orderings could change based upon the level of rounding or the chosen dividing criterion. If this is a concern, we recommend that the practitioner compare the fronts using several scenarios (e.g. with each criterion as the dividing criterion) to see if the ordering of the fronts changes. In the unlikely case that it does, the user might use the general principle that a smaller $|r|$ favors more uniform fronts and a larger $|r|$ favors the extremes to guide the ultimate selection of the reference point.

The work in this paper has implications for multiobjective experiment design, but also beyond.
Recent work in experiment design has focused on incorporating multiple measures of design quality into the decision making process, and the developments in this article improve the tools available for evaluating sets of candidate designs. We emphasize, however, that this work can be applied to a wide variety of optimization settings in which Pareto fronts are used to evaluate trade-offs between opposing criteria.

Both offline and online usages of the hypervolume indicator are crucial to the process of constructing Pareto fronts of experiment designs. This procedure can help researchers compare and evaluate competing algorithms, as well as various versions of a single algorithm, in order to determine which are most effective in populating a front. It can also guide the use of a particular algorithm by providing information on its progress toward populating the front.

The multiobjective optimization problem has three main questions that need to be answered: (1) How do we compare Pareto fronts? (2) How do we populate the Pareto front? (3) How do we choose a single solution to use? We have addressed the first question, but leave the second and third to the literature (Lu et al., 2011; Sambo et al., 2014; Park, 2009 for the second; Lu et al., 2011 and Zio and Bazzo, 2012 for the third) or future work.

Acknowledgements

The authors would like to thank Drs. Christine Anderson-Cook and Lu Lu for their suggestions and comments throughout this process. We also wish to express gratitude to the reviewers and associate editor who reviewed this work and allowed us the opportunity to improve the paper.

References

Albrecht, M.C., Nachtsheim, C.J., Albrecht, T.A., Cook, R.D., 2013. Experimental design for engineering dimensional analysis (with discussions). Technometrics 55(3), 257-295.

Auger, A., Bader, J., Brockhoff, D., 2010. Theoretically investigating optimal μ-distributions for the hypervolume indicator: first results for three objectives. In Parallel Problem Solving from Nature, PPSN XI. Springer Berlin Heidelberg, 586-596.

Auger, A., Bader, J., Brockhoff, D., Zitzler, E., 2009. Theory of the hypervolume indicator: optimal distributions and the choice of the reference point. In Foundations of Genetic Algorithms (FOGA 2009). ACM, New York, NY, USA, 87-102.

Auger, A., Bader, J., Brockhoff, D., Zitzler, E., 2012. Hypervolume-based multiobjective optimization: theoretical foundations and practical implications. Theoretical Computer Science 425, 75-103.

Bader, J., Zitzler, E., 2011. HypE: An algorithm for fast hypervolume-based many-objective optimization. Evolutionary Computation 19(1), 45-76.

Beume, N., Naujoks, B., Emmerich, M., 2007. SMS-EMOA: multiobjective selection based on dominated hypervolume. European Journal of Operational Research 181(3), 1653-1669.

Brockhoff, D., 2010. Optimal μ-distributions for the hypervolume indicator for problems with linear bi-objective fronts: exact and exhaustive results. In Simulated Evolution and Learning. Springer Berlin Heidelberg, 24-34.

Coello Coello, C.A., Lamont, G.B., Van Veldhuizen, D.A., 2007. Evolutionary algorithms for solving multi-objective problems. 2nd edition. Springer.

Cook, R.D., Nachtsheim, C.J., 1980. A comparison of algorithms for constructing exact D-optimal designs. Technometrics 22, 315-324.

Emmerich, M., Beume, N., Naujoks, B., 2005. An EMO algorithm using the hypervolume measure as selection criterion. In Evolutionary Multi-Criterion Optimization. Springer Berlin Heidelberg, 62-76.

Fedorov, V.V., 1972. Theory of Optimal Experiments. New York, NY: Academic Press.

Fleischer, M., 2003. The measure of Pareto optima: applications to multi-objective metaheuristics. In Evolutionary Multi-Criterion Optimization. Springer Berlin Heidelberg, 519-533.

Fonseca, C.M., Knowles, J.D., Thiele, L., Zitzler, E., 2005. A tutorial on the performance assessment of stochastic multiobjective optimizers. In Third International Conference on Evolutionary Multi-Criterion Optimization (EMO 2005), 216.

Fonseca, C.M., Paquete, L., López-Ibánez, M., 2006. An improved dimension-sweep algorithm for the hypervolume indicator. In IEEE Congress on Evolutionary Computation, CEC 2006. IEEE, 1157-1163.

Friedrich, T., Neumann, F., Thyssen, C., 2013. Multiplicative approximations, optimal hypervolume distributions, and the choice of the reference point. arXiv preprint, http://arxiv.org/abs/1309.3816.

Gilmour, S.G., Trinca, L.A., 2012. Optimum design of experiments for statistical inference. Journal of the Royal Statistical Society: Series C (Applied Statistics) 61(3), 345-401.

Goel, T., Haftka, R.T., Shyy, W., Watson, L.T., 2008. Pitfalls of using a single criterion for selecting experimental designs. International Journal for Numerical Methods in Engineering 75(2), 127-155.

Judt, L., Mersmann, O., Naujoks, B., 2013a. Non-monotonicity of obtained hypervolume in 1-greedy S-metric selection. Journal of Multi-Criteria Decision Analysis 20(5-6), 277-290.

Judt, L., Mersmann, O., Naujoks, B., 2013b. Do hypervolume regressions hinder EMOA performance? Surprise and relief. In Evolutionary Multi-Criterion Optimization. Springer Berlin Heidelberg, 96-110.

Knowles, J.D., Corne, D.W., Fleischer, M., 2003. Bounded archiving using the Lebesgue measure. In The 2003 Congress on Evolutionary Computation, CEC '03. IEEE, 4, 2490-2497.

Lu, L., Anderson-Cook, C.M., 2012. Adapting the hypervolume quality indicator to quantify trade-offs and search efficiency for multiple criteria decision making using Pareto fronts. Quality and Reliability Engineering International 29(8), 1117-1133.

Lu, L., Anderson-Cook, C.M., Robinson, T.J., 2011. Optimization of designed experiments based on multiple criteria utilizing a Pareto frontier. Technometrics 53, 353-365.

Myers, R.H., Montgomery, D.C., Anderson-Cook, C.M., 2009. Response surface methodology (process and product optimization using designed experiments). John Wiley & Sons, New Jersey.

Park, Y.-J., 2009. Multi-optimal designs for second-order response surface model. Communication of the Korean Statistical Society 16(1), 195-208.

Sambo, F., Borrotti, M., Mylona, K., 2014. A coordinate-exchange two-phase local search algorithm for the D- and I-optimal designs of split-plot experiments. Computational Statistics & Data Analysis 71, 1193-1207.

Steinberg, D.M., Bursztyn, D., 2006. Comparison of designs for computer experiments. Journal of Statistical Planning and Inference 136, 1103-1119.

While, L., Hingston, P., Barone, L., Huband, S., 2006. A faster algorithm for calculating hypervolume. IEEE Transactions on Evolutionary Computation 10(1), 29-38.

Zio, E., Bazzo, R., 2012. A comparison of methods for selecting preferred solutions in multiobjective decision making. In Computational Intelligence Systems in Industrial Engineering. Atlantis Press, 23-43.

Zitzler, E., Brockhoff, D., Thiele, L., 2007. The hypervolume indicator revisited: on the design of Pareto-compliant indicators via weighted integration. In Evolutionary Multi-Criterion Optimization. Springer Berlin Heidelberg, 862-876.

Zitzler, E., Knowles, J., Thiele, L., 2008. Quality assessment of Pareto set approximations. In Multiobjective Optimization. Springer Berlin Heidelberg, 373-404.

Zitzler, E., Künzli, S., 2004. Indicator-based selection in multiobjective search. In Parallel Problem Solving from Nature - PPSN VIII. Springer Berlin Heidelberg, 832-842.

Zitzler, E., Thiele, L., 1998. Multiobjective optimization using evolutionary algorithms - a comparative case study. In Parallel Problem Solving from Nature - PPSN V. Springer Berlin Heidelberg, 292-301.