ISBN 978-609-95241-4-6
L. Sakalauskas, A. Tomasgard, S. W. Wallace (Eds.): Proceedings. Vilnius, 2012, pp. 136–141
© The Association of Lithuanian Serials, Lithuania, 2012
doi:10.5200/stoprog.2012.24
International Workshop “Stochastic Programming for Implementation and Advanced Applications” (STOPROG-2012), July 3–6, 2012, Neringa, Lithuania

ON THE ARITHMETIC OF INFINITY ORIENTED IMPLEMENTATION OF THE MULTI-OBJECTIVE P-ALGORITHM

Antanas Žilinskas
Vilnius University, Institute of Mathematics and Informatics
Akademijos str. 4, LT-08663 Vilnius, Lithuania
E-mail: [email protected]

Abstract. The single-objective P-algorithm is a global optimization algorithm based on a statistical model of objective functions and the axiomatic theory of rational decisions. It has proven quite suitable for the optimization of expensive black-box functions. Recently the P-algorithm has been generalized to multi-objective optimization. In the present paper, the implementation of that algorithm is considered using the new computing paradigm of the arithmetic of infinity. The strong homogeneity of the multi-objective P-algorithm is proven, which enables a rather simple application of the algorithm to problems involving infinities and infinitesimals.

Keywords: arithmetic of infinity, multi-objective optimization, global optimization.

1. Introduction

New computing paradigms and hardware projects, which potentially enable enormous performance and/or super-high precision, urge the development of new algorithms and the adaptation of well-respected conventional algorithms to the new prospects. A recently proposed computing paradigm, the arithmetic of infinity, lays the foundation of computations involving infinities and infinitesimals [9, 13]. Although hardware implementing the arithmetic of infinity is not available at present, in principle such hardware can be designed and built, as shown in [12].
The arithmetic of infinity is attractive for the development of algorithms for numerical problems in various fields of applied mathematics, e.g. mathematical modeling [11], operations research and mathematical programming [2], global optimization [20], the quantitative analysis of fractals [10], and others. In the present paper, multi-objective optimization problems are considered where the computation of the objective vectors using the standard computer arithmetic is problematic because of either underflows or overflows. Besides fundamentally new minimization problems, where the computation of the objective function values involves infinities or infinitesimals, the arithmetic of infinity can also be helpful in tackling optimization problems whose objective functions are computed using numbers that differ by many orders of magnitude. For example, the objective functions of the optimization problems of statistics considered in [20, 22] are computed operating with numbers that differ by more than a factor of 10^200. The arithmetic of infinity can be applied to the optimization of challenging objective functions in two ways. First, the optimization algorithm itself is implemented in the arithmetic of infinity. Second, the arithmetic of infinity is applied only to scale the objective function values so that they are suitable for processing in the conventional computer arithmetic, and a conventionally implemented optimization algorithm processes the scaled values. The second implementation is considerably simpler than the first one, since the arithmetic of infinity is applied only to scale the function values. However, the second implementation can be recognized as correct only if both implementations generate the same sequence of points at which the vectors of objectives are computed. It has been shown in [20] that not all single-objective global optimization algorithms admit a correct implementation of the second type.
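The numerical difficulty motivating the scaling can be seen already in standard double-precision arithmetic: products of quantities that differ by a factor of 10^200 leave the representable range silently. The following minimal illustration (ordinary floating point with a logarithmic rescaling; the paper's actual proposal is the arithmetic of infinity, not logarithms) shows the failure mode:

```python
import math

a, b = 1e200, 1e-200

# float64 cannot hold the products: they overflow/underflow silently
print(a * a)   # inf (overflow: 1e400 exceeds the float64 range)
print(b * b)   # 0.0 (underflow to zero)

# working on a transformed scale keeps the quantities representable
print(math.log(a) + math.log(a))   # ≈ 921.034
```

A conventionally implemented algorithm fed such overflowed values cannot make meaningful decisions, which is why the correctness of the second (scaled) implementation discussed above matters.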
In the present paper, we formulate a sufficient condition for a correct implementation of the second type, called strong homogeneity, and show that the multi-objective P-algorithm satisfies that condition. The multi-objective P-algorithm, like its single-objective prototype, is oriented to expensive black-box functions, which comprise a class of global optimization problems most difficult to tackle. Both versions of the algorithm are based on similar statistical models of objective functions. For the axiomatic essentials of the approach to global optimization based on statistical models of objective functions, we refer to [15, 16]. Recently it has been proven that the single-objective P-algorithm is strongly homogeneous [20]. Here we generalize that result by showing that the multi-objective P-algorithm is strongly homogeneous as well. In the next section, the property of strong homogeneity of multi-objective optimization algorithms is formulated, which is sufficient for the correct second-type implementation of an algorithm meant for solving the problems related to the arithmetic of infinity. Section 3 presents a brief description of the multi-objective P-algorithm. The proof of its strong homogeneity is presented in Section 4.

2. The definition of strong homogeneity

The minimization problem

min_{X∈A} F(X),  F(X) = (f_1(X), …, f_m(X))^T,  A ⊂ R^d,   (1)

is considered, where the vector objective function F(X) is defined over a simple feasible region; for concreteness it can be assumed that A is a hyperrectangle. For the definition of the solution to a multi-objective optimization problem with nonlinear objectives we refer to [7]. Depending on the properties of the problem, different approaches to its solution can be applied.
However, in the most general statement any algorithm can be described as a sequence of mappings π_n : A^n × (R^m)^n → A, which define the points of the current computation of the objectives depending on the results of the previous iterations:

X_{n+1} = π_n(X_1, …, X_n, Y_1, …, Y_n),  Y_i = F(X_i),  i = 1, …, n.   (2)

Definition 1. Let us consider two vector-valued objective functions F(X) and H(X), X ∈ A, differing only in the scales of the function values, i.e.

H(X) = (c_1 f_1(X), …, c_m f_m(X))^T + B,   (3)

where C = (c_1, …, c_m)^T and B are constant vectors in R^m whose components can assume not only finite, but also infinite and infinitesimal values expressed by the numerals defined in [9, 13]. The sequences of points generated by an algorithm applied to these functions are denoted by X_i, i = 1, 2, …, and V_i, i = 1, 2, …, respectively. An algorithm that generates identical sequences, X_i = V_i, i = 1, 2, …, is called strongly homogeneous.

A weaker property of algorithms is considered in [3], where the algorithms that generate identical sequences for the scalar functions f(X) and h(X) = f(X) + b are called homogeneous. Since a proper scaling of the function values by translation alone is not always possible, we consider here the invariance of the optimization results with respect to a more general (affine) transformation of the objective function values. The concept of strong homogeneity for algorithms of single-objective optimization was introduced in [20].

3. A brief description of the multi-objective P-algorithm

To validate the selection of a site for the current computation/observation of the vector of objectives from the decision theory perspective, a model of the objective functions is needed. The considered approach is based on statistical models. In the present paper, we assume that the objective functions considered are random realizations of the stochastic function chosen for a model.
However, the P-algorithm considered below can also be constructed using the more general statistical models proposed in [16] in conformity with the ideas of the theory of subjective probabilities. To facilitate the implementation, Gaussian stochastic functions are normally chosen as statistical models [1]. Some authors call optimization based on statistical models of objective functions kriging; the term “kriging” was coined in geostatistics to describe statistical model-based interpolation methods [14]. Global single-objective optimization algorithms of this type have proved well suited for problems with expensive black-box objective functions. It is interesting to note that another global optimization approach aimed at that class of problems, namely the radial basis function approach [4], yields algorithms coinciding with the P-algorithm based on the Gaussian model [18]. Recently the statistical model-based global optimization approach has attracted considerable attention of experts in multi-objective optimization; see e.g. [5, 6, 8, 21].

The chosen statistical models of the particular objectives f_j(X) comprise a vector-valued Gaussian random field Ξ(X), which is accepted as the statistical model of F(X). In many real-world applied problems, and thus in test problems as well, the objectives are not (or only weakly) interrelated. Accordingly, in the present paper, the components of Ξ(X) are supposed to be independent. A correlation between the components of Ξ(X) could be included into the model; however, it would imply some numerical and statistical inference problems that require further investigation.

It is assumed that a priori information on the expected behavior (the form of variation over A) of the objective functions is scarce.
The heuristic assumption on the lack of a priori information is formalized as the assumption that ξ_j(X), j = 1, …, m, are homogeneous isotropic random fields, i.e., that their mean values μ_j and variances σ_j^2 are constants, and that the correlation between ξ_j(X_i) and ξ_j(X_k) depends only on ||X_i − X_k||; here and further ||·|| denotes the Euclidean norm in the vector spaces R^d and R^m considered. The choice of the exponential correlation function ρ(t) = exp(−ct), c > 0, is motivated by the previous experience of using single-objective global optimization algorithms based on such a statistical model. The parameters of the statistical model should be estimated using the data on F(·); since the components of Ξ(X) are assumed to be independent, the parameters of each ξ_j(X) can be estimated separately.

The minimization is considered at the (n+1)-th step. The points where the objective functions were computed are denoted by X_i, i = 1, …, n, and the corresponding objective vectors are denoted by Y_i = F(X_i). The vector-valued Gaussian random field with values in R^m,

Ξ(X) ∈ R^m,  X ∈ A ⊂ R^d,   (4)

is accepted as a model of the vector objective function considered. In the frame of that model, an unknown vector of objectives F(X), X ≠ X_i, i = 1, …, n, is interpreted as a random vector whose distribution is defined by the conditional distribution function of Ξ(X):

Φ_X^n(Y) = P{Ξ(X) ≤ Y | Ξ(X_i) = Y_i, i = 1, …, n}.   (5)

The choice of the current observation point, i.e. of the point where to compute the vector of objectives at the current minimization step, is a decision under uncertainty. The statistical model (4) represents the uncertainty with respect to the result of that decision; therefore the choice of the current observation point, X_{n+1} ∈ A, means a choice of a distribution function from the set of distribution functions Φ_X^n(Y), X ∈ A. The postulates of rational decision making under uncertainty, when applied to substantiate such a choice, imply the P-algorithm [17, 21], a multi-objective version of which is defined by the following expression:

X_{n+1} = arg max_{X∈A} P{Ξ(X) ≤ Y^on | Ξ(X_i) = Y_i, i = 1, …, n},   (6)

where Y^on = (y_1^on, y_2^on, …, y_m^on)^T is a vector not dominated by Y_i, i = 1, …, n; for the sake of explicitness it is assumed that y_j^on = min_{1≤h≤n} y_jh, j = 1, …, m.

The implementation of the multi-objective P-algorithm is similar to that of the single-objective P-algorithm [15]. Since the components of the vector random field are assumed independent, the probability in (6) is computed as the product of the particular probabilities related to the components of Ξ(X):

P{Ξ(X) ≤ Y^on | Ξ(X_i) = Y_i, i = 1, …, n} = ∏_{j=1}^m P{ξ_j(X) ≤ y_j^on | ξ_j(X_i) = y_ji, i = 1, …, n}
 = ∏_{j=1}^m ∫_{−∞}^{y_j^on} 1/(√(2π) s_j(X | X_i, y_ji, i = 1, …, n)) · exp(−(t − m_j(X | X_i, y_ji, i = 1, …, n))^2 / (2 s_j^2(X | X_i, y_ji, i = 1, …, n))) dt
 = ∏_{j=1}^m G((y_j^on − m_j(X | X_i, y_ji, i = 1, …, n)) / s_j(X | X_i, y_ji, i = 1, …, n)),   (7)

where m_j(X | X_i, y_ji, i = 1, …, n) and s_j(X | X_i, y_ji, i = 1, …, n) denote the conditional mean and conditional standard deviation of the random field ξ_j(X) at the point X, and G(·) stands for the standard Gaussian cumulative distribution function.

4. Strong homogeneity of the multi-objective P-algorithm

To evaluate the influence of the data scaling on the whole optimization process, two objective functions are considered:

F(X) and H(X) = (c_1 f_1(X), …, c_m f_m(X))^T + B,   (8)

where C and B are constant vectors in R^m that can assume not only finite, but also infinite and infinitesimal values expressed by the numerals defined in [9, 13]. Let the first n function values be computed for both functions at the same points X_i, i = 1, …, n, and the corresponding function values be denoted by Y_i = F(X_i) and Z_i = H(X_i).
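The product in (7) is straightforward to evaluate once the conditional moments are available. The sketch below computes criterion (7) for a one-dimensional A with two independent objectives, using the standard simple-kriging formulas for the conditional mean and variance (an assumed concrete form; the paper specifies these moments only implicitly), and then checks numerically that the criterion value is unchanged when the data are transformed as in (8) with positive c_j and the model parameters are rescaled accordingly:

```python
import numpy as np
from math import erf, sqrt

def gauss_cdf(t):
    """Standard Gaussian cumulative distribution function G(.)."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def conditional_moments(x, X, y, mu, sigma2, c=4.0):
    """Simple-kriging conditional mean/variance of one component xi_j under
    the exponential correlation rho(t) = exp(-c t); d = 1 for brevity."""
    X = np.asarray(X, dtype=float)
    Sigma = np.exp(-c * np.abs(X[:, None] - X[None, :]))  # rho(|Xi - Xh|)
    r = np.exp(-c * np.abs(X - x))                        # Upsilon
    w = np.linalg.solve(Sigma, r)
    m = mu + w @ (np.asarray(y, dtype=float) - mu)
    s2 = sigma2 * max(1.0 - r @ w, 1e-12)                 # guard tiny negatives
    return m, s2

def p_criterion(x, X, Y, mus, sigma2s, c=4.0):
    """Criterion (7): product over components of G((y_j^on - m_j)/s_j)."""
    value = 1.0
    for j in range(Y.shape[1]):
        y_on = Y[:, j].min()                              # y_j^on = min_h y_jh
        m, s2 = conditional_moments(x, X, Y[:, j], mus[j], sigma2s[j], c)
        value *= gauss_cdf((y_on - m) / sqrt(s2))
    return value

# observed data, and an affine rescaling Z_i = C*Y_i + B as in (8)
X = [0.1, 0.5, 0.9]
Y = np.array([[1.0, 4.0], [0.3, 5.0], [0.7, 2.0]])
C, B = np.array([1e3, 1e-3]), np.array([-2.0, 10.0])
mus, s2s = Y.mean(axis=0), Y.var(axis=0)

v1 = p_criterion(0.3, X, Y, mus, s2s)
v2 = p_criterion(0.3, X, Y * C + B, C * mus + B, C**2 * s2s)
# v1 and v2 agree up to rounding: the criterion is invariant under the scaling
```

The agreement of v1 and v2 is a pointwise preview of the strong-homogeneity argument of Section 4: scaling cancels inside each ratio (y_j^on − m_j)/s_j.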
The next points of computation of the values of F(·) and H(·) are denoted by X_{n+1} and V_{n+1}, respectively. We are interested in the strong homogeneity of the P-algorithm, i.e. in the equality X_{n+1} = V_{n+1}.

To apply the P-algorithm to a particular problem, the parameters of the statistical model should be estimated accordingly. A sample of observations is comprised of the values of the objective functions computed at points generated randomly in A with a uniform distribution; let the sample size be denoted by k < n. The parameters of the stochastic function are estimated using these observations, which are also taken into account in the further optimization process. A Gaussian homogeneous isotropic random field is specified by its mean, variance, and correlation function. Normally the correlation function is chosen a priori, depending on the supposed properties of the considered problem. For example, frequently the correlation function ρ(t) = exp(−ct) is chosen with the parameter 3 ≤ c ≤ 7 for A scaled to the unit hypercube; the larger the number of local minimizers, the larger the value of c chosen. The mean and variance are estimated using the methods of statistics.

Let us consider the estimation of the vectors of mean values and variances of the components of the Gaussian random vector ζ. In case only a small number of observations k is available, the following formulae are frequently used to obtain rough estimates of the mean and variance of the components of ζ, although they are well justified only for independent observations:

γ_j = (1/k) ∑_{i=1}^k ζ_ji,  ϑ_j^2 = (1/k) ∑_{i=1}^k (ζ_ji − γ_j)^2,   (9)

where ζ_j = (ζ_j1, …, ζ_jk)^T are the observed values of the components of the random vector ζ. The estimates of μ_j and σ_j^2, obtained according to (9) using the data Y_i and Z_i, are denoted by μ̃_j, σ̃_j^2 and μ̂_j, σ̂_j^2, respectively. It is obvious that the following equalities hold:

μ̂ = (c_1 μ̃_1, …, c_m μ̃_m)^T + B,  σ̂_j = c_j σ̃_j,  j = 1, …, m.   (10)
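For the rough estimates (9), the equalities (10) are easy to confirm numerically. The sketch below (component-wise scaling with positive c_j; the sample data are arbitrary) does so:

```python
import numpy as np

def rough_estimates(samples):
    """Sample mean and (biased) sample variance per component, Eq. (9)."""
    Z = np.asarray(samples, dtype=float)        # shape (k, m)
    gamma = Z.mean(axis=0)                      # gamma_j
    theta2 = ((Z - gamma) ** 2).mean(axis=0)    # vartheta_j^2
    return gamma, theta2

rng = np.random.default_rng(0)
Y = rng.normal(size=(20, 2))                    # k = 20 observations, m = 2
C, B = np.array([3.0, 0.5]), np.array([-1.0, 7.0])

g_y, t_y = rough_estimates(Y)                   # estimates from Y_i
g_z, t_z = rough_estimates(Y * C + B)           # estimates from Z_i = C*Y_i + B

# Eq. (10): means transform affinely, standard deviations scale by c_j
assert np.allclose(g_z, C * g_y + B)
assert np.allclose(np.sqrt(t_z), C * np.sqrt(t_y))
```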
The maximum likelihood estimates of the mean and variance of the components of the Gaussian random vector ζ in the case of correlated observations are defined by the following formulae:

γ_j = ∑_{i=1}^k ∑_{h=1}^k ζ_ji τ_ih / ∑_{i=1}^k ∑_{h=1}^k τ_ih,   (11)

ϑ_j^2 = (1/k) (ζ_j − γ_j I)^T K^{−1} (ζ_j − γ_j I),   (12)

where the correlation coefficients between ζ_ji and ζ_jh are equal to ρ_ih, and τ_ih are the elements of the inverted correlation matrix:

K^{−1} = (ρ_ih)_{i,h=1,…,k}^{−1} = (τ_ih)_{i,h=1,…,k}.   (13)

It can be proved that for the estimates of μ and σ^2, obtained according to (11)–(12) using the data Y_i and Z_i correspondingly, the equalities (10) hold. Below we assume that the estimates of the mean value and variance of the homogeneous isotropic Gaussian random field used for the statistical model satisfy the equalities (10).

Theorem. The Gaussian model-based multi-objective P-algorithm is strongly homogeneous.

Proof. As shown in Section 3, the current point of computation of the vector value of H(X) by the P-algorithm (denoted by V_{n+1}) is defined by the following formula:

V_{n+1} = arg max_{X∈A} ∏_{j=1}^m G((z_j^on − m_j(X | X_i, z_ji, i = 1, …, n)) / s_j(X | X_i, z_ji, i = 1, …, n)).   (14)

The conditional mean and conditional variance in (14) depend on the data X_i, Z_i, i = 1, …, n, as follows:

m_j(X | X_i, z_ji, i = 1, …, n) = μ̂_j + (z_j − μ̂_j I)^T Σ^{−1} Υ
 = c_j μ̃_j + b_j + (c_j y_j + b_j I − (c_j μ̃_j + b_j) I)^T Σ^{−1} Υ
 = c_j m_j(X | X_i, y_ji, i = 1, …, n) + b_j,   (15)

s_j^2(X | X_i, z_ji, i = 1, …, n) = σ̂_j^2 − (z_j − μ̂_j I)^T Σ^{−1} (z_j − μ̂_j I)
 = c_j^2 σ̃_j^2 − (c_j y_j − c_j μ̃_j I)^T Σ^{−1} (c_j y_j − c_j μ̃_j I)
 = c_j^2 s_j^2(X | X_i, y_ji, i = 1, …, n),   (16)

where Σ is the matrix composed of ρ(||X_i − X_h||), the correlation coefficients between ξ_j(X_i) and ξ_j(X_h), i, h = 1, …, n, and Υ = (ρ(||X − X_1||), …, ρ(||X − X_n||))^T.
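The transformation rules (15)–(16) can also be confirmed numerically. The sketch below uses simple-kriging conditional moments (an assumed concrete form of m_j and s_j^2; the paper states them only through (15)–(16)) for one component in d = 1, and checks that under z_j = c_j y_j + b_j the conditional mean transforms affinely while the conditional variance scales by c_j^2:

```python
import numpy as np

rng = np.random.default_rng(2)
Xs = rng.uniform(size=8)                        # points X_i in A = [0, 1]
y = rng.normal(size=8)                          # values y_ji of one component
c_corr = 4.0                                    # parameter of rho(t) = exp(-c t)
Sigma = np.exp(-c_corr * np.abs(Xs[:, None] - Xs[None, :]))

def moments(x, data, mu, sigma2):
    """Conditional mean and variance at x given data at Xs (simple kriging)."""
    ups = np.exp(-c_corr * np.abs(Xs - x))      # Upsilon
    w = np.linalg.solve(Sigma, ups)
    m = mu + w @ (data - mu)                    # cf. (15)
    s2 = sigma2 * (1.0 - ups @ w)               # cf. (16)
    return m, s2

c_j, b_j = 5.0, -2.0                            # scaling of one objective
mu, s2_field = y.mean(), y.var()
m_y, s2_y = moments(0.37, y, mu, s2_field)
m_z, s2_z = moments(0.37, c_j * y + b_j, c_j * mu + b_j, c_j**2 * s2_field)
# m_z == c_j*m_y + b_j and s2_z == c_j^2 * s2_y, as (15)-(16) state
```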
The replacement of m_j(X | X_i, z_ji, i = 1, …, n) and s_j^2(X | X_i, z_ji, i = 1, …, n) in (14) by the expressions (15) and (16), taking into account that z_j^on = c_j y_j^on + b_j, implies the equalities

V_{n+1} = arg max_{X∈A} ∏_{j=1}^m G((z_j^on − m_j(X | X_i, z_ji, i = 1, …, n)) / s_j(X | X_i, z_ji, i = 1, …, n))
 = arg max_{X∈A} ∏_{j=1}^m G((y_j^on − m_j(X | X_i, y_ji, i = 1, …, n)) / s_j(X | X_i, y_ji, i = 1, …, n)) = X_{n+1}.

The equality between the current point of computing the vector value of H(·), i.e. V_{n+1}, and the current point of computing the vector value of F(·), i.e. X_{n+1}, completes the proof of the strong homogeneity of the multi-objective P-algorithm.

5. Conclusions

The multi-objective P-algorithm is strongly homogeneous. Therefore the computationally advantageous second-type implementation of the multi-objective P-algorithm in the arithmetic of infinity is correct.

Acknowledgement

The research was supported by the Research Council of Lithuania under Grant No. MIP-063/2012.

References

1. Calvin, J. M.; Zilinskas, A. (2000). A one-dimensional P-algorithm with convergence rate O(n^(−3+δ)) for smooth functions, Journal of Optimization Theory and Applications 106: 297–307. http://dx.doi.org/10.1023/A:1004699313526
2. De Cosmis, S.; De Leone, R. (2012). The use of grossone in mathematical programming and operations research, J. Appl. Math. Comput. 218(16): 8029–8038. http://dx.doi.org/10.1016/j.amc.2011.07.042
3. Elsakov, S. M.; Shiryaev, V. I. (2010). Homogeneous algorithms for multiextremal optimization, Computational Mathematics and Mathematical Physics 50(10): 1642–1654. http://dx.doi.org/10.1134/S0965542510100027
4. Gutmann, H. (2001). A radial basis function method for global optimization, Journal of Global Optimization 19: 201–227. http://dx.doi.org/10.1023/A:1011255519438
5. Keane, A.; Scanlan, J. (2007). Design search and optimization in aerospace engineering, Phil. Trans. R. Soc.
A 365: 2501–2529. http://dx.doi.org/10.1098/rsta.2007.2019
6. Knowles, J. (2006). ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems, IEEE Trans. Evolutionary Computation 10(1): 50–66. http://dx.doi.org/10.1109/TEVC.2005.851274
7. Miettinen, K. (1999). Nonlinear multiobjective optimization, Springer.
8. Nakayama, H. (2009). Sequential approximate multiobjective optimization using computational intelligence, Springer.
9. Sergeyev, Ya. D. (2005). A few remarks on philosophical foundations of a new applied approach to infinity, Scheria 26–27: 63–72.
10. Sergeyev, Ya. D. (2007). Blinking fractals and their quantitative analysis using infinite and infinitesimal numbers, Chaos, Solitons & Fractals 33(1): 50–75. http://dx.doi.org/10.1016/j.chaos.2006.11.001
11. Sergeyev, Ya. D. (2009). Numerical computations and mathematical modeling with infinite and infinitesimal numbers, J. Appl. Math. Comput. 29: 177–195. http://dx.doi.org/10.1007/s12190-008-0123-7
12. Sergeyev, Ya. D. (2009). Computer system for storing infinite, infinitesimal, and finite quantities and executing arithmetical operations with them, EU patent 1728149.
13. Sergeyev, Ya. D. (2010). Lagrange lecture: methodology of numerical computations with infinities and infinitesimals, Rendiconti del Seminario Matematico dell'Università e del Politecnico di Torino 68(2): 95–113.
14. Stein, M. (1999). Interpolation of spatial data: some theory for kriging, Springer. http://dx.doi.org/10.1007/978-1-4612-1494-6
15. Torn, A.; Zilinskas, A. (1989). Global optimization, Lecture Notes in Computer Science 350: 1–255. http://dx.doi.org/10.1007/3-540-50871-6
16. Zilinskas, A. (1982). Axiomatic approach to statistical models and their use in multimodal optimization theory, Mathematical Programming 22: 104–116. http://dx.doi.org/10.1007/BF01581029
17. Zilinskas, A. (1985).
Axiomatic characterization of a global optimization algorithm and investigation of its search strategies, Operations Research Letters 4: 35–39. http://dx.doi.org/10.1016/0167-6377(85)90049-5
18. Zilinskas, A. (2010). On similarities between two models of global optimization: statistical models and radial basis functions, Journal of Global Optimization 48(1): 173–182. http://dx.doi.org/10.1007/s10898-009-9517-9
19. Zilinskas, A. (2011). Small sample estimation of parameters for Wiener process with noise, Communications in Statistics, Theory and Methods 40: 3020–3028. http://dx.doi.org/10.1080/03610926.2011.562788
20. Zilinskas, A. (2012). On strong homogeneity of two global optimization algorithms based on statistical models of multimodal objective functions, J. Appl. Math. Comput. 218(16): 8131–8136. http://dx.doi.org/10.1016/j.amc.2011.07.051
21. Zilinskas, A. (2012). A statistical model-based algorithm for black-box multi-objective optimization, International Journal of Systems Science, accepted.
22. Zilinskas, A.; Zilinskas, J. (2010). Interval arithmetic based optimization in nonlinear regression, Informatica 21(1): 149–158.