Numerical studies of space filling designs: optimization algorithms and subprojection properties

G. Damblin, M. Couplet and B. Iooss
EDF R&D, 6 Quai Watier, F-78401, Chatou, France

Submitted to: Journal of Simulation for the special issue "Input & Output Analysis for Simulation"

Correspondence: B. Iooss ; Email: [email protected] ; Phone: +33-1-30877969 ; Fax: +33-1-30878213

Abstract

Quantitative assessment of the uncertainties tainting the results of computer simulations is nowadays a major topic of interest in both industrial and scientific communities. One of the key issues in such studies is to get information about the output when the numerical simulations are expensive to run. This paper considers the problem of exploring the whole space of variations of the computer model input variables in the context of a high-dimensional exploration space. Various properties of space filling designs are justified: interpoint-distance, discrepancy and minimal spanning tree criteria. A specific class of designs, the optimized Latin Hypercube Samples, is considered. Several optimization algorithms from the literature are studied in terms of convergence speed, robustness to subprojection and space filling properties of the resulting design. Some recommendations for building such designs are given. Finally, another contribution of this paper is the deep analysis of the space filling properties of the 2D subprojections of the designs.

Keywords: discrepancy, optimal design, Latin Hypercube Sampling, computer experiment.

1 Introduction

Many computer codes, for instance simulating physical phenomena and industrial systems, are too time-expensive to be directly used to perform uncertainty, sensitivity, optimization or robustness analyses [5]. A widely accepted method to circumvent this problem consists in replacing such computer models by CPU-time inexpensive mathematical functions, called metamodels [19], built from a limited number of simulation outputs. Some commonly used metamodels are: polynomials, splines, generalized linear models, or learning statistical models like neural networks, regression trees, support vector machines and Gaussian process models [8]. In particular, the efficiency of Gaussian process modelling has been proved for instance by [33, 35, 25]. It extends the kriging principles of geostatistics to computer experiments by considering that the code responses are correlated according to the relative locations of the corresponding input variables.

A necessary condition for successful metamodelling is to explore the whole space X ⊂ R^d of the input variables X ∈ X (called the inputs) in order to capture the nonlinear behaviour of some output variables Y = G(X) ∈ R^q (where G refers to the computer code). This step, often called the Design of Computer Experiments (DoCE), is the subject of this paper. In many industrial applications, we are faced with the harsh problem of the high dimensionality of the space X to explore (several tens of inputs). Some authors [37, 8] have shown that Space Filling Designs (SFD) are well suited to this task. A SFD aims at obtaining the best coverage of the space of the inputs. Moreover, the SFD approach appears natural in a first exploratory phase, when very few pieces of information are available about the numerical model, or if the design is expected to serve different objectives (for example, providing a metamodel usable for several uncertainty quantifications relying on different hypotheses about the uncertainty on the inputs).
However, the class of SFD is large, including the well known Latin Hypercube Samples (LHS; in the following, LHS may refer to Latin Hypercube Sampling as well), low discrepancy sequences [29, 8], maximum entropy designs [36], minimax and maximin designs [18] or point process designs [10]. Here, the purpose is to shed new light on the practical issue of building a SFD.

In the following, X is assumed to be [0,1]^d, up to a bijection. Such a bijection is never unique and two different bijections generally lead to non-equivalent ways of filling the space of the inputs. In practice, during an exploratory phase, only pragmatic answers can be given to the question of the choice of the bijection: maximum and minimum bounds are generally given to each scalar input, so that X is a hypercube which can be mapped onto [0,1]^d through a linear transformation. It can be noticed that, if there is a sufficient reason to do so, considering what the computer code actually models, it is always possible to apply simple changes of input variables (e.g. considering the input z = exp(x) ∈ [exp(a), exp(b)] instead of x ∈ [a, b]). Furthermore, if a joint probability distribution is given to the inputs, we argue that it remains interesting to define a SFD to explore X as soon as a bijection such that the image distribution of the inputs is uniform over U_d = [0,1]^d can be handled (it suffices to invert the marginal cumulative distribution functions in the case of independent scalar inputs; see section 3.1).

In what follows, the fact that the problem finally comes down to the "homogeneous" filling of U_d is postulated, and the main question addressed is how to build or select a DoCE of a given (and small) size N (N ∼ 100, typically) within U_d (where d > 10, typically). We also keep in mind the well-known and empirical relation N ∼ 10 × d [22, 24], which gives the approximate minimum number of computations needed to get an accurate metamodel.

A first consideration is to obtain the best global coverage rate of U_d. It requires the definition of criteria based on distance, geometrical or uniformity measures [18, 20]. A second prescription is to uniformly cover the variation domain of each scalar input. Indeed, it often happens that, among a large number of inputs, only a small number are active, that is, significantly impact the outputs (sparsity principle). Then, in order to avoid useless computations (different values for inactive inputs but same values for active ones), we have to ensure that all the values for each input are different, which can be achieved by using LHS. A last important property of a SFD is its robustness to projection over subspaces. This property is particularly studied in this paper and the corresponding results can be regarded as its main contributions. The literature on the application of physical experimental design theory shows that, in most practical cases, effects of small degree (involving few factors, that is few inputs) dominate effects of greater degree. Therefore, it seems judicious to favour a SFD whose subprojections offer good coverage of the low-dimensional subspaces. A first view, which is adopted here, is to explore two-dimensional (2D) subprojections.

In the following section, two industrial examples are described in order to motivate our concerns about SFD. Section 3 gives a review of coverage criteria and of the types of SFD studied in the next section. In fact, the preceding considerations lead us to focus our attention on optimized LHS.
Various optimization algorithms for LHS have been previously proposed (the main works being [27] and [17]). We adopt a numerical approach to compare the performance of different LHS, as a function of their interpoint-distance and L2-discrepancy criteria. Section 4 focuses on their 2D subprojection properties, and numerical tests support some recommendations. A conclusion synthesizes this work.

2 Motivating examples

2.1 Nuclear safety simulation studies

Assessing the performance of nuclear power plants during accidental transient conditions has been the main purpose of thermal-hydraulic safety research for decades. Sophisticated computer codes have been developed and are now widely used. They can calculate time trends of any variable of interest during any transient in Light Water Reactors (LWR). However, the reliability of the predictions cannot be evaluated directly due to the lack of suitable measurements in plants. The capabilities of the codes can consequently only be assessed by comparing calculations with experimental data recorded in small-scale facilities. Due to this difficulty, but also to the "best-estimate" feature of the codes quoted above, uncertainty quantification should be performed when using them. In addition to uncertainty quantification, sensitivity analysis is often carried out in order to identify the main contributors to uncertainty.

Those thermal-hydraulic codes enable, for example, the simulation of a large-break loss of primary coolant accident (see Figure 1). This scenario is part of the Benchmark for Uncertainty Analysis in Best-Estimate Modelling for Design, Operation and Safety Analysis of Light Water Reactors [4] proposed by the Nuclear Energy Agency of the Organisation for Economic Cooperation and Development (OECD/NEA). It has been implemented with the French computer code CATHARE2 developed at the Commissariat à l'Energie Atomique (CEA). Figure 2 illustrates 100 Monte Carlo simulations (obtained by randomly varying the inputs of the accidental scenario), given by CATHARE2, of the cladding temperature as a function of time, whose first peak is the main output of interest in safety studies.

Figure 1: Illustration of a large-break loss of primary coolant accident in a nuclear Pressurized Water Reactor (a particular but common type of LWR).

Figure 2: 100 output curves of the cladding temperature as a function of time from CATHARE2.

Severe difficulties arise when carrying out a sensitivity analysis or an uncertainty quantification involving CATHARE2:

• Physical models involve complex phenomena (nonlinear and subject to threshold effects), with strong interactions between inputs. A first objective is to detect these interactions. Another one is to fully explore the combinations of the inputs to obtain a good idea of the possible transient curves [1].

• Computer codes are CPU-time expensive: no more than a few hundred simulations can be performed.

• Numerical models take as inputs a large number of uncertain variables (d = 50, typically): physical laws essentially, but also initial conditions, material properties and geometrical parameters. Truncated normal or log-normal distributions are given to them. Such a number of inputs is extremely large for the metamodelling problem.

• The first peak of the cladding temperature can be related to rare events: problems turn to the estimation of a quantile [2] or of the probability that the output exceeds a threshold [28].
All four difficulties underline the fact that great care is required to define an effective DoCE over the CATHARE2 input space. The high dimension of the input space remains a challenge for building a SFD with good subprojection properties.

2.2 Prey-predator simulation chain

In ecological effect assessments, risks imputable to chemicals are usually estimated by extrapolation of single-species toxicity tests. With increasing awareness of the importance of indirect effects, and keeping in mind the limitations of experimental tools, a number of ecological food-web models have been developed. Such models are based on a large set of bioenergetic equations describing the fundamental growth of each population, taking into account grazing and predator-prey interactions, as well as the influence of abiotic factors like temperature, light and nutrients. They can be used for several purposes, for instance:

• to test various contamination scenarios or the recovery capacity of a contaminated ecosystem,

• to quantify the important sources of uncertainty and knowledge gaps for which additional data are needed, and to identify the most influential parameters of population-level impacts,

• to optimize the design of field or mesocosm tests by identifying the appropriate type, scale, frequency and duration of monitoring.

Following this rationale, an aquatic ecosystem model, MELODY (modelling MEsocosm structure and functioning for representing LOtic DYnamic ecosystems), was built so as to simulate the functioning of aquatic mesocosms as well as the impact of toxic substances on the dynamics of their populations. A main feature of this kind of ecological model is, however, the great number of parameters involved in the modelling: MELODY has a total of 13 compartments and 219 parameters; see Figure 3. These parameters are generally highly uncertain because of both natural variability and lack of knowledge. Thus, sensitivity analysis appears to be an essential step to identify non-influential parameters [3]. These can then be fixed at a nominal value without significantly impacting the output. Consequently, the calibration of the model becomes less complex [34].

Figure 3: Representation of the module chain of the aquatic ecosystem model MELODY.

By a preliminary sensitivity analysis of the periphyton-grazers submodel (representative of processes involved in the dynamics of primary producers and primary consumers, and involving 20 input parameters), [16] concludes that significant interactions of large degree (more than three) exist in this model. Therefore, a DoCE has to possess excellent subprojection properties to capture the interaction effects.

3 Space filling criteria and designs

Building a DoCE consists in generating a matrix X_d^N = (x_j^{(i)})_{i=1..N, j=1..d}, where N is the number of experiments and d the number of scalar inputs. Let us recall that, here, the purpose is to design N experiments x^{(i)} (in the following, a "point" corresponds to an "experiment", or at least to a subset of experimental conditions) that fill the set [0,1]^d as "homogeneously" as possible; even if, in an exploratory phase, a joint probability distribution is not explicitly given to the inputs, one can regard these as independent and uniformly distributed over [0,1]. The most common sampling method is indisputably the classical Monte Carlo method (Simple Random Sampling, SRS), mainly because of its simplicity and generality [12], but also because of the difficulty to sample in a more efficient manner when d is large. In our context, it would consist in randomly, uniformly and independently sampling d × N draws in [0,1].
Yet, it is known to possess poor space filling properties: SRS leaves wide unexplored regions and can propose very close points. The next sections are mainly dedicated to (optimized) LHS, owing to their property of parsimony mentioned in the introduction, and do not refer to other interesting classes of SFD, in particular neither to maximum entropy designs nor to point process designs. The former are based on a criterion to maximize (the Shannon entropy), which could be used to optimize LHS; the resulting SFD are similar to point-distance optimized LHS (section 3.2.2), as shown by theoretical works [31] and confirmed by our own experiments. The latter way of sampling seems hardly compatible with LHS and suffers from the lack of efficient rules to set its parameters [9].

In this section, the criteria used hereafter to make diagnoses of DoCE or to optimize LHS are defined. Computing optimized LHS requires an efficient, in fact specialized, optimization algorithm. Since the literature provides numerous ones, a brief overview of a selection of such algorithms is proposed. The section ends with our feedback on their behaviours, with an emphasis on maximin DoCE.

3.1 Latin Hypercube Sampling

Latin Hypercube Sampling, which is an extension of stratified sampling, aims at ensuring that each scalar input has the whole of its range (that is, the support of its distribution) well scanned, according to a probability distribution [26]. Even if our starting assumption is simply the relevance of filling U_d "homogeneously", which is achieved by using LHS with independent uniform distributions in the remainder of the paper (see the comments of sections 1 and 3), LHS is introduced in a broader context below (independent but potentially non-uniform scalar inputs).

Let the range I of each scalar input X_j, j = 1...d, be partitioned into N equally probable intervals I_k. A LHS of size N is obtained from a random draw of N values x_j^{(k,*)} for each X_j, k = 1...N, one per interval I_k (according to the truncated distribution of X_j over I_k). Then, d permutations π_j of {1,...,N} are randomly chosen (uniformly among the N! possibilities) and applied to the N-tuples: x_j^{(i)} = x_j^{(k,*)} iff i = π_j(k). Thus we obtain the matrix of experiments of a LHS, X_d^N = (x_j^{(i)})_{i=1..N, j=1..d}: the i-th line x^{(i)} of this matrix will correspond to the inputs of the i-th code execution. See Figure 4 for an illustration. Another way to get a LHS is to draw x_j^{(1)} inside I, then to draw x_j^{(2)} inside I \ I_l such that x_j^{(1)} ∈ I_l, then x_j^{(3)} inside I \ (I_l ∪ I_m) such that x_j^{(2)} ∈ I_m, and so on (according to the truncated distributions). Eventually, if the X_j are mutually independent random variables with invertible cumulative distribution functions (CDF) F_j, then the i-th draw of the LHS for the j-th input can be created as

x_j^{(i)} = F_j^{-1}\Big( \frac{\pi_j(i) - \xi_j^{(i)}}{N} \Big),   (1)

where the π_j are independent uniform random permutations of the integers {1, 2, ..., N}, and the ξ_j^{(i)} are independent U([0,1]) random numbers, independent from the π_j.

Figure 4: Three examples of LHS of size N = 4 over U_2 = [0,1]^2 (with regular intervals): each of the N rows (each of the N columns, respectively), which corresponds to an interval of X_1 (of X_2, resp.), contains one and only one draw x^{(i)} (cross).
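As an illustration, formula (1) can be implemented in a few lines of R (the environment used for all numerical tests of this paper). The sketch below only covers the uniform case and uses function names of our own choosing; it is not an excerpt from an existing package.

# Randomized LHS of size N in dimension d over [0,1]^d (formula (1) with F_j the identity CDF)
lhs_random <- function(N, d) {
  X <- matrix(NA, nrow = N, ncol = d)
  for (j in 1:d) {
    pi_j <- sample(1:N)           # uniform random permutation of {1, ..., N}
    xi_j <- runif(N)              # independent U([0,1]) draws
    X[, j] <- (pi_j - xi_j) / N   # exactly one value in each of the N strata of input j
  }
  X   # for a non-uniform marginal, apply its quantile function, e.g. qnorm(X[, j]) for a standard normal input
}
# Example: X <- lhs_random(100, 10) gives N = 100 points in dimension d = 10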
When building a LHS, another possibility is to select the center of each stratum instead of drawing randomly. However, this discretized version of LHS is inadequate for the LHS optimization process: it leads to discrete values of the space filling criteria, rendering the convergence more difficult [23]. We therefore only consider "randomized" LHS in this paper.

3.2 Space filling criteria

As stated previously, LHS is a relevant way to design experiments as far as one-dimensional projections are concerned. Nevertheless, LHS does not ensure that the input space is filled properly. Some LHS can indeed be really unsatisfactory, like the first design of Figure 4 which is almost diagonal. LHS may consequently perform poorly in metamodel estimation and prediction of the model output [15]. Therefore, some authors have proposed to enhance LHS so that the space is filled not only in one-dimensional projections, but also in higher dimensions [30]. One powerful idea is to adopt some optimality criterion applied to LHS, such as entropy, discrepancy, minimax and maximin distances, etc. This allows undesirable situations, such as designs with close points, to be avoided.

The next sections propose quantitative indicators of space filling useful i) to optimize LHS or ii) to assess the quality of a design. Section 3.2.1 introduces some discrepancy measures, which are relevant for both purposes. Sections 3.2.2 and 3.2.3 introduce some criteria based on the distances between the points of the design. The former is about the minimax and maximin criteria, which are relevant for i) but not for ii); the latter is about a criterion (the MST one) which gives an interesting insight into the filling characteristics of a design but cannot reasonably be used for i).

3.2.1 Uniformity criteria

Discrepancy measures judge the uniformity quality of a design. Discrepancy can be seen as a measure of the gap between the considered configuration and the uniform one. The star discrepancy of a design X_d^N = (x^{(i)})_{i=1...N} over U_d is defined as

D^*(X_d^N) = \sup_{y \in U_d} \Big| \frac{1}{N} \sum_{i=1}^{N} 1_{\{x^{(i)} \in [0,y]\}} - \mathrm{Volume}([0,y]) \Big|,  where [0,y] = [0,y_1] \times \cdots \times [0,y_d].   (2)

As an interpretation, the discrepancy measure relies on a comparison between the volume of intervals and the number of points within these intervals [13]. In fact, definition (2) corresponds to the greatest difference between the value of the CDF of the uniform distribution over U_d (right term) and the value of the empirical CDF of the design (left term). In practice, the star discrepancy is not computable because of the L∞-norm used in formula (2). Hence L2-norms are used [8, 21]. For example, the star L2-discrepancy can be written as follows:

D_2^*(X_d^N) = \Big[ \int_{U_d} \Big( \frac{1}{N} \sum_{i=1}^{N} 1_{\{x^{(i)} \in [0,y]\}} - \mathrm{Volume}([0,y]) \Big)^2 dy \Big]^{1/2}.   (3)

Different discrepancy definitions exist, using different forms of intervals (that is, intervals [z,y] such that z ≠ 0) or different norms in the functional space. Discrepancy measures based on L2-norms are the most popular in practice because they can be analytically expressed and are easy to compute. Among them, two measures have shown remarkable properties [17, 7, 8]. Indeed, Fang has defined seven properties for uniformity measures, including subprojection uniformity (a particularity of the so-called modified discrepancies) as well as invariance under coordinate rotation. Both the centered discrepancy (C2) and the wrap-around discrepancy (W2) satisfy these properties:
• the centered L2-discrepancy

C^2(X_d^N) = \Big(\frac{13}{12}\Big)^d - \frac{2}{N} \sum_{i=1}^{N} \prod_{k=1}^{d} \Big( 1 + \frac{1}{2}|x_k^{(i)} - 0.5| - \frac{1}{2}|x_k^{(i)} - 0.5|^2 \Big) + \frac{1}{N^2} \sum_{i,j=1}^{N} \prod_{k=1}^{d} \Big( 1 + \frac{1}{2}|x_k^{(i)} - 0.5| + \frac{1}{2}|x_k^{(j)} - 0.5| - \frac{1}{2}|x_k^{(i)} - x_k^{(j)}| \Big),   (4)

• the wrap-around L2-discrepancy

W^2(X_d^N) = -\Big(\frac{4}{3}\Big)^d + \frac{1}{N^2} \sum_{i,j=1}^{N} \prod_{k=1}^{d} \Big( \frac{3}{2} - |x_k^{(i)} - x_k^{(j)}| \big(1 - |x_k^{(i)} - x_k^{(j)}|\big) \Big),   (5)

which allows bound effects to be suppressed (by wrapping the unit cube for each coordinate).

3.2.2 Point-distance criteria

[18] introduced two distance-based criteria. The first idea consists in minimizing the largest distance between a point of the input domain and the points of the design. The corresponding criterion to minimize is called the minimax criterion φ_mM(·):

\phi_{mM}(X_d^N) = \max_{x \in U_d} \min_{x^{(i)}} \| x - x^{(i)} \|_{L^p},   (6)

with p = 2 (Euclidean distance), typically. A small value of φ_mM for a design means that there is no point of the input domain too distant from a point of the design. This appears important from the point of view of the Gaussian process metamodel, which is typically based on the assumption that the correlation between outputs decreases with the distance between the corresponding inputs. However, this criterion requires the computation of all the distances between every point of the domain and every point of the design. In practice, an approximation of φ_mM is obtained via a fine discretization of the input domain. However, this approach becomes impracticable for input dimensions d larger than three [31]. φ_mM could also be derived from the Delaunay tessellation, which allows the computational cost to be reduced [31], but dimensions d larger than four or five remain an issue.

A second proposition is to maximize the minimal distance separating two design points. Let us note d_ij = ||x^{(i)} − x^{(j)}||_{L^p}, with p = 2, typically. The so-called mindist criterion φ_Mm (referring for example to the mindist() routine of the DiceDesign R package) is

\phi_{Mm}(X_d^N) = \min_{i,j=1...N,\, i \neq j} d_{ij}.   (7)

For a given dimension d, a large value of φ_Mm tends to separate the design points from each other, and thus allows a better space coverage. The mindist criterion has been shown to be easily computable but difficult to optimize. Regularized versions of mindist have been listed in [31], allowing a more efficient optimization to be carried out within the class of LHS. In this paper, we use the φ_p criterion:

\phi_p(X_d^N) = \Big( \sum_{i,j=1...N,\, i<j} d_{ij}^{-p} \Big)^{1/p}.   (8)

The following inequality, proved in [31], shows the asymptotic link between φ_Mm and φ_p. If one defines ξ*_p as the design which minimizes φ_p and ξ* as the one which maximizes φ_Mm, then:

1 \geq \frac{\phi_{Mm}(\xi^*_p)}{\phi_{Mm}(\xi^*)} \geq \Big(\frac{n}{2}\Big)^{-1/p}.   (9)

Let ε > 0 be a threshold; then (9) implies:

\frac{\phi_{Mm}(\xi^*_p)}{\phi_{Mm}(\xi^*)} \geq 1 - \epsilon   for   p \simeq \frac{\ln(n/2)}{\epsilon}.   (10)

Hence, when p tends to infinity, minimizing φ_p is equivalent to maximizing φ_Mm. Therefore, in practice, a large value of p is taken. The value p = 50, proposed in [27], has been shown to be sufficient up to d = 20 in our numerical experiments. The commonly so-called maximin design (x^{(i)})_{i=1...N} is the one which maximizes φ_Mm and minimizes the number of pairs of points exactly separated by the minimal distance. In the following, we call a maximin LHS an LHS optimized with respect to the φ_p criterion or the φ_Mm one.
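To make the previous criteria concrete, the following R sketch (our own illustration; comparable routines, such as mindist() quoted above, are provided by the DiceDesign package) evaluates, for a design matrix X with rows in [0,1]^d, the centered L2-discrepancy (4), the mindist criterion (7) and the regularized criterion φ_p (8).

# Squared centered L2-discrepancy of formula (4); X is an N x d matrix with entries in [0,1]
c2_discrepancy <- function(X) {
  N <- nrow(X); d <- ncol(X)
  Z <- abs(X - 0.5)
  t1 <- (13/12)^d
  t2 <- (2/N) * sum(apply(1 + 0.5*Z - 0.5*Z^2, 1, prod))
  t3 <- 0
  for (i in 1:N) for (j in 1:N)
    t3 <- t3 + prod(1 + 0.5*Z[i, ] + 0.5*Z[j, ] - 0.5*abs(X[i, ] - X[j, ]))
  t1 - t2 + t3 / N^2
}

# mindist criterion (7), to be maximized, and phi_p criterion (8), to be minimized
mindist_crit <- function(X) min(dist(X))
phip_crit    <- function(X, p = 50) sum(dist(X)^(-p))^(1/p)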
3.2.3 MST criteria

The Minimum Spanning Tree (MST) criteria [6], recently introduced for studying SFD [11, 9], enable the analysis of the geometrical profile of a design according to the distances between its points. Regarding the design points as vertices, a MST is a tree which connects all the vertices together and whose sum of edge lengths is minimal. Once a MST is built for a design, the mean m and the standard deviation σ of its edge lengths can be calculated. Designs described as quasi-periodic are characterized by a large mean m and a small σ ([11]) compared to random designs or standard LHS (see some examples in [14]). Such quasi-periodic designs fill the space efficiently from the point-distance perspective: a large m is related to large interpoint distances, and a small σ means that the minimal interpoint distances between all couples of points are similar. Moreover, one can introduce a partial order relation for designs based on the MST: a design D1 fills the space better than a design D2 if m(D1) > m(D2) and σ(D1) < σ(D2).

The MST is a relevant approach focusing on the design arrangement and, because m and σ are global characteristics, it provides a much more robust diagnosis than the mindist criterion does. While a high mindist value implies a quasi-periodic distribution, the reciprocal is false, as illustrated in Figure 5. Besides, the MST criteria appear rather difficult to optimize using stochastic algorithms, unlike the previous criteria (see the next section). However, our numerical experiments lead to the conclusion that producing maximin LHS is equivalent to building a quasi-periodic distribution within the LHS design class.

Figure 5: Illustration of two quasi-periodic LHS (left: design with a good (large) mindist value; right: design with a bad (small) mindist value).

3.3 Optimization of Latin Hypercube Samples

Within the class of Latin hypercube arrangements, optimizing a space-filling criterion in order to avoid undesirable arrangements (such as the diagonal ones, which are the worst cases; see also Figure 4) appears very relevant. The optimization can be performed following different approaches, the most natural being the choice of the best LHS (according to the chosen criterion) among a large number of them (e.g. one thousand). Due to the extremely large number of possible LHS ((N!)^d for discretized LHS and an infinity for randomized LHS), this method is rather inefficient. Other methods have been developed, based on columnwise-pairwise exchange algorithms, genetic algorithms, Simulated Annealing (SA), etc.: see [39] for a review. Thus, some practical issues are which algorithm to use to optimize LHS and how to set its numerical parameters. Since an exhaustive benchmark of the available methods (with different parameterizations for the more flexible ones) is hardly possible, we choose to focus on a limited number of specialized stochastic algorithms: the Morris and Mitchell (MM) version of SA [27], a simple variant of MM developed in [23] (Boussouf algorithm) and a stochastic algorithm called ESE ("Enhanced Stochastic Evolutionary" algorithm, [17]). We compare their performance in terms of different space-filling criteria of the resulting designs.

3.3.1 SA algorithms

SA is a probabilistic metaheuristic for solving global optimization problems. The approach can provide a good optimizing point in a large search space. Here, we would like to explore the space of LHS. In fact, the optimization is carried out from an initial LHS (a standard random LHS), which is (expected to be) improved through elementary random changes. An elementary change of a LHS X_d^N is done by switching two randomly chosen coordinates within a randomly chosen column, which keeps the Latin hypercube nature of the sample; a minimal sketch of this operation is given below.
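The following R lines are our own illustration of this elementary change, not the reference implementation of [27] or [17]:

# Elementary LHS perturbation: exchange two values within one randomly chosen column,
# which preserves the one-value-per-stratum property of each input
elementary_change <- function(X) {
  j <- sample(ncol(X), 1)         # randomly chosen column (input)
  rows <- sample(nrow(X), 2)      # two distinct randomly chosen points
  X[rows, j] <- X[rev(rows), j]   # swap their coordinates in column j
  X
}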
The re-evaluation of the criterion after each elementary change could be very costly (in particular for discrepancies). Yet, taking into account that only two coordinates are involved in an elementary change leads to cheap expressions for the re-evaluation. In [17], formulas to re-evaluate φ_p and the C2 discrepancy in a straightforward way have been established. We have extended them to any L2-discrepancy (including W2 and the star L2-discrepancy).

The main ideas of SA are the following. Designs which do not improve the criterion (bad designs) can be accepted in order to avoid getting trapped around a local optimum. At each iteration, an elementary change of the current design is proposed, then accepted with a probability which depends on a quantity T called the temperature, which evolves from an initial temperature T0 according to a certain temperature profile. The temperature decreases with the iterations, and fewer and fewer bad designs are accepted. The main issue of SA is to properly set the initial temperature and the parameters which define the profile, so as to get a good trade-off between a sufficiently wide exploration of the space and a quick convergence of the algorithm. Finally, a stopping criterion must be specified. The experiments hereafter are based on a maximum number of iterations (useful to compare the different algorithms), but more sophisticated criteria could be more relevant to save computations.

The Boussouf SA algorithm has been introduced in [23]. The temperature decreases following a geometric profile, T = c^i × T0 at the i-th iteration, with 0 < c < 1. Therefore, the temperature decreases exponentially with the iterations and c must be set very close to 1 when the dimension d is high. In this case, SA can sufficiently explore the space of LHS designs if enough iterations are performed, and the criterion rapidly tends to a correct approximation of the optimum.

The MM (Morris and Mitchell) SA algorithm [27] was initially proposed to generate maximin LHS. It can be used to optimize discrepancy criteria, or others, as well. In opposition to the Boussouf SA, its temperature profile is linear and the temperature does not change at every iteration. Moreover, its decrease is governed by a parameter Imax: T decreases only if the criterion has not been improved over a run of Imax iterations. Morris & Mitchell proposed some heuristic rules to set the different parameters of the algorithm from N and d. We noticed that these rules do not always perform well: some settings can lead to relatively slow convergence.
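To fix ideas, a schematic R version of such a SA loop with the geometric profile T = c^i × T0 is sketched below. It is a simplified illustration with default values of our own choosing, not the reference implementation of [23] or [27]; it minimizes a criterion crit (e.g. φ_p or the C2-discrepancy of the earlier sketches).

# Schematic SA minimization of a space filling criterion over the LHS class
sa_optimize <- function(X, crit, n_iter = 10000, T0 = 10, cooling = 0.95) {
  val <- crit(X)
  Temp <- T0
  for (i in 1:n_iter) {
    Xnew <- elementary_change(X)       # elementary LHS perturbation of section 3.3.1
    valnew <- crit(Xnew)
    # accept improvements, and worse designs with probability exp(-(valnew - val)/Temp)
    if (valnew < val || runif(1) < exp(-(valnew - val) / Temp)) {
      X <- Xnew
      val <- valnew
    }
    Temp <- cooling * Temp             # geometric profile: Temp = cooling^i * T0
  }
  X
}
# Example: maximin-type LHS via sa_optimize(lhs_random(100, 10), phip_crit)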
3.3.2 Enhanced Stochastic Evolutionary algorithm (ESE)

ESE is an efficient and flexible stochastic algorithm to optimize LHS [17]. It relies on a precise control of a quantity similar to the temperature of SA, through an exploration step followed by an improving process. Unlike SA, the temperature can increase from one iteration to the next. Furthermore, M new LHS are randomly built from the current one at each step (M = 1 for SA). The authors expect that ESE can improve LHS using fewer elementary perturbations than SA (see our test below). Furthermore, the default algorithm parameters (used in Figure 6) suggested by the authors appeared efficient whatever d is.

3.3.3 Feedback on LHS optimization

The Boussouf SA holds a geometric profile which is well adapted for d = 2 and 3. When d increases, it becomes more and more difficult to approach the global optimum. Indeed, the algorithm is rather sensitive to T0 and c, which are delicate to set. The linear profile is actually preferable to perform an efficient exploration of the space. To compare the performances of MM SA and ESE through an example (dimension d = 5), Figure 6 focuses on the evolution of the mindist values. We can see that with only 4 iterations, the LHS optimized by ESE reach mindist values over 0.5. Let us recall that for MM SA, an iteration corresponds to Imax elementary perturbations, while for ESE, an iteration corresponds to M elementary perturbations. Regarding the structure of the algorithms, we can easily compute the corresponding number of elementary perturbations: ESE produces 20000 perturbations in 4 iterations, which is much smaller than for SA, which needs approximately 60000 perturbations to exceed a mindist value of 0.5. As a consequence, ESE seems a powerful routine to quickly produce LHS with excellent mindist values. Similar results are obtained with any L2-discrepancy measure as well.

We performed an additional test to show the interest of the regularization of the mindist criterion (see section 3.2.2). We produced some optimized LHS using both the mindist criterion and the φ_p criterion (see Figure 7). When the optimizations are performed with φ_p, a clear improvement is noted. Hence, we use the φ_p criterion in the following to build maximin LHS.

As we need an algorithm with a fast convergence, we use the Boussouf SA in the following section to carry out LHS comparisons. Our results do not suffer from the non-optimality of the resulting designs because our purpose is just to compare space filling criteria.

Figure 6: For maximin LHS designs (n = 50, d = 5, p = 50) obtained by the MM SA and ESE algorithms: mindist criterion as a function of the algorithm iteration number. Boxplots are produced from 30 optimizations at each iteration. Left: MM SA with T0 = 2, Imax = 100, c = 0.9. Right: ESE with T0 = 0.005 × φ_p(LHS), M = 500.

Figure 7: For maximin LHS designs (n = 100, d = 10) obtained by the Boussouf SA (T0 = 10, c = 0.95): mindist criterion as a function of the algorithm iteration number. For each iteration, the mindist criterion is taken as the mean over 30 optimizations. Optimization criteria are φ_p (red curve) and mindist (green curve).

4 Robustness to projections over 2D subspaces

4.1 Motivations

An important characteristic of a SFD X_d^N over U_d is its robustness to projections over lower-dimensional subspaces, which means that the k-dimensional subsamples of the SFD, k < d, obtained by deleting d − k columns of the matrix X_d^N, fill U_k efficiently (according to a space filling criterion). A LHS structure for the SFD is not sufficient because it only guarantees good repartitions for one-dimensional projections, not for projections of greater dimensions. Indeed, to capture precisely an interaction effect between some inputs, a good coverage is required on the subspace spanned by these inputs (see section 2.2). Another reason why this property of robustness really matters is that a metamodel fitting can be made in a smaller dimension than d (see an example in [2]). In practice, this is often the case because the output analysis of an initial design ("screening step") may reveal some useless (i.e. non-influential) input variables that can be neglected during the metamodel fitting step [32]. Moreover, when a selection of input variables is made during the metamodel fitting step (as for example in [25]), the new sample, solely including the retained input variables, has to keep good space filling properties. In the remainder of the paper, we focus on the space filling quality of the 2D projections of optimized LHS.
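In practice, the 2D diagnostics used below amount to looping over all pairs of columns of the design and evaluating a criterion on each pair. The following R sketch (our own illustration, reusing the criterion functions sketched in section 3.2) does this for any criterion, and also computes the MST criteria (m, σ) of section 3.2.3 with a basic Prim algorithm.

# MST criteria of section 3.2.3: mean and standard deviation of the MST edge lengths
mst_criteria <- function(X) {
  D <- as.matrix(dist(X))
  N <- nrow(D)
  in_tree <- c(TRUE, rep(FALSE, N - 1))             # Prim's algorithm, starting from point 1
  edges <- numeric(N - 1)
  for (k in 1:(N - 1)) {
    Dcut <- D[in_tree, !in_tree, drop = FALSE]      # distances crossing the current cut
    edges[k] <- min(Dcut)
    new_pt <- which(!in_tree)[which(Dcut == min(Dcut), arr.ind = TRUE)[1, 2]]
    in_tree[new_pt] <- TRUE                         # add the closest outside point to the tree
  }
  c(m = mean(edges), sigma = sd(edges))
}

# Evaluate a criterion on all d(d-1)/2 two-dimensional subprojections of X
subproj2d_crit <- function(X, crit) {
  pairs <- combn(ncol(X), 2)
  apply(pairs, 2, function(jk) crit(X[, jk]))
}
# Example: boxplot(subproj2d_crit(X, c2_discrepancy)) gives the kind of diagnostics shown below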
Moreover, most of the time, interaction effects of order two (i.e. between two inputs) are significantly larger than interactions of order three, and so on. Therefore, we only consider the 2D subprojections in this first study.

Discrepancy and point-distance based criteria can be regarded as relevant measures to quantify the quality of space-filling designs. Unfortunately, it has been shown that they are incompatible in high dimension, in the sense that a quasi-periodic design (see section 3.2.3) does not reach the lowest value of any discrepancy measure among LHS of the same size N. If we compute the MST criteria of large-dimensional LHS optimized with respect to both kinds of criteria, we observe a difference for the mean (m) values and the standard deviation (σ) values (see section 3.2.3). Moreover, we observe that the mean (resp. σ) of the C2-discrepancy optimized LHS is larger (resp. smaller) than the mean (resp. σ) of the W2-discrepancy optimized LHS (see Figure 8). This is the reason why we will mainly focus on the C2-discrepancy instead of the W2-discrepancy (in view of the partial order relation defined in section 3.2.3).

Below, we perform some tests to study the subprojection uniformity of LHS optimized using different space-filling criteria. The quality of the optimized LHS is analyzed using discrepancies, then considering the MST criteria. All the tests of this section are made with N = 100 design points and a design dimension d ranging from 2 to 54. To our knowledge, this is the first numerical study with such a range of design dimensions. To capture any variability due to the optimization process, boxplots are built from all the 2D subsamples of five optimized LHS per dimension.

Figure 8: m and σ MST criteria of C2- and W2-discrepancy optimized LHS (N = 100).

4.2 Analysis according to L2 discrepancy measures

First, let us look at the discrepancies of the 2D subsamples of C2-optimized and W2-optimized LHS of dimension d. One can observe in Figure 9 that the LHS optimized with these criteria are robust, in the sense that the 2D projections keep a reduced discrepancy value (by looking at the median of the boxplots). This is fully consistent with Fang's requirement for these modified discrepancy designs (see section 3.2.1). Moreover, the discrepancy increase is regular as the dimension increases, and it seems to tend to an asymptotic value. When the same experiment is performed with the L2-star discrepancy (which is an unmodified discrepancy), the results are different (see Figure 10). In this case, the discrepancy increase is rather sharp and the optimization no longer has any influence on the 2D subprojections from dimension 10 onwards.

In Figure 11 (left), we perform the same experiment with standard LHS. Obviously, as no optimization process is carried out, the 2D subsample discrepancy values behave in the same way whatever the dimension. It gives us a reference C2 value (approximately 0.017) for which the optimization process has no influence. It reveals, in view of Figure 9 (left), that the optimization of C2 remains efficient for large dimensions d (larger than 50), because the convergence to the 0.017 value has not been reached. In Figure 11 (right), we perform the same experiment with a classical low-discrepancy sequence, the Sobol' one, with the Owen scrambling in order to break the alignments created by this deterministic sequence.
It confirms a well-known fact: the Sobol' sequence has several poor 2D subprojections in terms of discrepancy criteria, especially in high dimension.

Figure 9: Discrepancy values of the 2D subsamples of the discrepancy optimized LHS: C2 discrepancy (left) and W2 discrepancy (right).

Figure 10: L2-star (left) and C2 (right) discrepancy values of the 2D subsamples of the L2-star discrepancy optimized LHS.

Figure 11: C2 discrepancy values of the 2D subsamples of the standard LHS (left) and of the scrambled Sobol' sequence (right).

Finally, the same tests are performed on maximin designs. Previous works ([23, 15]) have shown that the mindist-optimized (maximin) designs are not robust in terms of the mindist criterion on their 2D subprojections. Figure 12 now shows that the maximin designs are not robust either in terms of two different discrepancy criteria on their 2D subprojections. As for the L2-star optimized design, the discrepancy increase is rather sharp and the optimization no longer has any influence on the 2D subprojections from dimension 10 onwards. This strongly confirms previous conclusions: the maximin design is not recommended if one of the objectives is to have good space filling coverage of the subprojections of the design.

Figure 12: Discrepancy values of the 2D subsamples of maximin LHS: L2 discrepancy (left) and C2 discrepancy (right).

4.3 Analysis according to the MST criteria

Due to the lack of robustness of the mindist criterion mentioned in section 3.2.3, only the MST criteria are considered from the point-distance perspective. We compute the MST criteria over all 2D projections. One can note that the MST built over the 2D subsamples of maximin LHS have a small m and a high σ, because of the rows of aligned points inherent in the entire design (see Figure 13). Regarding the LHS optimized with the C2 discrepancy criterion, one can note, unlike previously, a gradual decline of m and σ (see Figure 14). As a consequence, we conclude that such designs are robust in the sense that they are less subject to the presence of clustered points in the subprojection spaces.

Figure 13: m and σ MST criteria of the 2D subsamples of the maximin LHS.

Figure 14: m and σ MST criteria of the 2D subsamples of the C2-discrepancy optimized LHS.

5 Conclusions and perspectives

This paper has considered several issues in building designs of computer experiments, in the class of SFD. Some industrial needs have first been given as challenges: high-dimensional SFD are often required (several tens of variables), while preserving good space filling properties on the design subprojections. We have recalled two common measures of space filling (interpoint-distance criteria such as the mindist criterion, and L2 discrepancy criteria) and the recently introduced criteria based on the MST of the design points. For comparison studies, we have shown that the MST criteria are preferable to the well-known mindist criterion.

Focusing on the class of LHS, some clarifications have been given on several common optimization algorithms for LHS. In numerical tests, we have shown that the stochastic algorithm ESE converges more rapidly than the MM SA algorithm. We have also numerically confirmed that maximin LHS should be obtained using the regularized criterion φ_p (with p = 50 for instance) instead of the mindist criterion. Intensive numerical tests have been performed in order to compare optimized LHS in terms of space filling criteria (L2 discrepancy and minimal spanning tree criteria).
With designs of size N = 100, the dimensions range from 2 to 54, which is, to our knowledge, the first numerical study with such a range of design dimensions. Another contribution of this paper is the deep analysis of the space filling properties of the 2D subprojections of the designs. Among the tested designs (LHS, maximin LHS, several L2 discrepancy optimized LHS, Sobol' sequence), only the centered (C2) and wrap-around (W2) discrepancy optimized LHS have shown strong robustness properties in high dimension. This result numerically confirms the theoretical considerations of [8]. The other tested designs are no longer robust in subprojection when their dimension is larger than 10. Tests on other types of designs, not shown here, lead to the same conclusions. Moreover, we have shown that the C2-discrepancy optimized LHS give more regular designs than the W2-discrepancy optimized LHS.

As a perspective, such an analysis can be extended to subprojections of larger dimensions. For example, in a preliminary study, [23] has confirmed the same conclusions on the design subprojection properties by considering 3D subsamples of the designs. Another future work would be to carry out a more exhaustive and deeper benchmark of the optimization algorithms for LHS. For example, an idea would be to look at the convergence of the maximin LHS to the exact solutions. These solutions are known in several cases (small N and small d) for the non-randomized maximin LHS (see [38]). Finally, all our numerical tests have been computed within the R environment, partially using the DiceDesign package. We hope to include soon in this free package the calculations of the MST criteria and the three LHS optimization algorithms used in this paper.

6 Acknowledgments

Part of this work has been backed by the French National Research Agency (ANR) through the COSINUS program (project COSTA BRAVA no. ANR-09-COSI-015). We thank Luc Pronzato for helpful discussions and Catalina Ciric for providing the prey-predator model example.

References

[1] B. Auder, A. de Crécy, B. Iooss, and M. Marquès. Screening and metamodeling of computer experiments with functional outputs. Application to thermal-hydraulic computations. Reliability Engineering and System Safety, in press.

[2] C. Cannamela, J. Garnier, and B. Iooss. Controlled stratification for quantile estimation. Annals of Applied Statistics, 2:1554–1580, 2008.

[3] C. Ciric, P. Ciffroy, and S. Charles. Use of sensitivity analysis to discriminate non-influential and influential parameters within an aquatic ecosystem model. Ecological Modelling, in press.

[4] A. de Crécy, P. Bazin, H. Glaeser, T. Skorek, J. Joucla, P. Probst, K. Fujioka, B.D. Chung, D.Y. Oh, M. Kyncl, R. Pernica, J. Macek, R. Meca, R. Macian, F. D'Auria, A. Petruzzi, L. Batet, M. Perez, and F. Reventos. Uncertainty and sensitivity analysis of the LOFT L2-5 test: Results of the BEMUSE programme. Nuclear Engineering and Design, 12:3561–3578, 2008.

[5] E. de Rocquigny, N. Devictor, and S. Tarantola, editors. Uncertainty in industrial practice. Wiley, 2008.

[6] C. Dussert, G. Rasigni, M. Rasigni, and J. Palmari. Minimal spanning tree: A new approach for studying order and disorder. Physical Review B, 34(5):3528–3531, 1986.

[7] K-T. Fang. Wrap-around L2-discrepancy of random sampling, Latin hypercube and uniform designs. Journal of Complexity, 17:608–624, 2001.

[8] K-T. Fang, R. Li, and A. Sudjianto. Design and modeling for computer experiments. Chapman & Hall/CRC, 2006.

[9] J. Franco.
Planification d'expériences numériques en phase exploratoire pour la simulation des phénomènes complexes. Thèse de l'Ecole Nationale Supérieure des Mines de Saint-Etienne, 2008.

[10] J. Franco, X. Bay, B. Corre, and D. Dupuy. Strauss processes: A new space-filling design for computer experiments. In Proceedings of the Joint Meeting of the Statistical Society of Canada and the Société Française de Statistique, Ottawa, Canada, May 2008.

[11] J. Franco, O. Vasseur, B. Corre, and M. Sergent. Minimum spanning tree: A new approach to assess the quality of the design of computer experiments. Chemometrics and Intelligent Laboratory Systems, 97:164–169, 2009.

[12] J.E. Gentle. Random number generation and Monte Carlo methods. Springer, 2003.

[13] F.J. Hickernell. A generalized discrepancy and quadrature error bound. Mathematics of Computation, 67:299–322, 1998.

[14] B. Iooss. Space filling designs: some algorithms and numerical results on industrial problems. Workshop "Accelerating productivity via deterministic computer experiments and stochastic simulation experiments", Isaac Newton Institute, Cambridge, UK, September 2011. http://www.newton.ac.uk/programmes/DAE/seminars/090610001.html.

[15] B. Iooss, L. Boussouf, V. Feuillard, and A. Marrel. Numerical studies of the metamodel fitting and validation processes. International Journal of Advances in Systems and Measurements, 3:11–21, 2010.

[16] B. Iooss, A-L. Popelin, G. Blatman, C. Ciric, F. Gamboa, S. Lacaze, and M. Lamboni. Some new insights in derivative-based global sensitivity measures. In Proceedings of the ESREL 2012 Conference, Helsinki, Finland, June 2012.

[17] R. Jin, W. Chen, and A. Sudjianto. An efficient algorithm for constructing optimal design of computer experiments. Journal of Statistical Planning and Inference, 134:268–287, 2005.

[18] M.E. Johnson, L.M. Moore, and D. Ylvisaker. Minimax and maximin distance designs. Journal of Statistical Planning and Inference, 26:131–148, 1990.

[19] J.P.C. Kleijnen and R.G. Sargent. A methodology for fitting and validating metamodels in simulation. European Journal of Operational Research, 120:14–29, 2000.

[20] J.R. Koehler and A.B. Owen. Computer experiments. In S. Ghosh and C.R. Rao, editors, Design and analysis of experiments, volume 13 of Handbook of Statistics. Elsevier, 1996.

[21] C. Lemieux. Monte Carlo and quasi-Monte Carlo sampling. Springer, 2009.

[22] J.L. Loeppky, J. Sacks, and W.J. Welch. Choosing the sample size of a computer experiment: A practical guide. Technometrics, 51:366–376, 2009.

[23] A. Marrel. Mise en oeuvre et exploitation du métamodèle processus gaussien pour l'analyse de modèles numériques - Application à un code de transport hydrogéologique. Thèse de l'INSA Toulouse, 2008.

[24] A. Marrel, B. Iooss, B. Laurent, and O. Roustant. Calculations of the Sobol indices for the Gaussian process metamodel. Reliability Engineering and System Safety, 94:742–751, 2009.

[25] A. Marrel, B. Iooss, F. Van Dorpe, and E. Volkova. An efficient methodology for modeling complex computer codes with Gaussian processes. Computational Statistics and Data Analysis, 52:4731–4744, 2008.

[26] M.D. McKay, R.J. Beckman, and W.J. Conover. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21:239–245, 1979.

[27] M.D. Morris and T.J. Mitchell. Exploratory designs for computational experiments. Journal of Statistical Planning and Inference, 43:381–402, 1995.

[28] M. Munoz-Zuniga, J. Garnier, E. Remy, and E. de Rocquigny.
Adaptive directional stratification for controlled estimation of the probability of a rare event. Reliability Engineering and System Safety, in press.

[29] H. Niederreiter. Random number generation and quasi-Monte Carlo methods. SIAM, 1992.

[30] J-S. Park. Optimal Latin-hypercube designs for computer experiments. Journal of Statistical Planning and Inference, 39:95–111, 1994.

[31] L. Pronzato and W. Müller. Design of computer experiments: space filling and beyond. Statistics and Computing, 22:681–701, 2012.

[32] G. Pujol. Simplex-based screening designs for estimating metamodels. Reliability Engineering and System Safety, 94:1156–1160, 2009.

[33] J. Sacks, W.J. Welch, T.J. Mitchell, and H.P. Wynn. Design and analysis of computer experiments. Statistical Science, 4:409–435, 1989.

[34] A. Saltelli, M. Ratto, S. Tarantola, and F. Campolongo. Sensitivity analysis practices: Strategies for model-based inference. Reliability Engineering and System Safety, 91:1109–1125, 2006.

[35] T. Santner, B. Williams, and W. Notz. The design and analysis of computer experiments. Springer, 2003.

[36] M.C. Shewry and H.P. Wynn. Maximum entropy sampling. Journal of Applied Statistics, 14:165–170, 1987.

[37] T.W. Simpson, J.D. Peplinski, P.N. Koch, and J.K. Allen. Metamodels for computer-based engineering design: Survey and recommendations. Engineering with Computers, 17:129–150, 2001.

[38] E.R. van Dam, B. Husslage, D. den Hertog, and H. Melissen. Maximin Latin hypercube designs in two dimensions. Operations Research, 55:158–169, 2007.

[39] F.A.C. Viana, G. Venter, and V. Balabanov. An algorithm for fast optimal Latin hypercube design of experiments. International Journal for Numerical Methods in Engineering, 82:135–156, 2010.