
Numerical studies of space filling designs: optimization
algorithms and subprojection properties
G. Damblin, M. Couplet and B. Iooss
EDF R&D, 6 Quai Watier, F-78401, Chatou, France
Submitted to: Journal of Simulation
for the special issue “Input & Output Analysis for Simulation”
Correspondence: B. Iooss; Email: [email protected]
Phone: +33-1-30877969 ; Fax: +33-1-30878213
Abstract
Quantitative assessment of the uncertainties tainting the results of computer simulations is
nowadays a major topic of interest in both industrial and scientific communities. One of the key
issues in such studies is to get information about the output when the numerical simulations are
expensive to run. This paper considers the problem of exploring the whole space of variations
of the computer model input variables in the context of a large dimensional exploration space.
Various properties of space filling designs are justified: interpoint-distance, discrepancy and minimal
spanning tree criteria. A specific class of designs, the optimized Latin Hypercube Sample, is
considered. Several optimization algorithms, coming from the literature, are studied in terms
of convergence speed, robustness to subprojection and space filling properties of the resulting
design. Some recommendations for building such designs are given. Finally, another contribution
of this paper is an in-depth analysis of the space filling properties of the 2D subprojections of the designs.
Keywords: discrepancy, optimal design, Latin Hypercube Sampling, computer experiment.
1 Introduction
Many computer codes, for instance simulating physical phenomena and industrial systems, are
too time-consuming to be directly used to perform uncertainty, sensitivity, optimization or robustness analyses [5]. A widely accepted method to circumvent this problem consists in replacing such computer models by CPU-inexpensive mathematical functions, called metamodels
[19], built from a limited number of simulation outputs. Some commonly used
metamodels are: polynomials, splines, generalized linear models, or learning statistical models
like neural networks, regression trees, support vector machines and Gaussian process models
[8]. In particular, the efficiency of Gaussian process modelling has been demonstrated, for instance,
by [33, 35, 25]. It extends the kriging principles of geostatistics to computer experiments by
considering that the code responses are correlated according to the relative locations of the
corresponding input variables.
A necessary condition for successful metamodelling is to explore the whole space X ⊂ R^d
of the input variables X ∈ X (called the inputs) in order to capture the non-linear behaviour
of some output variables Y = G(X) ∈ R^q (where G refers to the computer code). This step,
often called the Design of Computer Experiments (DoCE), is the subject of this paper. In many
industrial applications, we are faced with the harsh problem of the high dimensionality of the
space X to explore (several tens of inputs). Some authors [37, 8] have shown that Space
Filling Designs (SFD) are well suited to this task. A SFD aims at obtaining the best coverage
of the space of the inputs. Moreover, the SFD approach appears natural in a first exploratory
phase, when very few pieces of information are available about the numerical model, or if the
design is expected to serve different objectives (for example, providing a metamodel usable for
several uncertainty quantifications relying on different hypotheses about the uncertainty on the
inputs). However, the class of SFD is large, including the well-known Latin Hypercube Samples
(LHS)¹, low discrepancy sequences [29, 8], maximum entropy designs [36], minimax and maximin
designs [18] or point process designs [10]. Here, the purpose is to shed new light on the practical
issue of building a SFD.
In the following, X is assumed to be [0, 1]^d, up to a bijection. Such a bijection is never unique
and two different bijections generally lead to non-equivalent ways of filling the space of the inputs. In practice,
during an exploratory phase, only pragmatic answers can be given to the question of the choice
of the bijection: maximum and minimum bounds are generally given to each scalar input, so that
X is a hypercube which can be mapped onto [0, 1]^d through a linear transformation. It can be
noticed that, if there is a sufficient reason to do so, considering what the computer code actually
models, it is always possible to apply simple changes of input variables (e.g.
considering the input z = exp(x) ∈ [exp(a), exp(b)] instead of x ∈ [a, b]). Furthermore, if a joint
probability distribution is given to the inputs, we argue that it remains interesting to define a
SFD to explore X as soon as a bijection such that the input image distribution is uniform over
U_d = [0, 1]^d can be handled (it suffices to invert the marginal cumulative distribution functions
in the case of independent scalar inputs; see section 3.1).
In what follows, the fact that the problem finally comes down to the "homogeneous" filling of U_d
is postulated, and the main question addressed is how to build or select a DoCE of a given
(and small) size N (N ∼ 100, typically) within U_d (where d > 10, typically). We also keep in
mind the well-known empirical relation N ∼ 10d [22, 24], which gives the approximate
minimum number of computations needed to get an accurate metamodel.

¹ In the following, LHS may refer to Latin Hypercube Sampling as well.
A first consideration is to obtain the best global coverage rate of Ud . It requires the definition
of criteria based on distance, geometrical or uniformity measures [18, 20]. A second prescription
is to uniformly cover the variation domain of each scalar input. Indeed, it often happens that,
among a large number of inputs, only a small number are active, that is, significantly impact the
outputs (sparsity principle). Then, in order to avoid useless computations (different values for
inactive inputs but the same values for active ones), we have to ensure that all the values for each
input are different, which can be achieved by using LHS. A last important property of a SFD is
its robustness to projection over subspaces. This property is particularly studied in this paper
and the corresponding results can be regarded as the main contributions. The literature on the
application of physical experimental design theory shows that, in most practical cases,
effects of small degree (involving few factors, that is few inputs) dominate effects of greater
degree. Therefore, it seems judicious to favour a SFD whose subprojections offer some good
coverages of the low-dimensional subspaces. A first view, which is adopted here, is to explore
two-dimensional (2D) subprojections.
In the following section, two industrial examples are described in order to motivate our
concerns about SFD. Section 3 gives a review of coverage criteria and of the types of SFD studied
in the next section. In fact, the preceding considerations lead us to focus our attention on
optimized LHS. Various optimization algorithms for LHS have been previously proposed (the main
works are [27] and [17]). We adopt a numerical approach to compare the performance of different
LHS, as a function of their interpoint-distance and L2-discrepancies. Section 4 focuses on their
2D subprojection properties and numerical tests support some recommendations. A conclusion
synthesizes this work.
2 Motivating examples

2.1 Nuclear safety simulation studies
Assessing the performance of nuclear power plants during accidental transient conditions has
been the main purpose of the thermal-hydraulic safety research for decades. Sophisticated computer codes have been developed and are now widely used. They can calculate time trends of
any variable of interest during any transient in Light Water Reactors (LWR). However, the reliability of the predictions cannot be evaluated directly due to the lack of suitable measurements
in plants. The capabilities of the codes can consequently only be assessed by comparison of
calculations with experimental data recorded in small-scale facilities. Due to this difficulty, but
also the “best-estimate” feature of the codes quoted above, uncertainty quantification should
be performed when using them. In addition to uncertainty quantification, sensitivity analysis is
often carried out in order to identify the main contributors to uncertainty.
Those thermal-hydraulic codes make it possible, for example, to simulate a large-break loss of primary coolant accident (see Figure 1). This scenario is part of the Benchmark for Uncertainty
Analysis in Best-Estimate Modelling for Design, Operation and Safety Analysis of Light Water
Reactors [4] proposed by the Nuclear Energy Agency of the Organisation for Economic Cooperation and Development (OCDE/NEA). It has been implemented with the French computer
code CATHARE2, developed at the Commissariat à l'Energie Atomique (CEA). Figure 2 illustrates 100 Monte Carlo simulations (obtained by randomly varying the inputs of the accidental scenario),
given by CATHARE2, of the cladding temperature as a function of time, whose first peak is the
main output of interest in safety studies.
Figure 1: Illustration of a large-break loss of primary coolant accident on a nuclear Pressurized
Water Reactor (a particular but common type of LWR).
Severe difficulties arise when carrying out a sensitivity analysis or an uncertainty quantification
involving CATHARE2:
• Physical models involve complex phenomena (non linear and subject to threshold effects),
with strong interactions between inputs. A first objective is to detect these interactions.
Another one is to fully explore the combinations of the inputs to obtain a good idea of the
possible transient curves [1].
• Computer codes are cpu time expensive: no more than several hundreds of simulations
can be performed.
• Numerical models take as inputs a large number of uncertain variables (d = 50, typically):
physical laws essentially, but also initial conditions, material properties and geometrical
parameters. Truncated normal or log-normal distributions are given to them. Such a
number of inputs is extremely large for the metamodelling problem.

Figure 2: 100 output curves of the cladding temperature as a function of time from CATHARE2.
• The first peak of the cladding temperature can be related to rare events: problems then turn to
the estimation of a quantile [2] or of the probability that the output exceeds a threshold [28].
All of these four difficulties underline the fact that great care is required to define an effective
DoCE over the CATHARE2 input space. The high dimension of the input space remains a
challenge for building a SFD with good subprojection properties.
2.2 Prey-predator simulation chain
In ecological effect assessments, risks imputable to chemicals are usually estimated by extrapolation of single-species toxicity tests. With increasing awareness of the importance of indirect
effects and keeping in mind limitations of experimental tools, a number of ecological food-web
models have been developed. Such models are based on a large set of bioenergetic equations
describing the fundamental growth of each population, taking into account grazing and predator-prey interactions, as well as the influence of abiotic factors like temperature, light and nutrients.
They can be used for several purposes, for instance:
• to test various contamination scenarios or the recovery capacity of contaminated ecosystems,
• to quantify the important sources of uncertainty and knowledge gaps for which additional
data are needed, and to identify the most influential parameters of population-level impacts,
• to optimize the design of field or mesocosm tests by identifying the appropriate type, scale,
frequency and duration of monitoring.
Following this rationale, an aquatic ecosystem model, MELODY², was built so as to simulate
the functioning of aquatic mesocosms as well as the impact of toxic substances on the dynamics
of their populations. A main feature of this kind of ecological models is, however, the great
number of parameters involved in the modelling: MELODY has a total of 13 compartments
and 219 parameters; see Figure 3. These are generally highly uncertain because of both natural
variability and lack of knowledge. Thus, sensitivity analysis appears to be an essential step to identify
non-influential parameters [3]. These can then be fixed at a nominal value without significantly
impacting the output. Consequently, the calibration of the model becomes less complex [34].
Figure 3: Representation of the module chain of the aquatic ecosystem model MELODY.
By a preliminary sensitivity analysis of the periphyton-grazers submodel (representative of
processes involved in dynamics of primary producers and primary consumers and involving
20 input parameters), [16] concludes that significant interactions of high degree (more than
three) exist in this model. Therefore, a DoCE has to possess excellent subprojection properties
to capture the interaction effects.
² modelling MEsocosm structure and functioning for representing LOtic DYnamic ecosystems
3 Space filling criteria and designs
Building a DoCE consists in generating a matrix X_d^N = (x_j^{(i)})_{i=1..N, j=1..d}, where N is the number
of experiments and d the number of scalar inputs. Let us recall that, here, the purpose is to
design N experiments³ x^{(i)} to fill the set [0, 1]^d as "homogeneously" as possible; even if, in an
exploratory phase, a joint probability distribution is not explicitly given to the inputs, one can
consider that these are independent and uniformly distributed over [0, 1].
The most common sampling method is indisputably the classical Monte Carlo one (Simple Random Sampling, SRS), mainly because of its simplicity and generality [12], but also because of
the difficulty of sampling in a more efficient manner when d is large. In our context, it
would consist in randomly, uniformly and independently sampling d × N draws in [0, 1]. Yet, it
is known to possess poor space filling properties: SRS leaves wide unexplored regions and can
produce very close points.
The next sections are mainly dedicated to (optimized) LHS, owing to their property of
parsimony mentioned in the introduction, and do not refer to other interesting classes of SFD,
in particular neither maximum entropy designs nor point process designs. The former are based
on a criterion to maximize (the Shannon entropy), which could also be used to optimize LHS. The resulting
SFD are similar to point-distance optimized LHS (section 3.2.2), as shown by theoretical works
[31] and confirmed by our own experiments. The latter way of sampling seems hardly compatible
with LHS and suffers from the lack of efficient rules to set its parameters [9].
In this section, the criteria used hereafter to diagnose a DoCE or to optimize LHS are
defined. Computing optimized LHS requires an efficient, in fact specialized, optimization algorithm. Since the literature provides numerous ones, a brief overview of a selection of such
algorithms is proposed. The section ends with our feedback on their behaviours, with an emphasis on maximin DoCE.
3.1 Latin Hypercube Sampling
Latin Hypercube Sampling, which is an extension of stratified sampling, aims at ensuring that
each of the scalar inputs has the whole of its range well scanned, according to a probability distribution⁴ [26]. Even if our starting assumption is simply that it is relevant to fill U_d "homogeneously", which is achieved in the remainder of the paper by using LHS under independent uniform distributions (see the comments of sections 1 and 3), LHS is introduced in a broader context
below (independent but potentially non-uniform scalar inputs).
³ In the following, a "point" corresponds to an "experiment" (at least a subset of experimental conditions).
⁴ The range is the support of the distribution.

Let the range I of each scalar input X_j, j = 1 . . . d, be partitioned into N equally probable
intervals I_k. A LHS of size N is obtained from a random draw of N values x_j^{(k,*)} for each X_j,
k = 1 . . . N, one per interval I_k (according to the truncated distribution of X_j over I_k). Then, d
permutations π_j of {1, . . . , N} are randomly chosen (uniformly among the N! possibilities) and
applied to the N-tuples: x_j^{(i)} = x_j^{(k,*)} iff i = π_j(k). We thus obtain the matrix of experiments
X_d^N = (x_j^{(i)})_{i=1..N, j=1..d} of a LHS: the i-th line x^{(i)} of this matrix will correspond to the inputs
of the i-th code execution. See Figure 4 for an illustration. Another way to get a LHS is to draw
x_j^{(1)} inside I, then to draw x_j^{(2)} inside I \ I_l such that x_j^{(1)} ∈ I_l, then x_j^{(3)} inside I \ (I_l ∪ I_m)
such that x_j^{(2)} ∈ I_m, and so on (according to the truncated distributions). Eventually, if the X_j
are mutually independent random variables with invertible cumulative distribution functions
(CDF) F_j, then the i-th LHS draw for the j-th input can be created as

x_j^{(i)} = F_j^{-1}\left( \frac{\pi_j(i) - \xi_j^{(i)}}{N} \right),    (1)

where the π_j are independent uniform random permutations of the integers {1, 2, . . . , N}, and
the ξ_j^{(i)} are independent U([0, 1]) random numbers, independent from the π_j.
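As an illustration, here is a minimal base-R sketch of this construction for independent uniform inputs (so that F_j^{-1} is the identity); the function name is ours:

  # Randomized LHS of size N in dimension d over [0,1]^d, following equation (1)
  # with uniform inputs (F_j^{-1} is then the identity).
  randomized_lhs <- function(N, d) {
    X <- matrix(NA_real_, nrow = N, ncol = d)
    for (j in 1:d) {
      pi_j <- sample(1:N)           # random permutation of {1, ..., N}
      xi_j <- runif(N)              # independent U([0,1]) draws
      X[, j] <- (pi_j - xi_j) / N   # one point per stratum ((k-1)/N, k/N]
    }
    X
  }

  X <- randomized_lhs(N = 100, d = 10)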
Figure 4: Three examples of LHS of size N = 4 over U2 = [0, 1]2 (with regular intervals): each of
the N rows (each of the N columns, respectively), which corresponds to an interval of X1 (of X2 ,
resp.), contains one and only one draw x(i) (cross).
When building a LHS, another possibility is to select the center of each stratum instead of
drawing randomly. However, this discretized version of LHS is inadequate for the LHS optimization process: it leads to discrete values of the space filling criteria, making the convergence
more difficult [23]. We therefore only consider "randomized" LHS in this paper.
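For reference, the discretized variant simply replaces the random ξ_j^{(i)} of formula (1) by 1/2; in the R sketch given earlier this amounts to one changed line:

  X[, j] <- (pi_j - 0.5) / N   # center of each stratum instead of a random draw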
3.2 Space filling criteria
As stated previously, LHS is a relevant way to design experiments, considering one-dimensional
projection. Nevertheless, LHS does not ensure that the input space is filled properly. Some LHS
can indeed be really unsatisfactory, like the first design of Figure 4 which is almost diagonal.
LHS may consequently perform poorly in metamodel estimation and prediction of the model
output [15]. Therefore, some authors have proposed to enhance LHS so as to fill the space not only in
one-dimensional projection, but also in higher dimensions [30]. One powerful idea is to adopt
some optimality criterion applied to LHS, such as entropy, discrepancy, minimax and maximin
distances, etc. This helps to avoid undesirable situations, such as designs with close points.
The next sections propose some quantitative indicators of space filling useful i) to optimize
LHS or ii) to assess the quality of a design. Section 3.2.1 introduces some discrepancy measures
which are relevant for both purposes. Sections 3.2.2 and 3.2.3 introduce some criteria based on
the distances between the points of the design. The former is about the minimax and maximin
criteria, which are relevant for i) but not for ii), and the latter is about the MST criteria,
which give an interesting insight into the filling characteristics of a design but cannot
reasonably be used for i).
3.2.1 Uniformity criteria
Discrepancy measures consist in judging the uniformity quality of the design. Discrepancy can
be seen as a measure of the gap between the considered configuration and the uniform one. The
star discrepancy of a design X_d^N = (x^{(i)})_{i=1...N} over U_d is defined as

D^*(X_d^N) = \sup_{y \in U_d} \left| \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\{x^{(i)} \in [0,y]\}} - \mathrm{Volume}([0,y]) \right|, where [0, y] = [0, y_1] × · · · × [0, y_d].    (2)
As an interpretation, the discrepancy measure relies on a comparison between the volume of
intervals and the number of points within these intervals [13]. In fact, definition (2) corresponds
to the greatest difference between the value of the CDF of the uniform distribution over U_d
(right term) and the value of the empirical CDF of the design (left term). In practice, the star
discrepancy is not computable because of the L∞-norm used in formula (2). Hence L2-norms
are used [8, 21]. For example, the star L2-discrepancy can be written as follows:
D_2^*(X_d^N) = \left[ \int_{U_d} \left( \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\{x^{(i)} \in [0,y]\}} - \mathrm{Volume}([0,y]) \right)^2 dy \right]^{1/2}.    (3)
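For reference, formula (3) admits a well-known closed-form expansion (Warnock's formula, not given in this paper); a base-R sketch of it, with a function name of our own choosing, is:

  # Star L2-discrepancy of formula (3), computed with its standard closed-form
  # (Warnock) expansion; X is an N x d matrix of points in [0,1]^d.
  star_L2_disc <- function(X) {
    N <- nrow(X); d <- ncol(X)
    t1 <- 3^(-d)
    t2 <- (2/N) * sum(apply((1 - X^2) / 2, 1, prod))
    t3 <- 0
    for (i in 1:N) for (j in 1:N) {
      t3 <- t3 + prod(1 - pmax(X[i, ], X[j, ]))
    }
    sqrt(t1 - t2 + t3 / N^2)   # the 1/2 power of formula (3)
  }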
Different discrepancy definitions exist, obtained by using different forms of intervals⁵ or different
norms in the functional space. Discrepancy measures based on L2-norms are the most popular
in practice because they can be analytically expressed and are easy to compute. Among them,
two measures have shown remarkable properties [17, 7, 8]. Indeed, Fang has defined seven
requirements for uniformity measures, including subprojection uniformity (a particularity of the
so-called modified discrepancies) and invariance by coordinate rotation. Both the centered
discrepancy (C²) and the wrap-around discrepancy (W²) satisfy these requirements:
• the centered L2-discrepancy:

C^2(X_d^N) = \left( \frac{13}{12} \right)^d - \frac{2}{N} \sum_{i=1}^{N} \prod_{k=1}^{d} \left( 1 + \frac{1}{2} |x_k^{(i)} - 0.5| - \frac{1}{2} |x_k^{(i)} - 0.5|^2 \right) + \frac{1}{N^2} \sum_{i,j=1}^{N} \prod_{k=1}^{d} \left( 1 + \frac{1}{2} |x_k^{(i)} - 0.5| + \frac{1}{2} |x_k^{(j)} - 0.5| - \frac{1}{2} |x_k^{(i)} - x_k^{(j)}| \right),    (4)

• the wrap-around L2-discrepancy:

W^2(X_d^N) = - \left( \frac{4}{3} \right)^d + \frac{1}{N^2} \sum_{i,j=1}^{N} \prod_{k=1}^{d} \left( \frac{3}{2} - |x_k^{(i)} - x_k^{(j)}| \left( 1 - |x_k^{(i)} - x_k^{(j)}| \right) \right),    (5)

which suppresses bound effects (by wrapping the unit cube for each coordinate).

⁵ That is, intervals [z, y] such that z ≠ 0.
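Formula (4) can be transcribed directly; a base-R sketch (function name ours) is:

  # Centered L2-discrepancy of formula (4); X is an N x d matrix of points in [0,1]^d.
  centered_L2_disc <- function(X) {
    N <- nrow(X); d <- ncol(X)
    A <- abs(X - 0.5)
    term1 <- (13/12)^d
    term2 <- (2/N) * sum(apply(1 + 0.5*A - 0.5*A^2, 1, prod))
    term3 <- 0
    for (i in 1:N) for (j in 1:N) {
      term3 <- term3 + prod(1 + 0.5*A[i, ] + 0.5*A[j, ] - 0.5*abs(X[i, ] - X[j, ]))
    }
    term1 - term2 + term3 / N^2
  }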
3.2.2 Point-distance criteria
[18] introduced two distance-based criteria. The first idea consists in minimizing the largest distance
between a point of the input domain and the points of the design. The corresponding criterion
to minimize is called the minimax criterion φ_mM(·):

φ_mM(X_d^N) = \max_{x \in U_d} \min_{x^{(i)}} \| x - x^{(i)} \|_{L_p},    (6)

with p = 2 (Euclidean distance), typically. A small value of φ_mM for a design means that there
is no point of the input domain too distant from a point of the design. This appears important
from the point of view of the Gaussian process metamodel, which is typically based on the
assumption of decreasing correlation of outputs with the distance between the corresponding
inputs. However, this criterion needs the computation of all the distances between every point
of the domain and every point of the design. In practice, an approximation of φ_mM is obtained
via a fine discretization of the input domain. However, this approach becomes impracticable
for input dimensions d larger than three [31]. φ_mM could also be derived from the Delaunay
tessellation, which allows the computational cost to be reduced [31], but dimensions d larger than
four or five remain an issue.
A second proposition is to maximize the minimal distance separating two design points.
Let us note d_ij = ||x^{(i)} − x^{(j)}||_{L_p}, with p = 2, typically. The so-called mindist criterion φ_Mm
(referring for example to the mindist() routine of the DiceDesign R package) reads

φ_Mm(X_d^N) = \min_{i,j=1...N,\, i \neq j} d_{ij}.    (7)

For a given dimension d, a large value of φ_Mm tends to separate the design points from each
other, and so allows a better space coverage.
The mindist criterion has been shown to be easily computable but difficult to optimize.
Regularized versions of mindist have been listed in [31], allowing more efficient optimization to be carried out
in the class of LHS. In this paper, we use the φ_p criterion:

φ_p(X_d^N) = \left( \sum_{i,j=1...N,\, i<j} d_{ij}^{-p} \right)^{1/p}.    (8)
The following inequality, proved in [31], shows the asymptotic link between φ_Mm and φ_p. If one
defines ξ*_p as the design which minimizes φ_p and ξ* as the one which maximizes φ_Mm, then:

1 \geq \frac{\phi_{Mm}(\xi^*_p)}{\phi_{Mm}(\xi^*)} \geq \binom{n}{2}^{-1/p}.    (9)

Let ε be a threshold; then (9) implies:

\frac{\phi_{Mm}(\xi^*_p)}{\phi_{Mm}(\xi^*)} \geq 1 - \epsilon \quad \text{for } p \simeq \frac{2 \ln n}{\epsilon}.    (10)

Hence, when p tends to infinity, minimizing φ_p is equivalent to maximizing φ_Mm. Therefore, in
practice, a large value of p is taken. The value p = 50, which is proposed in [27], has been shown to be
sufficient up to d = 20 in our numerical experiments.
The commonly so-called maximin design (x^{(i)})_{i=1...N} is the one which maximizes φ_Mm
and, among such designs, minimizes the number of pairs of points exactly separated by the minimal distance. In the
following, we call a maximin LHS an LHS optimized with respect to the φ_p criterion or
to the φ_Mm one.
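Both point-distance criteria are straightforward to compute; a base-R sketch of equations (7) and (8), relying on dist() for the pairwise Euclidean distances, is:

  # Point-distance criteria for an N x d design X in [0,1]^d.
  mindist_crit <- function(X) min(dist(X))                      # phi_Mm, equation (7)
  phi_p_crit   <- function(X, p = 50) sum(dist(X)^(-p))^(1/p)   # phi_p,  equation (8)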
3.2.3 MST criteria
The Minimum Spanning Tree (MST) criteria [6], recently introduced for studying SFD [11, 9],
enable one to analyze the geometrical profile of designs according to the distances between points.
Regarding design points as vertices, a MST is a tree which connects all the vertices together
and whose sum of edge lengths is minimal. Once a MST has been built for a design, the mean m and
standard deviation σ of the edge lengths can be calculated. Designs described as quasi-periodic
are characterized by a large mean m and a small σ ([11]) compared to random designs or standard
LHS (see some examples in [14]). Such quasi-periodic designs fill the space efficiently from the
point-distance perspective: a large m is related to large interpoint-distances and a small σ means
that the minimal interpoint-distances between all couples of points are similar. Moreover, one
can introduce a partial order relation for designs based on the MST: a design D1 fills the
space better than a design D2 if m(D1) > m(D2) and σ(D1) < σ(D2).
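As an illustration, the two MST criteria can be obtained with a short Prim's-algorithm sketch in base R (our own helper; dedicated graph packages could be used instead):

  # MST criteria (mean m and standard deviation sigma of the MST edge lengths),
  # computed with Prim's algorithm on the full distance matrix of an N x d design X.
  mst_criteria <- function(X) {
    D <- as.matrix(dist(X))
    N <- nrow(D)
    in_tree <- c(TRUE, rep(FALSE, N - 1))   # grow the tree from point 1
    best <- D[1, ]                          # cheapest connection of each point to the tree
    edges <- numeric(N - 1)
    for (k in 1:(N - 1)) {
      cand <- which(!in_tree)
      nxt  <- cand[which.min(best[cand])]   # closest point not yet in the tree
      edges[k] <- best[nxt]
      in_tree[nxt] <- TRUE
      best <- pmin(best, D[nxt, ])          # update cheapest connections
    }
    c(m = mean(edges), sigma = sd(edges))
  }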
MST is a relevant approach focusing on the design arrangement and, because m and σ are global
characteristics, it provides a much more robust diagnosis than the mindist criterion does. While a design
with a high mindist value has a quasi-periodic distribution (Figure 5, left), the converse is false, as illustrated in
Figure 5 (right). Besides, the MST criteria appear rather difficult to optimize using
stochastic algorithms, unlike the previous criteria (see the next section). However, our numerical
experiments lead us to conclude that producing maximin LHS is equivalent to building a quasi-periodic distribution in the LHS design class.
Figure 5: Illustration of two quasi-periodic LHS (left: design with good (large) mindist, right:
design with bad (small) mindist).
3.3 Optimization of Latin Hypercube Sample
Within the class of Latin hypercube arrangements, optimizing a space-filling criterion in order
to avoid undesirable arrangements (such as the diagonal ones which are the worst cases, see also
Figure 4) appears very relevant. Optimization can be performed following different approaches,
the most natural being the choice of the best LHS (according to the chosen criterion) among
a large number (e.g. one thousand). Due to the extremely large number of possible LHS
((N!)^d for discretized LHS and an infinity for randomized LHS), this method is rather inefficient.
Other methods have been developed, based on columnwise-pairwise exchange algorithms, genetic
algorithms, Simulated Annealing (SA), etc.: see [39] for a review. Thus, some practical issues
are which algorithm to use to optimize LHS and how to set its numerical parameters. Since
an exhaustive benchmark of the available methods (with different parameterizations for the
more flexible ones) is hardly possible, we choose to focus on a limited number of specialized
stochastic algorithms: the Morris and Mitchell (MM) version of SA [27], a simple variant of
MM developed in [23] (Boussouf algorithm) and a stochastic algorithm called ESE ("Enhanced
Stochastic Evolutionary" algorithm, [17]). We compare their performance in terms of different space-filling
criteria of the resulting designs.
3.3.1 SA algorithms
SA is a probabilistic metaheuristic to solve global optimization problems. The approach can
provide a good optimizing point in a large search space. Here, we would like to explore the
space of LHS. In fact, the optimization is carried out from an initial LHS (standard random
LHS) which is (expected to be) improved through elementary random changes. An elementary
change of a LHS X_d^N is done by switching two randomly chosen coordinates of a randomly
chosen column, which keeps the Latin hypercube nature of the sample. The re-evaluation of the
criterion after each elementary change could be very costly (in particular for discrepancy). Yet,
taking into account that only two coordinates are involved in an elementary change leads to
cheap expressions for the re-evaluation. In [17], formulas to re-evaluate φ_p and the C² discrepancy
in a straightforward way have been established. We have extended them to any L2-discrepancy
(including W² and the star L2-discrepancy).
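A minimal base-R sketch of such an elementary change (the LHS structure is preserved because the two swapped values stay within the same column):

  # Elementary change used by the stochastic search: swap two randomly chosen
  # coordinates within one randomly chosen column of the LHS matrix X.
  elementary_change <- function(X) {
    j  <- sample(ncol(X), 1)         # random column
    ij <- sample(nrow(X), 2)         # two distinct random rows
    X[ij, j] <- X[rev(ij), j]        # swap the two coordinates
    X
  }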
The main ideas of SA are the following. Designs which do not improve the criterion (bad
designs) can be accepted in order to avoid getting trapped around a local optimum. At each iteration,
an elementary change of the current design is proposed, then accepted with a probability which
depends on a quantity T called temperature, which evolves from an initial temperature T0
according to a certain temperature profile. The temperature decreases with the iterations and
fewer and fewer bad designs are accepted. The main issue of SA is to properly set the initial
temperature and the parameters which define the profile, to get a good trade-off between a
sufficiently wide exploration of the space and a quick convergence of the algorithm. Finally, a
stopping criterion must be specified. The experiments hereafter are based on a maximum number
of iterations (useful to compare the different algorithms), but more sophisticated criteria could
be more relevant to save computations.
The Boussouf SA algorithm has been introduced in [23]. The temperature decreases
following a geometric profile T = c^i × T0 at the i-th iteration, with 0 < c < 1. Therefore,
the temperature decreases exponentially with the iterations and c must be set very close to 1
when the dimension d is high. In this case, SA can sufficiently explore the space of LHS designs if
enough iterations are performed and the criterion tends rapidly to a correct approximation of
the optimum.
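A compact sketch of such a geometric-profile SA producing maximin LHS (minimizing φ_p, see section 3.2.2) could read as follows; it reuses the sketches given earlier, takes the T0 = 10, c = 0.95 setting of Figure 7 as hypothetical defaults, and for simplicity re-evaluates the criterion fully at each step instead of using the cheap update formulas of [17]:

  # Simulated annealing sketch with a geometric temperature profile T = c^i * T0,
  # minimizing phi_p over the class of LHS (hence producing maximin LHS).
  # randomized_lhs(), elementary_change() and phi_p_crit() are the sketches above.
  sa_maximin_lhs <- function(N, d, n_iter = 10000, T0 = 10, c = 0.95, p = 50) {
    X <- randomized_lhs(N, d)
    crit <- phi_p_crit(X, p)
    for (i in 1:n_iter) {
      T <- c^i * T0
      X_new <- elementary_change(X)
      crit_new <- phi_p_crit(X_new, p)
      # accept improvements, and some degradations with probability exp(-delta / T)
      if (crit_new < crit || runif(1) < exp(-(crit_new - crit) / T)) {
        X <- X_new
        crit <- crit_new
      }
    }
    X
  }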
The MM (Morris and Mitchell) SA algorithm [27] was initially proposed to generate maximin
LHS. It can be used to optimize discrepancy criteria, or other criteria as well. In contrast to Boussouf
SA, its temperature profile is linear and the temperature does not change at every iteration.
Moreover, its decrease is governed by a parameter Imax: T decreases only if the criterion has
not been improved during a run of Imax iterations. Morris & Mitchell proposed some heuristic rules
to set the different parameters of the algorithm from N and d. We noticed that these rules do
not always perform well: some settings can lead to relatively slow convergence.
3.3.2 Enhanced Stochastic Evolutionary algorithm (ESE)
ESE is an efficient and flexible stochastic algorithm to optimize LHS [17]. It relies on a precise
control of a quantity similar to the SA temperature, through an exploration step followed by an
improving process. Unlike SA, the temperature can increase from one iteration to the next.
Furthermore, M new LHS are randomly built from the current one at each step (M = 1 for
SA). The authors expect that ESE can improve LHS using fewer elementary perturbations than
SA (see our test in the next paragraph). Furthermore, the default algorithm parameters (used
in Figure 6) suggested by the authors appeared efficient whatever d is.
3.3.3 Feedback on LHS optimization
Boussouf SA holds a geometric profile which is well adapted for d = 2 and 3. When d increases,
it becomes more and more difficult to approach the global optimum. Indeed, the algorithm is
rather sensitive to T0 and c, which are delicate to set. The linear profile is actually preferable
to perform an efficient exploration of the space. To compare the performances of MM SA and
ESE on an example (dimension d = 5), Figure 6 focuses on the evolution of the mindist
values. We can see that, with only 4 iterations, the mindist of the LHS optimized by ESE exceeds 0.5. Let
us recall that, for MM SA, an iteration corresponds to Imax elementary perturbations, while,
for ESE, an iteration corresponds to M elementary perturbations. Regarding the structure of
the algorithms, we can easily compute the corresponding number of elementary perturbations.
ESE produces 20000 perturbations in 4 iterations. This is much smaller than for SA, which needs
approximately 60000 perturbations to exceed a mindist of 0.5. As a consequence, ESE seems a powerful
routine to quickly produce LHS with excellent mindist values. Similar results could also be observed
with any L2-discrepancy measure as well.
We perform an additional test to show the interest of the regularization of the mindist
criterion (see section 3.2.2). We produce some optimized LHS through both the mindist criterion
and the φ_p criterion (see Figure 7). When the optimizations are performed with φ_p, a clear improvement
is noted. Hence, we use the φ_p criterion in the following to build maximin LHS.
As we need an algorithm with a fast convergence, we use Boussouf SA in the following section
to carry out LHS comparisons. Our results do not suffer from the non-optimality of the resulting
designs because our purpose is just to compare space filling criteria.
Figure 6: For maximin LHS designs (n = 50, d = 5, p = 50) obtained by the MM SA and ESE
algorithms: mindist criterion as a function of the algorithm iteration number. Boxplots are produced
from 30 optimizations at each iteration. Left: MM SA with T0 = 2, Imax = 100, c = 0.9. Right:
ESE with T0 = 0.005 × φp(LHS), M = 500.

Figure 7: For maximin LHS designs (n = 100, d = 10) obtained by Boussouf SA (T0 = 10, c = 0.95):
mindist criterion as a function of the algorithm iteration number. For each iteration, the mindist criterion
is taken as the mean over 30 optimizations. Optimization criteria are φp (red curve) and mindist
(green curve).
4 Robustness to projections over 2D subspaces

4.1 Motivations
An important characteristic of a SFD X_d^N over U_d is its robustness to projections over lower-dimensional subspaces, which means that the k-dimensional subsamples of the SFD, k < d,
obtained by deleting d − k columns of the matrix X_d^N, fill U_k efficiently (according to a space
filling criterion). A LHS structure for the SFD is not sufficient because it only guarantees good
repartitions for one-dimensional projections, and not for projections of greater dimensions.
Indeed, to capture precisely an interaction effect between some inputs, a good coverage of the subspace spanned by these inputs is
required (see section 2.2). Another reason why this robustness property really matters is that the metamodel fitting can be made in a smaller
dimension than d (see an example in [2]). In practice, this is often the case because the output
analysis of an initial design ("screening step") may reveal some useless (i.e. non-influential) input
variables that can be neglected during the metamodel fitting step [32]. Moreover, when a
selection of input variables is made during the metamodel fitting step (as for example in [25]),
the new sample, solely including the retained input variables, has to keep good space filling
properties.
In the remainder of the paper, we focus on the space filling quality of the 2D projections
of optimized LHS. Moreover, most of the time, interaction effects of order two (i.e. between
two inputs) are significantly larger than interactions of order three, and so on. We therefore only
consider the 2D subprojections in this first study.
Discrepancy and point-distance based criteria can be regarded as relevant measures to quantify the quality of space-filling designs. Unfortunately, it has been shown that they are incompatible in high dimension, in the sense that a quasi-periodic design (see section 3.2.3) does not reach
the lowest value of any discrepancy measure achievable by a LHS of the same size N. If we compute the MST
criteria for large-dimensional LHS optimized with respect to both discrepancy criteria, we observe a difference
between the mean (m) values and the standard deviation (σ) values (see section 3.2.3). Moreover, we
observe that the mean (resp. σ) of the C²-discrepancy optimized LHS is larger (resp. smaller) than
the mean (resp. σ) of the W²-discrepancy optimized LHS (see Figure 8). This is the reason why
we will mainly focus on the C²-discrepancy instead of the W²-discrepancy (in view of the partial
order relation defined in section 3.2.3).
Figure 8: m and σ MST criteria of C² and W² discrepancy optimized LHS (N = 100).

Below, we perform some tests to underline the subprojection uniformity of LHS optimized with
different space-filling criteria. The quality of the optimized LHS is analyzed using discrepancies, then
considering the MST criteria. All of the tests of this section are made with N = 100 design
points and a design dimension d ranging from 2 to 54. To our knowledge, this is the first
numerical study with such a range of design dimensions. To capture any variability due to the
optimization process, some boxplots are built from all 2D subsamples of five optimized LHS
per dimension.
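As an illustration of this test procedure, the 2D-subprojection discrepancies of one design can be obtained as follows (a sketch reusing the centered_L2_disc() function defined earlier):

  # Centered L2-discrepancies of all 2D subprojections of an N x d design X:
  # one value per pair of columns (j1, j2), later summarized by boxplots.
  subproj_2d_disc <- function(X) {
    pairs <- combn(ncol(X), 2)
    apply(pairs, 2, function(jj) centered_L2_disc(X[, jj]))
  }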
4.2 Analysis according to L2 discrepancy measures
First, let us look at the discrepancies of the 2D subsamples of C²-optimized and W²-optimized LHS
of dimension d. One can observe in Figure 9 that the optimized LHS built from these criteria
are robust, in the sense that the 2D projections get a reduced discrepancy value (by
looking at the median of the boxplots). This is fully consistent with Fang's requirement for
these modified discrepancy designs (see section 3.2.1). Moreover, the discrepancy increase is
regular with the dimension increase, and it seems to tend to an asymptotic value. When the same
experiment is performed with the L2-star discrepancy (which is an unmodified discrepancy),
results are different (see Figure 10). In this case, the discrepancy increase is rather sharp and
the optimization has no more influence on the 2D subprojections from dimension 10 onwards.
In Figure 11 (left), we perform the same experiment with a standard LHS. Obviously, as no
optimization process is carried out, all 2D subsample discrepancy values are the same. It allows
us to have a reference C² value (approximately 0.017) for which the optimization process has no
influence. It reveals, by way of Figure 9 (left), that the optimization of C² is efficient for
large dimensions d (larger than 50), because convergence to the 0.017 value has not been reached. In
Figure 11 (right), we perform the same experiment with a classical low-discrepancy sequence,
the Sobol' one, with Owen scrambling in order to break the alignments created by this
deterministic sequence. It confirms a well-known fact: the Sobol' sequence has several poor 2D
subprojections in terms of discrepancy criteria, especially in high dimension.

Figure 9: Discrepancy values of 2D subsamples of the discrepancy optimized LHS: C² discrepancy
(left) and W² discrepancy (right).

Figure 10: L2-star (left) and C² (right) discrepancy values of 2D subsamples of the L2-star discrepancy
optimized LHS.
Finally, the same tests are performed on maximin designs. Previous works ([23, 15]) have
shown that the mindist designs are not robust in terms of the mindist criterion on their 2D subprojections. Figure 12 now shows that the mindist designs are not robust in terms of two
different discrepancy criteria on their 2D subprojections either. As for the L2-star optimized design, the discrepancy increase is rather sharp and the optimization has no more influence on the 2D subprojections
from dimension 10 onwards. It strongly confirms previous conclusions: the maximin design
is not recommended if one of the objectives is to have a good space filling coverage on the
subprojections of the design.

Figure 11: C² discrepancy values of 2D subsamples of: the standard LHS (left) and the scrambled
Sobol' sequence (right).
Figure 12: Discrepancy values of 2D subsamples of maximin LHS: L2 discrepancy (left) and C²
discrepancy (right).
4.3 Analysis according to the MST criteria
Due to the lack of robustness of the mindist criterion mentioned in section 3.2.3, only the MST
criteria are regarded from the point-distance perspective. We compute the MST criteria over all
2D projections. One can note that the MST built over the 2D subsamples of maximin LHS have
small m and high σ, because of alignments inherent in the entire design (see Figure 13). Regarding
the LHS optimized with the C² discrepancy criterion, one can note, unlike previously, a gradual decline
of m and σ (see Figure 14). As a consequence, we conclude that such designs are robust in the
sense that they are less subject to the presence of clustered points over the subprojection spaces.

Figure 13: m and σ MST criteria of 2D subsamples of the maximin LHS.
Figure 14: m and σ MST criteria of 2D subsamples of the C²-discrepancy optimized LHS.
5 Conclusions and perspectives
This paper has considered several issues in building designs of computer experiments, in the
class of SFD. Some industrial needs have first been given as challenges: high dimensional SFD
are often required (several tens of variables), while preserving good space filling properties on
the design subprojections. Recalls have been made of two common measures of space filling
(interpoint-distance criteria such as the mindist, and L2 discrepancy criteria) and of recently introduced
criteria based on the MST of the design points. For comparison studies, we have shown that the
MST criteria are preferable to the well-known mindist criterion. Focusing on the class of LHS,
some clarification details have been given on several common LHS optimization algorithms.
In numerical tests, we have shown that the stochastic algorithm ESE converges more rapidly
than the MM SA algorithm. We have also numerically confirmed that maximin LHS have to
be obtained using the regularized criterion φ_p (with p = 50 for instance) instead of the mindist
criterion.
Intensive numerical tests have been performed in order to compare optimized LHS in terms of
space filling criteria (L2 discrepancy and minimal spanning tree criteria). With designs of
size N = 100, the dimensions range from 2 to 54, which is, to our knowledge, the first numerical
study with such a range of design dimensions. Another contribution of this paper is the in-depth
analysis of the space filling properties of the design 2D-subprojections. Among the tested designs
(LHS, maximin LHS, several L2 discrepancy optimized LHS, Sobol' sequence), only the centered
(C²) and wrap-around (W²) discrepancy optimized LHS have shown some strong robustness
properties in high dimension. This result numerically confirms the theoretical considerations
of [8]. The other tested designs are no longer robust in subprojection when their dimension is
larger than 10. Tests on other types of designs, not shown here, lead to the same conclusions.
Moreover, we have shown that the C²-discrepancy optimized LHS give more regular designs than
the W²-discrepancy optimized LHS.
As a perspective, such an analysis can be extended to subprojections of larger dimensions.
For example, in a preliminary study, [23] has confirmed the same conclusions on the design
subprojection properties by considering 3D subsamples of the designs. Another future work
would be to carry out a more exhaustive and deeper benchmark of LHS optimization algorithms.
For example, an idea would be to look at the convergence of the maximin LHS to the
exact solutions. These solutions are known in several cases (small N and small d) for the non-randomized maximin LHS (see [38]).
Finally, all our numerical tests have been computed within the R environment, partially using
the DiceDesign package. We hope to soon include in this free package the calculations
of the MST criteria and the three LHS optimization algorithms used in this paper.
6 Acknowledgments
Part of this work has been backed by the French National Research Agency (ANR) through the COSINUS program (project COSTA BRAVA no. ANR-09-COSI-015). We thank Luc Pronzato for
helpful discussions and Catalina Ciric for providing the prey-predator model example.
References
[1] B. Auder, A. de Crécy, B. Iooss, and M. Marquès. Screening and metamodeling of computer experiments with functional outputs. Application to thermal-hydraulic computations.
Reliability Engineering and System Safety, in press.
[2] C. Cannamela, J. Garnier, and B. Iooss. Controlled stratification for quantile estimation.
Annals of Applied Statistics, 2:1554–1580, 2008.
[3] C. Ciric, P. Ciffroy, and S. Charles. Use of sensitivity analysis to discriminate non-influential
and influential parameters within an aquatic ecosystem model. Ecological Modelling, in
press.
[4] A. de Crécy, P. Bazin, H. Glaeser, T. Skorek, J. Joucla, P. Probst, K. Fujioka, B.D. Chung,
D.Y. Oh, M. Kyncl, R. Pernica, J. Macek, R. Meca, R. Macian, F. D’Auria, A. Petruzzi,
L. Batet, M. Perez, and F. Reventos. Uncertainty and sensitivity analysis of the LOFT L2-5
test: Results of the BEMUSE programme. Nuclear Engineering and Design, 12:3561–3578,
2008.
[5] E. de Rocquigny, N. Devictor, and S. Tarantola, editors. Uncertainty in industrial practice.
Wiley, 2008.
[6] C. Dussert, G. Rasigni, M. Rasigni, and J. Palmari. Minimal spanning tree: A new approach
for studying order and disorder. Physical Review B, 34(5):3528–3531, 1986.
[7] K-T. Fang. Wrap-around L2 -discrepancy of random sampling, Latin hypercube and uniform
designs. Journal of Complexity, 17:608–624, 2001.
[8] K-T. Fang, R. Li, and A. Sudjianto. Design and modeling for computer experiments.
Chapman & Hall/CRC, 2006.
[9] J. Franco. Planification d’expériences numériques en phase exploratoire pour la simulation
des phénomènes complexes. Thèse de l'Ecole Nationale Supérieure des Mines de Saint-Etienne, 2008.
[10] J. Franco, X. Bay, B. Corre, and D. Dupuy. Strauss processes: A new space-filling design
for computer experiments. In Proceedings of Joint Meeting of the Statistical Society of
Canada and the Société Française de Statistique, Ottawa, Canada, may 2008.
[11] J. Franco, O. Vasseur, B. Corre, and M. Sergent. Minimum spanning tree: A new approach
to assess the quality of the design of computer experiments. Chemometrics and Intelligent
Laboratory Systems, 97:164–169, 2009.
[12] J.E. Gentle. Random number generation and Monte Carlo methods. Springer, 2003.
[13] F.J. Hickernell. A generalized discrepancy and quadrature error bound. Mathematics of
Computation, 67:299–322, 1998.
[14] B. Iooss. Space filling designs: some algorithms and numerical results on industrial problems. Workshop Accelerating productivity via deterministic computer experiments and
stochastic simulation experiments, Isaac Newton Institute, Cambridge, UK, September
2011. http://www.newton.ac.uk/programmes/DAE/seminars/090610001.html.
[15] B. Iooss, L. Boussouf, V. Feuillard, and A. Marrel. Numerical studies of the metamodel
fitting and validation processes. International Journal of Advances in Systems and Measurements, 3:11–21, 2010.
[16] B. Iooss, A-L. Popelin, G. Blatman, C. Ciric, F. Gamboa, S. Lacaze, and M. Lamboni. Some
new insights in derivative-based global sensitivity measures. In Proceedings of the ESREL
2012 Conference, Helsinki, Finland, june 2012.
[17] R. Jin, W. Chen, and A. Sudjianto. An efficient algorithm for constructing optimal design
of computer experiments. Journal of Statistical Planning and Inference, 134:268–287, 2005.
[18] M.E. Johnson, L.M. Moore, and D. Ylvisaker. Minimax and maximin distance design.
Journal of Statistical Planning and Inference, 26:131–148, 1990.
[19] J.P.C. Kleijnen and R.G. Sargent. A methodology for fitting and validating metamodels
in simulation. European Journal of Operational Research, 120:14–29, 2000.
[20] J.R. Koehler and A.B. Owen. Computer experiments. In S. Ghosh and C.R. Rao, editors,
Design and analysis of experiments, volume 13 of Handbook of statistics. Elsevier, 1996.
[21] C. Lemieux. Monte Carlo and quasi-Monte Carlo sampling. Springer, 2009.
[22] J.L. Loeppky, J. Sacks, and W.J. Welch. Choosing the sample size of a computer experiment: A practical guide. Technometrics, 51:366–376, 2009.
[23] A. Marrel. Mise en oeuvre et exploitation du métamodèle processus gaussien pour l’analyse
de modèles numériques - Application à un code de transport hydrogéologique. Thèse de
l’INSA Toulouse, 2008.
[24] A. Marrel, B. Iooss, B. Laurent, and O. Roustant. Calculations of the Sobol indices for
the Gaussian process metamodel. Reliability Engineering and System Safety, 94:742–751,
2009.
[25] A. Marrel, B. Iooss, F. Van Dorpe, and E. Volkova. An efficient methodology for modeling complex computer codes with Gaussian processes. Computational Statistics and Data
Analysis, 52:4731–4744, 2008.
[26] M.D. McKay, R.J. Beckman, and W.J. Conover. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics,
21:239–245, 1979.
[27] M.D. Morris and T.J. Mitchell. Exploratory designs for computational experiments. Journal of Statistical Planning and Inference, 43:381–402, 1995.
[28] M. Munoz-Zuniga, J. Garnier, E. Remy, and E. de Rocquigny. Adaptive directional stratification for controlled estimation of the probability of a rare event. Reliability Engineering
and System Safety, in press.
[29] H. Niederreiter. Random number generation and quasi-Monte Carlo methods. SIAM, 1992.
[30] J-S. Park. Optimal Latin-hypercube designs for computer experiments. Journal of Statistical Planning and Inference, 39:95–111, 1994.
[31] L. Pronzato and W. Müller. Design of computer experiments: space filling and beyond.
Statistics and Computing, 22:681–701, 2012.
[32] G. Pujol. Simplex-based screening designs for estimating metamodels. Reliability Engineering and System Safety, 94:1156–1160, 2009.
[33] J. Sacks, W.J. Welch, T.J. Mitchell, and H.P. Wynn. Design and analysis of computer
experiments. Statistical Science, 4:409–435, 1989.
[34] A. Saltelli, M. Ratto, S. Tarantola, and F. Campolongo. Sensitivity analysis practices:
Strategies for model-based inference. Reliability Engineering and System Safety, 91:1109–
1125, 2006.
[35] T. Santner, B. Williams, and W. Notz. The design and analysis of computer experiments.
Springer, 2003.
[36] M.C. Shewry and H.P. Wynn. Maximum entropy sampling. Journal of Applied Statistics,
14:165–170, 1987.
[37] T.W. Simpson, J.D. Peplinski, P.N. Koch, and J.K. Allen. Metamodels for computer-based
engineering design: Survey and recommendations. Engineering with Computers, 17:129–
150, 2001.
[38] E.R. van Dam, B. Husslage, D. den Hertog, and H. Melissen. Maximin Latin hypercube
designs in two dimensions. Operations Research, 55:158–169, 2007.
[39] F.A.C. Viana, G. Venter, and V. Balabanov. An algorithm for fast optimal Latin hypercube design of experiments. International Journal for Numerical Methods in Engineering,
82:135–156, 2010.