The Incorporation of Model Uncertainty in Geostatistical Simulation

Geographical & Environmental Modelling, Vol. 6, No. 2, 2002, 147-169
Carfax Publishing, Taylor & Francis Group
P. A. DOWD & E. PARDO-IGUZQUIZA
ABSTRACT
A growing area of application for geostatistical conditional simulation is
as a tool for risk analysis in mineral resource and environmental projects. In these
applications accurate field measurement of a variable at a specific location is difficult
and measurement of variables at all locations is impossible. Conditional simulation
provides a means of generating stochastic realizations of spatial (essentially geological
and/or geotechnical) variables at unsampled locations thereby quantifying the uncertainty associated with limited sampling and providing stochastic models for 'downstream' applications such as risk assessment. However, because the number of
experimental data in practical applications is limited, the estimated geostatistical
parameters used in the simulation are themselves uncertain. The inference of these
parameters by maximum likelihood provides a means of assessing this estimation
uncertainty which, in turn, can be included in the conditional simulation procedure. A
case study based on transmissivity data is presented to show the methodology whereby
both model selection and parameter inference are solved by maximum likelihood. The
authors give an overview of their previously published work on maximum likelihood
estimation of geostatistical parameters with particular reference to uncertainty analysis
and its incorporation into geostatistical simulation.
Introduction
Mineral resource and environmental projects are designed on the basis of variables
that are subject to extreme uncertainty. This uncertainty arises both because of the
nature of the variables and the cost of obtaining information about them. Geological
and geotechnical variables can only be assessed and quantified on the basis of sparse
drilling and sampling programmes. Such programmes provide data on a relatively
large scale, which is invariably an order of magnitude greater than the scale required
for modelling, prediction and risk assessment.
In a gold mining project even at the (relatively advanced) mine planning stage, the
grades of 4 m x 4 m x 5 m selective mining units may be estimated from the grades
of samples taken from drillholes on a 30 m x 30 m grid; geotechnical design is based
on the geotechnical properties of sparse samples often not even collected for the
purpose at hand. Risk assessment of hazardous waste disposal sites requires spatial
models of relevant geological and geotechnical variables such as porosity, permeability,
fracture networks and transmissivity.
P. A. Dowd, Department of Mining and Mineral Engineering, University of Leeds, Leeds LS2 9JT, UK. Fax: +44-(0)113-246-7310; E-mail: [email protected]
E. Pardo-Iguzquiza, Department of Mining and Mineral Engineering, University of Leeds, Leeds LS2 9JT, UK.
1361-5939 print/1469-8323 online/02/020147-23 © 2002 Taylor & Francis Ltd
DOI: 10.1080/1361593022000029476
Ultimately, quantified risk analysis requires an estimate of the likelihood, or
probability, of an event occurring. It may be argued that in the case of true
uncertainty it is not possible to determine probabilities. However, this is a simplistic
view of probability and, in the context of most of the assessments required in mineral
resource and environmental applications, it is an incorrect view. What is required is
the generation of possible states of nature based on process models and then an
assessment of the likelihood of particular events occurring given these states of
nature.
The possible states of nature in these applications are values of geological and
geomechanical variables which are interpreted as spatial random variables (or, in the
geostatistical terminology, regionalized variables). Geostatistical simulation provides
a means of generating stochastic realizations of spatial variables and these can form
the basis of quantitative risk assessment. In essence, these realizations are treated as
possible realities and risk assessment is conducted by subjecting them to response
functions and observing the frequency with which specified criteria are exceeded or
fail to be met. An example is provided by the assessment of the risk of contamination
of the water table by leakage from a proposed underground hazardous waste disposal
site. Geostatistically simulated models of rock properties, including fracture networks,
porosity and permeability, can be subjected to fluid flow models to determine the
proportion of the simulated models in which contaminant pathways can be found
from the disposal site to the water table. Examples of risk assessment for mineral
resource extraction projects are given in Dowd (1994a, 1997). The assumptions in
this approach to risk assessment are:
• the spatial models of variability used to generate the simulations adequately
  quantify the sources of variability on all relevant scales;
• the number of geostatistical simulations is sufficient to represent the range of
  possibilities and that the frequency of occurrence of these possibilities reflects their
  actual likelihood of occurrence.
Whilst conditional simulation provides a means of generating stochastic realizations
of spatial variables it is based on a model of spatial variability that can only
be inferred from sparse data and the model itself is, therefore, subject to uncertainty.
A major criterion for assessing the performance of a simulation (or a simulation
method) is the extent to which the simulated values reproduce the specified model
of spatial variability. However, the significant uncertainty associated with the model
raises serious questions about the results and use of simulated values in risk
assessment.
In general, simulation is required when data are sparse and variability is erratic.
In such cases the spatial model of variability is uncertain and the uncertainty
increases with the variability and the lack of data. The model partially drives the
simulation (the extent depends on the simulation algorithm) and reproduction of
this uncertain model is no guarantee that the simulation is an adequate representation
of reality.
In this paper the authors propose the use of maximum likelihood methods to
quantify the uncertainty associated with models of spatial variability and demonstrate
how this uncertainty can then be incorporated into geostatistical simulation. A case
study is used to illustrate the effect of model uncertainty on geostatistically simulated
realizations of transmissivity.
The Geostatistical Framework
Values of spatial variables are measured at specific locations x. These values, z(x), at
locations x are interpreted as particular realizations of random variables, Z(x), at
the locations. The set of auto-correlated random variables $\{Z(x), x \in D\}$ defines a
random function. Spatial variability is then quantified by the correlations among the
random variables.
The Universal Model
The experimental data are assumed to have been generated by the so-called universal
model (or generalized linear model):
Z(x) = m(x) + R(x)
(1)
where x denotes location, Z(x) is a random function, m(x) is the mean or drift, and
R(x) is the residual. The drift is the mathematical expectation of the random
function: E[Z(x)] = m(x) and it is modelled as a linear combination of known basis
functions (monomials) multiplied by unknown coefficients. In matrix notation:
$$\mu = X\beta \tag{2}$$
where $\mu$ is an n x 1 vector of means, X is an n x p matrix of monomials, and $\beta$ is a
p x 1 vector of unknown coefficients. The residual is a zero-mean term:
$$E[R(x)] = 0 \tag{3}$$
characterized statistically by its second-order stationary covariance function:
$$C(h) = E\{[Z(x) - m(x)][Z(x+h) - m(x+h)]\} = E[R(x)\,R(x+h)]. \tag{4}$$
On the assumption of second-order stationarity the variogram is defined as:
$\gamma(h) = C(0) - C(h)$. There are several commonly used variogram models each of
which is defined by three parameters: a small-scale, or nugget, variance, $C_0$, due to
variability that occurs on a scale less than the sample volume or sample spacing
(including measurement error); a larger scale variance, $C$, due to variability on a
scale larger than the sample volume; and a range of influence, $a$, that defines the
distance within which variable values are auto-correlated; the total variance (known
as the sill value) is $C_0 + C$. In practice, the larger scale variance ($C$) may be
subdivided into any number of sub-scales ($C_i$, $i = 1, \ldots, n$) each with its own range
of influence ($a_i$, $i = 1, \ldots, n$).
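As an illustration of the three-parameter form, the following minimal sketch evaluates an exponential model, the type fitted in the case study later in the paper. The parameterization $\gamma(h) = C_0 + C(1 - e^{-h/a})$ and the function names are assumptions made for this example; the text does not state whether the range is defined as $a$ or as a practical range.

```python
import math

def exponential_variogram(h, nugget, partial_sill, a):
    """gamma(h) = C0 + C * (1 - exp(-h / a)) for h > 0, and 0 at h = 0.

    Assumed parameterization: `a` is the scale parameter of the exponential,
    not the practical range.
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / a))

def exponential_covariance(h, nugget, partial_sill, a):
    """C(h) = C(0) - gamma(h), where C(0) = C0 + C is the sill."""
    sill = nugget + partial_sill
    return sill - exponential_variogram(h, nugget, partial_sill, a)
```

At large lags the variogram approaches the sill $C_0 + C$ and the covariance approaches zero, which is the behaviour the definition $\gamma(h) = C(0) - C(h)$ requires.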
In matrix notation the universal model is
$$z = X\beta + \varepsilon \tag{5}$$
where z is an n x 1 vector of experimental data and $\varepsilon$ is an n x 1 vector of residuals.
The mathematical expectation is then: $E[z] = X\beta$ and the covariance of the
residual is
$$\mathrm{Cov}(\varepsilon) = E(\varepsilon\varepsilon') = V \tag{6}$$
where V is the n x n variance-covariance matrix and prime denotes transpose of the
vector.
The universal model is completely specified by the order of the drift, the coefficients,
$\beta$, and the parameters, $\theta$, of the covariance, or variogram, model. In applications
none of the parameters are known in advance and they must be estimated from the
experimental data. Once the covariance model and the order of the drift have been
specified the most critical step in geostatistical applications is the inference of the
parameters $\{\beta, \theta\}$.
Generalized Increments
Generalized increments (Matheron, 1973), or error contrasts, are linear combinations
of the data expressed as
$$y = Pz \tag{7}$$
where y is an n x 1 vector of generalized increments and P is an n x n transformation
matrix.
The matrix P is chosen such that
$$PX = 0 \tag{8}$$
and in terms of generalized increments the universal model is
$$y = PX\beta + P\varepsilon = P\varepsilon \tag{9}$$
and the drift has been filtered out (although the order of the drift must still be
specified).
One possibility for P (Kitanidis, 1983) is to use the projection matrix:
$$P = I - X(X'X)^{-1}X' \tag{10}$$
The relation in (7) can then be written as
$$y = Az = A\varepsilon \tag{11}$$
where the new transformation matrix A is derived by eliminating p rows from matrix
P, where p is the number of drift coefficients and is equal to the rank of the matrix X.
This operation reflects the fact that p of the increments are linearly dependent on the
rest of the increments (Kitanidis, 1983).
The first two moments of the generalized increments are
$$E[y] = 0, \qquad E[yy'] = A\,E[zz']\,A'. \tag{12}$$
Within the framework of the universal model the second moment becomes
$$E[yy'] = AVA' \tag{13}$$
and the parameters, $\theta$, of the variance-covariance matrix, V, can be estimated without
the need to infer the drift coefficients $\beta$.
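The filtering property in (8)-(11) can be checked numerically. The sketch below is an illustration only: the helper names, the synthetic coordinates and coefficients, and the choice to drop the last p rows of P are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def drift_matrix(coords, order):
    """Monomial basis for a 2-D polynomial drift of order 0, 1 or 2."""
    x, y = coords[:, 0], coords[:, 1]
    cols = [np.ones_like(x)]
    if order >= 1:
        cols += [x, y]
    if order >= 2:
        cols += [x * x, x * y, y * y]
    return np.column_stack(cols)

def increment_matrix(X):
    """Projection P = I - X (X'X)^-1 X' with p rows removed, as in (10)-(11)."""
    n, p = X.shape
    P = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    # Assumption: dropping the last p rows leaves n - p independent rows.
    # This holds for sample layouts in general position but is not checked.
    return P[: n - p, :]

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 30.0, size=(12, 2))   # synthetic locations
X = drift_matrix(coords, order=1)
A = increment_matrix(X)

beta = np.array([2.0, 0.5, -0.3])   # hypothetical drift coefficients
eps = rng.normal(size=12)           # hypothetical residuals
z = X @ beta + eps
y = A @ z                           # generalized increments: drift removed
```

Because AX = 0, the increments `y` computed from the data equal `A @ eps`, so they carry no information about the drift coefficients.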
Geostatistical Simulation
Geostatistical simulation (Dowd, 1992; Journel & Alabert, 1989, 1990; Journel &
Huijbregts, 1978; Journel & Isaaks, 1984) is a generalization of the concepts
of Monte Carlo simulation to include three-dimensional spatial correlation. A
geostatistical simulation is one in which:
• at sampled locations the simulated values of each variable are the same as the
  measured (observed) values of those variables;
• all simulated values of a given variable have the same spatial relationships observed
  in the data values (spatial correlation);
• all simulated values of any pair of variables have the same spatial inter-relationships
  observed in the data values (spatial cross-correlation);
• the histograms of the simulated values of all variables are the same as those
  observed for the data.
The methods can be extended to most descriptive or qualitative variables simply
by defining the variables in terms of presence/absence at sampled and simulation
locations (Dowd, 1994b). When natural, physical structures are a significant source
of variability and/or exert a significant controlling influence on other variables (e.g.
geological controls on mineralization, lithostratigraphic controls on porosity and
permeability, rock types and rock properties may be significant factors in the physical
distribution of grade) they, or at least their effects, must be included in the simulation.
In some cases the modelling of categorical or descriptive variables may be an
intermediate stage that provides a means of accurately modelling a quantitative
variable (e.g. gold grades associated with quartz veins); in other cases they may be
the object of simulation (e.g. flow zones for the prediction of groundwater flows).
Geostatistical simulation is now widely used and accepted as a method of
generating stochastic models of mineral deposits, hydrocarbon reservoirs and geological structures which can then be subjected to various operational procedures
(David et al., 1974; Dowd, 1994a; Dowd & David, 1976) for design, analysis and
risk assessment (Dowd, 1997). There are many applications described in the literature
using one or more of the range of methods (Dowd, 1992) now available.
Maximum Likelihood
Maximum likelihood (ML) estimation is used extensively for the estimation of
unknown parameters of hypothesized probability density function (pdf) models
using experimental data sets that are assumed to be outcomes of independent and
identically distributed (with the hypothesized pdf) random variables. Under these
assumptions the joint pdf of n experimental data may be expressed as
$$p(z;\theta) = p(z_1;\theta)\,p(z_2;\theta)\cdots p(z_n;\theta) = \prod_{i=1}^{n} p(z_i;\theta) \tag{14}$$
where $\theta$ is an m x 1 vector of parameters that define the pdf and $p(z;\theta)$ is the joint
pdf defined by $\theta$ for the data z. The ML function is simply the joint pdf in (14)
viewed as a function of the unknown parameters $\theta$ and containing the data z.
The ML estimate of $\theta$ is the value that satisfies all equality and inequality
constraints for which the likelihood function attains its maximum value. As the
logarithm is a monotonic function, the value of $\theta$ that maximizes $p(z;\theta)$ also
maximizes $\ln p(z;\theta)$. The log-likelihood function is frequently used in order to
change multiplicative properties into additive ones. It is common practice to take
the negative of the log-likelihood function so as to change the maximization problem
to one of minimization. Unless otherwise specified, the ML function will be taken
to be the negative log-likelihood function (NLLF):
$$L(z;\theta) = -\sum_{i=1}^{n} \ln\{p(z_i;\theta)\} \tag{15}$$
and the ML estimates are the values of $\theta$ that minimize equation (15).
The heuristic argument for the ML estimator is that, amongst all sets of possible
values for the parameters, it yields the set that has the greatest possibility of giving
rise to the observed sample (with the hypothesized pdf).
The attraction of ML estimation lies in its large-sample or asymptotic properties.
Under certain regularity conditions (Norden, 1972) the ML estimator is consistent,
asymptotically normally distributed and asymptotically efficient.
Maximum Likelihood in Geostatistics
In geostatistical applications the experimental data are spatially correlated and thus
the form of the joint pdf of the experimental data in (14) is inadequate. For reasons
given below the most convenient choice of an alternative model is the multivariate
Gaussian distribution (mGd):
$$p(z;\theta) = (2\pi)^{-n/2}\,|V|^{-1/2} \exp\{-\tfrac{1}{2}(z-\mu)'V^{-1}(z-\mu)\} \tag{16}$$
where $|\cdot|$ denotes determinant and prime denotes transpose.
Although the method has been widely reported in geostatistical applications
(Dietrich & Osborne, 1991; Hoeksema & Kitanidis, 1985; Kitanidis, 1983, 1987;
Kitanidis & Lane, 1985; Mardia & Marshall, 1984; Mardia & Watkins, 1989;
Zimmerman, 1989; among others) it also has its detractors (Ripley, 1988, 1992;
Warnes & Ripley, 1987).
There is a common misconception that ML is not applicable because of the
assumption that the data come from a mGd, an assumption which, in practice, is
impossible to verify. A reasoned justification for the choice of the mGd is given in
Pardo-Iguzquiza (1998) but perhaps one of the best reasons, albeit empirical, is that
ML with the mGd gives good results in practice.
In addition to the distributional assumption there are two further objections to
the ML estimation method:
• there are many instances where the ML estimator is biased;
• the ML method is computationally more intensive than other methods.
The second objection is becoming increasingly irrelevant with the rapidly increasing
power and speed of computers. Moreover, a relatively new method, approximate
ML estimation (Pardo-Iguzquiza & Dowd, 1997; Vecchia, 1988), described here,
significantly reduces the computational overhead of ML estimation. The reference
to bias cannot be considered a serious objection for several reasons:
• The bias tends to zero as the number of samples increases (in practice the bias is
  small if the number of samples is large enough).
• On the basis of the mean square error (which is a trade-off between bias and
  variance) the ML estimator may be better than many others.
• The bias can be corrected.
• It is possible to obtain unbiased estimators by a suitable transformation of the
  original data (e.g. using generalized increments).
The negative log-likelihood function (hereafter referred to as the ML function)
corresponding to the mGd of equation (16) may be written as
$$L(z;\theta) = \frac{n}{2}\ln(2\pi) + \frac{1}{2}\ln|V| + \frac{1}{2}(z-\mu)'V^{-1}(z-\mu). \tag{17}$$
The values of $\theta$ that minimize (17) are the ML estimates. The covariance matrix can
be factored as
$$V = \sigma^2 Q \tag{18}$$
where $\sigma^2$ is the variance and Q is the correlation matrix. Noting that
$$|V| = \sigma^{2n}|Q| \tag{19}$$
and
$$V^{-1} = \sigma^{-2}Q^{-1} \tag{20}$$
the ML function (17) can be written as
$$L(\beta,\sigma^2,\theta;z) = \frac{n}{2}\ln(2\pi) + n\ln\sigma + \frac{1}{2}\ln|Q| + \frac{1}{2\sigma^2}(z-X\beta)'Q^{-1}(z-X\beta) \tag{21}$$
where $\theta$ now represents the covariance parameters but no longer the variance.
The ML estimate of $\beta$ is obtained by minimizing (21) with respect to $\beta$. This
estimate, denoted by $\hat\beta$, is identical to the generalized least squares estimate of $\beta$
(Searle, 1971)
$$\hat\beta = (X'Q^{-1}X)^{-1}X'Q^{-1}z. \tag{22}$$
The value $\hat\sigma^2$ that minimizes (21) is
$$\hat\sigma^2 = \frac{1}{n}(z-X\hat\beta)'Q^{-1}(z-X\hat\beta). \tag{23}$$
The ML estimates of the covariance parameters $\theta$ are the values that minimize the
expression:
$$L'(\hat\beta,\hat\sigma^2,\theta;z) = \frac{n}{2}\ln(2\pi) + \frac{n}{2} - \frac{n}{2}\ln(n) + \frac{1}{2}\ln|Q| + \frac{n}{2}\ln[(z-X\hat\beta)'Q^{-1}(z-X\hat\beta)]. \tag{24}$$
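Equations (22)-(24) translate directly into a few lines of linear algebra. The following is a minimal sketch under stated assumptions: an exponential correlation function with a single range parameter `a`, synthetic data, and illustrative function names. It is not the authors' MLREML program.

```python
import numpy as np

def exp_correlation(coords, a):
    """Exponential correlation matrix Q (assumed form exp(-d / a))."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / a)

def profile_nllf(a, z, X, coords):
    """Return (L', beta_hat, sigma2_hat) following equations (22)-(24)."""
    n = len(z)
    Q = exp_correlation(coords, a)
    Qinv_X = np.linalg.solve(Q, X)
    Qinv_z = np.linalg.solve(Q, z)
    beta_hat = np.linalg.solve(X.T @ Qinv_X, X.T @ Qinv_z)        # (22)
    r = z - X @ beta_hat
    rss = r @ np.linalg.solve(Q, r)
    sigma2_hat = rss / n                                          # (23)
    _, logdet = np.linalg.slogdet(Q)
    value = (0.5 * n * np.log(2 * np.pi) + 0.5 * n - 0.5 * n * np.log(n)
             + 0.5 * logdet + 0.5 * n * np.log(rss))              # (24)
    return value, beta_hat, sigma2_hat

def full_nllf(beta, sigma2, a, z, X, coords):
    """Equation (21) evaluated at arbitrary beta and sigma^2."""
    n = len(z)
    Q = exp_correlation(coords, a)
    r = z - X @ beta
    _, logdet = np.linalg.slogdet(Q)
    return (0.5 * n * np.log(2 * np.pi) + 0.5 * n * np.log(sigma2)
            + 0.5 * logdet + 0.5 * (r @ np.linalg.solve(Q, r)) / sigma2)

rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 30.0, size=(20, 2))   # synthetic locations
X = np.ones((20, 1))                            # drift of order 0
z = rng.normal(size=20)                         # synthetic data
value, beta_hat, sigma2_hat = profile_nllf(5.0, z, X, coords)
```

Evaluating `full_nllf` at `beta_hat` and `sigma2_hat` reproduces `value`, which confirms that (24) is (21) with the analytical estimates substituted; only the covariance parameters then remain for numerical minimization.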
Restricted Maximum Likelihood Estimation (REML)
It has been argued that the simultaneous estimation of drift and covariance parameters results in biased covariance estimates (Kitanidis, 1987; Matheron, 1971). The
REML method has been proposed (Kitanidis, 1987) as a means of reducing bias. In
REML instead of working with the original data, one works with generalized
increments (Matheron, 1973).
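The ML function (21) with (11) takes the form
$$L(\hat\sigma^2,\theta;y) = \frac{m}{2}\ln(2\pi) + \frac{m}{2} - \frac{m}{2}\ln(m) + \frac{1}{2}\ln|AQA'| + \frac{m}{2}\ln[y'(AQA')^{-1}y] \tag{25}$$
where m = n - p is the number of generalized increments retained in y.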
The REML estimates of $\theta$ are the values that minimize (25). REML is very
similar to the ML estimation of generalized covariances in which case Q is replaced
by the generalized covariance K. The estimator of the variance is
$$\hat\sigma^2 = \frac{y'(AQA')^{-1}y}{n-p}. \tag{26}$$
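The difference between the ML variance estimator (23) and the REML estimator (26) is easiest to see in the simplest setting. The sketch below assumes uncorrelated data (Q = I) and a constant drift (p = 1), in which case the two estimators reduce to the familiar sample variances with denominators n and n - 1. The function names and the choice of which rows of P to drop are assumptions for illustration.

```python
import numpy as np

def reml_variance(z, X, Q):
    """Variance estimator (26) computed from generalized increments."""
    n, p = X.shape
    P = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    A = P[: n - p, :]          # assumes the retained rows are independent
    y = A @ z
    AQA = A @ Q @ A.T
    return y @ np.linalg.solve(AQA, y) / (n - p)

def ml_variance(z, X, Q):
    """Variance estimator (23) using the GLS drift estimate (22)."""
    n = len(z)
    Qinv_X = np.linalg.solve(Q, X)
    beta_hat = np.linalg.solve(X.T @ Qinv_X, X.T @ np.linalg.solve(Q, z))
    r = z - X @ beta_hat
    return r @ np.linalg.solve(Q, r) / n

rng = np.random.default_rng(3)
z = rng.normal(loc=-6.0, scale=1.5, size=15)   # synthetic data
X = np.ones((15, 1))                           # constant drift, p = 1
Q = np.eye(15)                                 # no spatial correlation
s2_ml = ml_variance(z, X, Q)
s2_reml = reml_variance(z, X, Q)
```

In this special case `s2_reml` equals `s2_ml` multiplied by n / (n - p), so the REML estimate is always the larger of the two.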
Patterson and Thompson (1971) report the use of REML to estimate covariance
components although it is not clear why REML was preferred to ML (see Harville
(1977) and the comment by Rao (1977)). In geostatistical applications, REML
has been considered by several authors including Zimmerman (1989) and Dietrich
and Osborne (1991), but attention has been focused on efficient algorithms with
little or no emphasis on why REML should be preferable to ML.
As REML and ML are two different estimators, they should be compared
statistically by comparing the sampling distribution of the estimates. One such
study is given in Pardo-Iguzquiza (1998). Of particular interest are the mean and
the variance of the sampling distribution:
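$$\bar\theta = E[\hat\theta] \tag{27}$$
$$\mathrm{Var}(\hat\theta) = E[(\hat\theta - \bar\theta)(\hat\theta - \bar\theta)']. \tag{28}$$
The bias b is defined as the difference between the mean of the sampling
distribution and the true value of the parameter:
$$b = \bar\theta - \theta. \tag{29}$$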
The variance of the estimator is the variance of its sampling distribution and
quantifies the dispersion of the estimates around the mean. The standard error is
defined as the square root of the variance of the estimator.
The bias is related to the accuracy of an estimator and the variance to the
precision. In the evaluation of the performance of an estimator there is a trade-off
between bias and variance. A badly biased estimator is as bad as an unbiased
estimator with a large estimation variance.
A true measure of the accuracy and precision of an estimator is provided by the
mean square error (mse). The mse is defined as the dispersion of the estimate
relative to the true value of the parameter rather than to the mean value of the
sampling distribution:
$$\mathrm{mse} = \mathrm{Var}(\hat\theta) + bb'. \tag{30}$$
In general the estimator with the lowest mse is preferred.
It can be shown (Kendall & Stuart, 1979) that the estimator of the variance (23)
is biased by an amount equal to
$$b = -\frac{p}{n}\sigma^2 \tag{31}$$
where b is bias and n, p and $\sigma^2$ have already been defined.
The negative bias leads to underestimation of the variance (on average). An
unbiased estimator may be obtained by multiplying the biased estimates by the
factor c defined by
$$c = \frac{n}{n-p}. \tag{32}$$
Then
$$\tilde\sigma^2 = c \cdot \hat\sigma^2 \tag{33}$$
is the unbiased estimator with estimation variance
$$\mathrm{var}(\tilde\sigma^2) = c^2 \cdot \mathrm{var}(\hat\sigma^2). \tag{34}$$
An unbiased estimator is obtained at the cost of increasing the sampling variance.
The bias given by (31) decreases as the number of data increases, and increases
as the order k of the drift increases (in two dimensions, for k = 0, 1 and 2 the
values of p are 1, 3 and 6, respectively). Thus, the amount of bias expected for
different values of n and for different orders of the drift can be assessed. For example,
with k = 2 and n = 15, c in (32) is 1.667 and the expected bias is 40% of the value
of the parameter, i.e. on average, the variance is underestimated by 40% of its true
value. In this case estimator (23) is seriously biased and can be corrected by (33), which
implies that the sampling variance increases. The trade-off between bias and variance is
given by the mse which is equal to the squared bias plus the variance. If the value
c is close to 1 the expected bias is small and the estimator may be considered
unbiased. In fact, the ML estimates are efficient and asymptotically unbiased.
An example of bias calculation and correction of covariance parameters estimated
by (24), for a simulated set of values, is given in Pardo-Iguzquiza and Dowd
(1998c).
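The arithmetic behind the worked example (k = 2, n = 15) follows from (31) and (32). The short sketch below reproduces it; the function names are illustrative, and the count of drift terms uses the standard number of monomials of degree at most k in two dimensions, which matches the values 1, 3 and 6 quoted in the text.

```python
def drift_terms_2d(k):
    """Number of monomials of degree <= k in two dimensions: 1, 3, 6, ..."""
    return (k + 1) * (k + 2) // 2

def correction_factor(n, p):
    """Factor c = n / (n - p) of equation (32)."""
    if n <= p:
        raise ValueError("need more data than drift coefficients")
    return n / (n - p)

def relative_bias(n, p):
    """Bias of estimator (23) as a fraction of sigma^2, from equation (31)."""
    return -p / n

p = drift_terms_2d(2)            # 6 coefficients for a quadratic drift
c = correction_factor(15, p)     # 15 / 9
bias = relative_bias(15, p)      # -6 / 15
```

With these inputs c is 15/9 and the relative bias is -6/15, i.e. the 40% underestimation stated above.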
Minimization
The ML estimates of $\beta$ and $\sigma^2$ can be expressed analytically but the ML estimates
of the covariance parameters $\theta$ require the numerical minimization of (24). This
requires an iterative procedure for minimization in an m-dimensional parameter
space.
The minimization procedure is the core of the ML estimation routine and the
success of the estimation is closely related to the performance of the minimization
procedure. In addition, because each evaluation of (24) requires the inversion of an
n x n matrix, rapid convergence of the minimization procedure is important for
computational efficiency. A number of methods can be used to minimize (24), but in
our experience five have been found to be particularly suitable: direct search, scoring
method, axial search, simplex method and simulated annealing. A description of
each, together with a performance comparison and a description of a public
domain program (MLREML) are given in Pardo-Iguzquiza (1996). An example of
minimization in a five-dimensional space using the simplex method can be found in
Pardo-Iguzquiza and Dowd (1998b). The conclusion from these studies is that the
direct search method is preferred when the number of parameters is less than three,
and the simplex method in all other circumstances. The minimum can be verified by
axial search. If multiple extrema are expected, the simulated annealing method
should be used for more than two dimensions in the parameter space.
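To make the idea of a derivative-free search concrete, here is a generic coordinate-wise (axial) search with step halving. It is a sketch of the general technique only: the function name, the step-halving rule and the quadratic test function are assumptions for illustration and do not reproduce the algorithms implemented in MLREML.

```python
def axial_search(f, start, step=1.0, tol=1e-6, max_iter=10_000):
    """Minimize f by probing +/- step along each coordinate axis in turn.

    The step is halved whenever no axis move improves the objective, and the
    search stops once the step falls below tol.
    """
    x = list(start)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= 0.5
            if step < tol:
                break
    return x, fx

# Hypothetical smooth objective standing in for the ML function (24).
objective = lambda v: (v[0] - 1.0) ** 2 + 2.0 * (v[1] + 3.0) ** 2
best, best_value = axial_search(objective, [0.0, 0.0])
```

Such a search needs only function evaluations, which suits (24) because each evaluation already involves a matrix inversion and derivatives are costly to obtain.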
Approximate Maximum Likelihood
The computational problems are caused by the evaluation of equation (21) and its
derivatives. In particular, the matrix Q of n x n data must be calculated and inverted
as many times as are required for the minimization procedure to reach convergence.
The approximate maximum likelihood approach starts from the well-known
multiplicative theorem which states that for any n events $A_1, A_2, \ldots, A_n$, the following
relation holds:
$$\Pr(A_1 \cap A_2 \cap \cdots \cap A_n) = \Pr(A_1)\cdot\Pr(A_2 \mid A_1)\cdots\Pr(A_n \mid A_1, A_2, \ldots, A_{n-1}) \tag{35}$$
where Pr(A|B) is the conditional probability of A given B.
Then, for the multivariate pdf:
$$p(y) = p(y_1)\cdot\prod_{i=2}^{n} p(y_i \mid y_1, y_2, \ldots, y_{i-1}). \tag{36}$$
Using the argument that some information provided by the data is redundant
(Vecchia, 1988), the following approximation can be used for the conditional
probability:
$$p(y_i \mid y_{i-1}, y_{i-2}, \ldots, y_1) \approx p(y_i \mid y_{i-1}, y_{i-2}, \ldots, y_{i-m}) \tag{37}$$
with $i - m > 1$.
Thus, instead of minimizing the complete likelihood (21), the function minimized
is the NLLF derived from (36) and taking into account the approximation (37), i.e.
the approximate negative log-likelihood function (ANLLF):
$$L(y) = -\ln p(y_1) - \sum_{i=2}^{n} \ln p(y_i \mid y_j) \quad \text{for } j = 1, \ldots, m. \tag{38}$$
On the assumption that the experimental data {y} are multivariate Gaussian, the
conditional probability, $p(y_i \mid y_j)$, $j = 1, \ldots, m$, is also Gaussian for any i and any m,
with mean vector (Graybill, 1976):
$$\mu_{i|j} = \mu_i + V_{ij}V_{jj}^{-1}(y_j - \mu_j) \tag{39}$$
and covariance matrix:
$$V_{i|j} = V_{ii} - V_{ij}V_{jj}^{-1}V_{ji} \tag{40}$$
where
$y_j$ is an m x 1 vector of experimental data
$\mu_j$ is an m x 1 vector of means
$\mu_i$ is a 1 x 1 matrix of the means at each of the experimental locations
$V_{ii}$ is a 1 x 1 matrix of the variances
$V_{ij}$ is a 1 x m vector of covariances between the ith point and the m points of vector $y_j$
$V_{ji}$ is an m x 1 vector equal to the transpose of $V_{ij}$
$V_{jj}$ is an m x m matrix of covariances between the points of vector $y_j$ and themselves.
The following relations are obtained by taking into account the factorization (18):
$$\mu_{i|j} = \mu_i + Q_{ij}Q_{jj}^{-1}(y_j - \mu_j) \tag{41}$$
$$Q_{i|j} = Q_{ii} - Q_{ij}Q_{jj}^{-1}Q_{ji} = 1 - Q_{ij}Q_{jj}^{-1}Q_{ji}. \tag{42}$$
The Gaussian conditional probability function is then
$$p(y_i \mid y_1, y_2, \ldots, y_m) = p(y_i \mid y_j) = (2\pi)^{-1/2}\sigma^{-1}|Q_{i|j}|^{-1/2}\exp\left[-\frac{1}{2\sigma^2}(y_i - \mu_{i|j})'Q_{i|j}^{-1}(y_i - \mu_{i|j})\right] \tag{43}$$
and the Gaussian pdf for the first data location is
$$p(y_1) = (2\pi)^{-1/2}\sigma^{-1}\exp\left[-\frac{1}{2\sigma^2}(y_1 - \mu_1)^2\right]. \tag{44}$$
Introducing the notation
$$\varepsilon_i = y_i - \mu_i \tag{45}$$
$$\varepsilon_j = y_j - \mu_j \tag{46}$$
$$h_i = y_i - \mu_{i|j} = y_i - \mu_i - Q_{ij}Q_{jj}^{-1}(y_j - \mu_j) = \varepsilon_i - Q_{ij}Q_{jj}^{-1}\varepsilon_j \tag{47}$$
$$\mu_i = x_i\beta \tag{48}$$
$$\mu_j = X_j\beta \tag{49}$$
where
$\varepsilon_i$ is a 1 x 1 vector of the residual at the ith location
$\varepsilon_j$ is an m x 1 vector of residuals at the m locations
$h_i$ is a 1 x 1 vector of the conditional residual at the ith location
$x_i$ is a 1 x p vector of basis functions at the ith location, where p is the number of
basis functions, which depends on the order of the drift
$X_j$ is an m x p matrix of basis functions at the m experimental locations.
The ANLLF can be written as
$$L(\beta,\sigma^2,\theta \mid y) = \frac{n}{2}\ln(2\pi) + n\ln(\sigma) + \frac{\varepsilon_1^2}{2\sigma^2} + \frac{1}{2}\sum_{i=2}^{n}\ln|Q_{i|j}| + \frac{1}{2\sigma^2}\sum_{i=2}^{n} h_i^2 Q_{i|j}^{-1} \tag{50}$$
where $\varepsilon_1 = y_1 - x_1\beta$ is the residual at the first location.
Taking the partial derivatives of the ANLLF with respect to the different parameters
and setting the resulting equations to zero gives the approximate likelihood
equations which can be solved to give the following estimates:
$$\hat\beta = \left[x_1'x_1 + \sum_{i=2}^{n} Q_{i|j}^{-1}(x_i - Q_{ij}Q_{jj}^{-1}X_j)'(x_i - Q_{ij}Q_{jj}^{-1}X_j)\right]^{-1}\left[x_1'y_1 + \sum_{i=2}^{n} Q_{i|j}^{-1}(x_i - Q_{ij}Q_{jj}^{-1}X_j)'(y_i - Q_{ij}Q_{jj}^{-1}y_j)\right] \tag{51}$$
$$\hat\sigma^2 = \frac{1}{n}\left[\varepsilon_1^2 + \sum_{i=2}^{n} h_i^2 Q_{i|j}^{-1}\right]. \tag{52}$$
The estimation of the covariance parameters $\theta$ is done numerically by minimizing
equation (50) after substituting the estimates given by equations (51) and (52) for $\beta$
and $\sigma^2$, respectively.
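The construction in (36)-(40) can be illustrated with a short numerical sketch. It makes several simplifying assumptions that are not in the paper: the mean is taken as zero, the covariance matrix V is known, and each datum is conditioned on the m data immediately preceding it in index order (other orderings or neighbour choices are possible). The function names are illustrative.

```python
import numpy as np

def exact_nllf(y, V):
    """Complete Gaussian NLLF for zero-mean data with covariance V."""
    n = len(y)
    _, logdet = np.linalg.slogdet(V)
    return 0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(V, y))

def approx_nllf(y, V, m):
    """Sum of conditional Gaussian terms, each using at most m previous data."""
    total = 0.0
    for i in range(len(y)):
        j = np.arange(max(0, i - m), i)       # indices of conditioning data
        mean, var = 0.0, V[i, i]
        if j.size:
            w = np.linalg.solve(V[np.ix_(j, j)], V[j, i])
            mean = w @ y[j]                   # (39) with zero means
            var = V[i, i] - w @ V[j, i]       # (40)
        total += 0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 30.0, size=(15, 2))         # synthetic locations
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
V = 1.3 * np.exp(-d / 2.0)                            # hypothetical covariance
y = rng.normal(size=15)                               # synthetic data

full = exact_nllf(y, V)
same = approx_nllf(y, V, m=14)    # conditioning on all previous data
cheap = approx_nllf(y, V, m=3)    # only 3 x 3 systems are solved
```

When m = n - 1 nothing is discarded and the sum of conditional terms equals the complete NLLF; smaller m trades accuracy for systems of size at most m x m.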
The uncertainty of the estimated parameters is assessed by the inverse of the Fisher
information matrix which gives the variance-covariance matrix of the estimates. The
square root of the diagonal elements of this matrix is the standard error of each
estimate.
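For the drift coefficients:
$$\mathrm{Var}(\hat\beta) = \sigma^2\left[x_1'x_1 + \sum_{i=2}^{n} Q_{i|j}^{-1}(x_i - Q_{ij}Q_{jj}^{-1}X_j)'(x_i - Q_{ij}Q_{jj}^{-1}X_j)\right]^{-1} \tag{53}$$
and for the variance:
$$\mathrm{Var}(\hat\sigma^2) = \frac{2\sigma^6 n^{-1}}{\dfrac{2}{n}\left[\varepsilon_1^2 + \sum_{i=2}^{n} h_i^2 Q_{i|j}^{-1}\right] - \sigma^2}. \tag{54}$$
In practice, $\beta$ and $\sigma^2$ are unknown and are replaced by $\hat\beta$ and $\hat\sigma^2$, then
$$\mathrm{Var}(\hat\sigma^2) = \frac{2\hat\sigma^4}{n}. \tag{55}$$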
It may come as a surprise that the sampling variance given by (55) is the same as
that for a Gaussian variable when the samples are independent (and thus uncorrelated), but it should be noted that the estimation of the variance (52) takes into
account the correlation among the data.
$\mathrm{Var}(\hat\theta)$ is evaluated numerically by fitting a quadratic surface to the ANLLF at
the estimated minimum. The minimum of (50) is found by a minimization procedure
which requires the evaluation of the equation at each step of the process. The main
advantages of using the approximate likelihood instead of the complete likelihood are:
• Computational time saving. Each step of the minimization procedure requires the
  inversion of an n x n matrix (where n is the number of experimental data) and
  matrix inversion is an $n^3$ process. For example, n = 1000 data requires $10^9$ operations.
  The ANLLF method with an approximation of m = 10 requires n inversions
  of m x m matrices or $10^6$ operations, achieving a reduction factor of 1000.
• Memory space saving. The working matrices are of size m x m in the approximate
  likelihood instead of n x n if the complete likelihood is used.
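The operation counts quoted above follow from treating matrix inversion as a cubic-cost operation. A minimal sketch of that arithmetic, with an illustrative function name and constant factors ignored:

```python
def inversion_cost(n, m=None):
    """Order-of-magnitude operation count per likelihood evaluation.

    Complete likelihood: one n x n inversion, about n**3 operations.
    Approximate likelihood: n inversions of m x m matrices, about n * m**3.
    """
    if m is None:
        return n ** 3
    return n * m ** 3

complete = inversion_cost(1000)           # 10**9
approximate = inversion_cost(1000, 10)    # 10**6
reduction = complete // approximate       # 1000
```

The reduction factor is (n / m)^2 per data value summed over n values, so it grows quickly as the data set gets larger for a fixed m.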
A description of the method and a computer program is given in Pardo-Iguzquiza
and Dowd (1997).
An Example: Transmissivity Data
This example has been chosen because the data are in the public domain; the
complete model inference case study can be found in Pardo-Iguzquiza and Dowd
(1998b) and only the results are presented here. The original application (Gotway,
1994) was for nuclear waste site performance assessment, where uncertainty in the
groundwater travel time of a particle is assessed through its probability density
function (pdf). This pdf is estimated by running groundwater flow and transport
programs with different transmissivity field inputs. These inputs are generated by
conditional simulation which generates possible images of the spatial variability of
transmissivity that honour the experimental data and reproduce the model of spatial
variability inferred from the experimental data.
In this case study we have used the spectral decomposition method of simulation.
This method assumes multivariate normality and the conditioning data were transformed to normally distributed values by interpolation with a standard normal
distribution followed by an inverse transform of the simulated values. The techniques
described in this paper could, however, be used with most methods of geostatistical
simulation.
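A normal-score transform of the kind described can be sketched as follows. The details here are assumptions made for illustration rather than the authors' procedure: the plotting position (rank + 0.5) / n, linear interpolation between data values for the back-transform, clamping of the tails to the data extremes, and no special handling of ties. The sample values are hypothetical.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map each datum to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    std_normal = NormalDist()
    for rank, i in enumerate(order):
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores

def back_transform(score, values):
    """Map a simulated normal score back to data units by interpolation."""
    data = sorted(values)
    n = len(data)
    q = [NormalDist().inv_cdf((k + 0.5) / n) for k in range(n)]
    if score <= q[0]:
        return data[0]            # tails are clamped to the data extremes
    if score >= q[-1]:
        return data[-1]
    for k in range(1, n):
        if score <= q[k]:
            t = (score - q[k - 1]) / (q[k] - q[k - 1])
            return data[k - 1] + t * (data[k] - data[k - 1])

log_t = [-6.2, -3.1, -9.4, -4.8, -5.5]   # hypothetical log-transmissivities
scores = normal_scores(log_t)
restored = [back_transform(s, log_t) for s in scores]
```

Simulation is carried out on the scores, and each simulated value is then passed through `back_transform` so that the realizations are expressed in the original units.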
The experimental data consist of 41 values of transmissivity measurements in the
Culebra Dolomite formation in New Mexico, USA (Gotway, 1994, Table 1). The
data are the decimal logarithm of transmissivity in units of m² s⁻¹. The spatial
locations of the data are shown in Figure 1 where the x and y coordinates are in
km. The cluster of points in the 5 km × 5 km central area contains almost half
the data.
Figure 1. Spatial locations of experimental data.
Table 1. Estimated covariance model parameters (range and sill) using ML and REML

                          Sill [log(m² s⁻¹)]²         Range (km)
Drift order   Estimator   Estimate   Standard error   Estimate   Standard error
0             ML          3.98       0.880            8.14       2.050
0             REML        5.98       1.337            12.82      3.131
1             ML          1.28       0.284            1.99       0.667
1             REML        1.61       0.368            2.69       0.865
2             ML          1.18       0.260            1.76       0.610
2             REML        2.22       0.530            3.99       1.179
The histogram of the data values is shown in Figure 2 and the omni-directional
variogram is shown in Figure 3.
ML and REML were used to estimate the parameters of an exponential covariance
model for drift of orders 0, 1 and 2 and results are summarized in Table 1. The
results in Table 1 show that the estimates of the covariance parameters obtained by
REML are larger than the ML estimates. Table 1 also shows the standard errors
(square root of the estimation variances) of the estimates. This parameter quantifies
the uncertainty of the estimates and can be used to construct confidence intervals
provided that a model is assumed for the sampling distribution of the estimates.
Figure 2. Histogram of experimental data.
Figure 3. Omni-directional variogram.
The uncertainty associated with REML estimates is higher than that of ML estimates,
especially for drift orders 0 and 2. The variograms of the residuals for k = 0, 1 and
2 are shown in Figure 4. The variogram of the residuals for k = 0 is the omnidirectional variogram shown in Figure 3; the variograms of the residuals for k = 1
and 2 are much more easily reconciled with those of second-order stationary random
functions.
The Akaike information criterion (Akaike, 1974) was used to select the most
appropriate drift order. The order chosen was k = 1. The variogram of the residuals
for k = 1 is shown in Figure 5 together with the model fitted by ML using the
parameters given in Table 2.
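The selection of the drift order by the Akaike information criterion can be sketched as follows, using the usual definition AIC = -2 ln L + 2p. The log-likelihood values below are placeholders invented for illustration and are NOT results from the paper; counting the two covariance parameters (sill and range) together with the drift coefficients is also an assumption of this sketch.

```python
def n_drift_terms(k):
    """Number of polynomial drift coefficients of order k in two dimensions."""
    return (k + 1) * (k + 2) // 2


def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller values are preferred."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Placeholder maximised log-likelihoods per drift order (hypothetical values)
log_lik = {0: -70.0, 1: -60.0, 2: -58.5}
n_cov = 2  # sill and range (assumed to count towards p)

scores = {k: aic(ll, n_drift_terms(k) + n_cov) for k, ll in log_lik.items()}
best_k = min(scores, key=scores.get)
```

With real data, `log_lik` would hold the maximised log-likelihood obtained for each candidate drift order.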
Figure 4. Variograms of residuals for drift orders 0, 1 and 2.
Figure 5. Residual variogram for drift of order k = 1 and model fitted.
Table 2. ML estimates and standard errors of first-order drift coefficients

Parameter    Estimate   Standard error
$\beta_1$    -1.6062    0.8653
$\beta_2$    -0.2245    0.0426
$\beta_3$    -0.0141    0.0323
The ML estimates and associated standard errors for the k = 1 coefficients are
shown in Table 2. Although the drift is a deterministic component in the universal
model, in practice the coefficients are estimated from the experimental data and they
are thus random variables with the means and standard deviations given in Table 2.
This means that the drift is also uncertain and the information in Table 2 can be
used to construct confidence intervals to quantify the uncertainty associated with
the estimated drift.
The model adopted is a universal model with drift order k = 1 with drift coefficients
given in Table 2 and a zero-mean residual with isotropic exponential covariance with
sill 1.28 [log(m² s⁻¹)]² and range 1.99 km (practical range approximately 6 km).
There is an uncertainty associated with this model, part of which is difficult to
evaluate and involves the model itself; the other part is merely a statistical uncertainty
due to the inference of the parameters from a limited number of data and has been
assessed by the standard errors of the estimates. This latter uncertainty can be
quantified; for instance interval estimates may be constructed assuming a model for
the estimation error, for example Gaussian. In this way the 95% confidence intervals
for sill and range are [0.71, 1.85] and [0.66, 3.32], respectively, and are obtained as
the estimates ± twice their standard deviation.
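The numbers quoted above can be reproduced with a few lines of Python. This is a minimal sketch: the function and variable names are illustrative, and the factor of 3 linking the range parameter of an exponential model to its practical range is the usual convention, assumed here because it agrees with the "approximately 6 km" quoted in the text.

```python
import math

# ML estimates and standard errors for drift order k = 1 (Table 1)
SILL, SILL_SE = 1.28, 0.284    # [log(m^2 s^-1)]^2
RANGE, RANGE_SE = 1.99, 0.667  # km


def exp_covariance(h, sill=SILL, a=RANGE):
    """Isotropic exponential covariance at lag distance h (km)."""
    return sill * math.exp(-h / a)


def interval(estimate, std_error, width=2.0):
    """Approximate 95% interval: estimate +/- width * standard error."""
    return (estimate - width * std_error, estimate + width * std_error)


practical_range = 3.0 * RANGE  # covariance has dropped to about 5% of the sill
sill_ci = interval(SILL, SILL_SE)
range_ci = interval(RANGE, RANGE_SE)

print(round(practical_range, 2))        # 5.97
print([round(v, 2) for v in sill_ci])   # [0.71, 1.85]
print([round(v, 2) for v in range_ci])  # [0.66, 3.32]
```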
In this case, as there is no nugget variance, range and sill are estimated independently
by ML. The correlation between range and sill is thus zero and any combination of values
of the parameters inside their respective intervals is inside the 95% confidence region
as shown in Figure 6. This is useful when using conditional simulation in uncertainty
analysis. Instead of using only the estimated parameters (the centroid of the rectangle
in Figure 6), the extreme cases (though still inside the 95% confidence region)
represented by the combinations of parameters (sill, range) given by the corners of the
rectangle in Figure 6 may be used. For example, the upper right corner represents greater
continuity (range 3.32 km) and greater variability (sill 1.85 [log(m² s⁻¹)]²), which may
produce spatial variability patterns of transmissivity different from those obtained
using the estimated values. The same may be said for the rest of the values in the 95%
confidence region.
Figure 6. 95% confidence region for sill and range.
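Because the region is a rectangle, the parameter pairs to be simulated can be listed directly as the centroid plus the four corners. The sketch below does this with the interval limits given in the text; it does not attempt to say which corner carries which of the labels B to E in Figure 6, since that assignment cannot be read from the text, and the variable names are illustrative.

```python
from itertools import product

sill_ci = (0.71, 1.85)   # [log(m^2 s^-1)]^2, 95% interval from the text
range_ci = (0.66, 3.32)  # km, 95% interval from the text
centroid = (1.28, 1.99)  # ML estimates (sill, range) for k = 1

# Every combination of interval end points gives one corner (sill, range)
corners = list(product(sill_ci, range_ci))
scenarios = [centroid] + corners

for sill, rng in scenarios:
    print(f"sill = {sill:.2f}, range = {rng:.2f} km")
```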
To see that the variance estimates are independent of the range estimates note:
• the factoring, in equation (18), of the covariance as the product of the variance
  and the correlation;
• equation (23) for the variance estimator is derived by setting the partial derivative
  of the negative log-likelihood function to zero;
• the range estimator is obtained in a similar way to the variance estimator although
  the solution is numerical rather than analytical.
The drift estimates are also independent of the variance and the range but the
three drift coefficients are not estimated independently of each other. As the estimated
drift coefficients are correlated, not every combination of the three parameters is
equally reliable, i.e. values that are inside the 95% confidence interval of each
parameter when taken together may not be inside the 95% confidence region for the
parameters. The confidence region is not a parallelepiped but an ellipsoid defined by
the vector $\beta = (\beta_1, \beta_2, \beta_3)$ that satisfies the relationship (Draper & Smith, 1981):

$$(\hat{\beta} - \beta)' X' V^{-1} X (\hat{\beta} - \beta) = \frac{p}{n - p} \left[ Y'(Y - X\hat{\beta}) \right] F_{p,\,n-p,\,1-\alpha} \qquad (56)$$

where $F_{p,\,n-p,\,1-\alpha}$ is the $1 - \alpha$ point of the F distribution with $p$ and $n - p$ degrees of
freedom and $\alpha$ is the significance level.
Figure 7. 95% confidence region for drift parameters $\beta_2$ and $\beta_3$ with $\beta_1 = -1.6062$.
Figure 7 shows the confidence region for the drift coefficients ($\beta_2$, $\beta_3$) when the
third coefficient $\beta_1$ in the model:

$$\mathrm{drift}(x, y) = \beta_1 + \beta_2 x + \beta_3 y$$

is fixed to the estimate $\hat{\beta}_1$ given in Table 2.
For illustrative purposes we have used the conditional simulation of the universal
model with drift: -1.43324 - 0.1393x + 0.00763y and variogram of the residual:
$\gamma(h) = 1.85 \exp(-h/3.32)$.
The parameters used are not the estimates but they are inside the 95% confidence
regions. A contour plot of one simulation chosen at random is shown in Figure 8.
Areas with log-transmissivity greater than -0.2 log(m² s⁻¹) (which represent high
transmissivity values) are shown in the darkest colour and highlight paths of high
transmissivity suggested by the simulation.
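To make the difference between the estimated and the illustrative drift concrete, the following sketch evaluates both first-order drifts at a single location. The coefficients are those given in Table 2 and in the text above; the location (10 km, 10 km) and the names are hypothetical and chosen only for illustration.

```python
def drift(x, y, b1, b2, b3):
    """First-order drift b1 + b2*x + b3*y, with x and y in km."""
    return b1 + b2 * x + b3 * y


ESTIMATED = (-1.6062, -0.2245, -0.0141)      # ML estimates, Table 2
ILLUSTRATIVE = (-1.43324, -0.1393, 0.00763)  # values used for Figure 8

x, y = 10.0, 10.0  # hypothetical location
d_est = drift(x, y, *ESTIMATED)
d_ill = drift(x, y, *ILLUSTRATIVE)
print(round(d_est, 4), round(d_ill, 5))
```

The gap between the two values shows how much the mean log-transmissivity at one point can shift while the coefficients remain within their stated uncertainty.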
Figure 8. Simulated values from one simulation chosen at random.
To assess the effects of model uncertainty on the simulation outputs, six simulations
were generated for each pair of values denoted by A, B, C, D and E in Figure 6. These
points denote the mid-point and the extremities of the 95% confidence region for the
sill and range. Each set of six simulations was started with the same random number
seed. The simulation outputs are shown in Figures 9-13.
Figure 9. Output from six simulations using the estimated variance and range parameters
denoted by A in Figure 6 (centroid of rectangle).
The differences in the six simulation outputs for each of the points A, B, C, D
and E are due entirely to the changes in the model and these changes reflect the ranges
of uncertainty associated with the model parameters. Running groundwater flow and
transport programs using each of the simulations in Figures 9-13 as inputs would
provide an assessment of the effects of model uncertainty in risk assessment and
allow these effects to be incorporated in the risk assessment.
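The role of the common random number seed can be illustrated with a small sketch. Note that this is not the spectral decomposition method used in the case study: for brevity it draws an unconditional realization on a short one-dimensional transect by Cholesky factorization of an exponential covariance matrix, and it omits the drift and the conditioning. The transect coordinates, the seed and the function names are assumptions.

```python
import math
import random


def exp_cov_matrix(coords, sill, a):
    """Exponential covariance matrix for points on a line (distances in km)."""
    return [[sill * math.exp(-abs(xi - xj) / a) for xj in coords] for xi in coords]


def cholesky(m):
    """Lower-triangular factor L such that L L' = m (m positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def simulate(coords, sill, a, seed):
    """One unconditional zero-mean Gaussian realization for a given seed."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in coords]
    low = cholesky(exp_cov_matrix(coords, sill, a))
    return [sum(low[i][k] * w[k] for k in range(i + 1)) for i in range(len(coords))]


coords = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical transect, km
centre = simulate(coords, 1.28, 1.99, seed=42)  # estimated sill and range
corner = simulate(coords, 1.85, 3.32, seed=42)  # upper right corner of the region
```

Because both calls consume the same sequence of random numbers, any difference between `centre` and `corner` comes from the covariance parameters alone, which is the point made in the text.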
Figure 10. Output from six simulations using the extreme case variance and range
parameters denoted by B in Figure 6.
Figure 11. Output from six simulations using the extreme case variance and range
parameters denoted by C in Figure 6.
Figure 12. Output from six simulations using the extreme case variance and range
parameters denoted by D in Figure 6.
Figure 13. Output from six simulations using the extreme case variance and range
parameters denoted by E in Figure 6.
Conclusions
A major deficiency in the use of geostatistical simulation is the common failure to
take into account the uncertainty of the geostatistical models inferred from
experimental data. The failure to do so can render invalid uncertainty models used for risk
analysis. The uncertainty of the covariance or variogram parameters, when estimated
by the classical non-parametric method, is difficult to evaluate. However, parametric
inference methods, such as ML, estimate the variogram/covariance parameters
directly and their uncertainty can be readily quantified. Geostatistical conditional
simulation generates images of reality that model the uncertainty at non-sampled
locations. By including the estimation variance of the estimated variogram parameters
it is possible to generate images that model both the uncertainty due to limited
sampling and the uncertainty of the variogram itself (which is also due to limited
sampling). An overview of ML methods has been given and the methodology has
been illustrated by application to a set of transmissivity data. In applications in
which there are large numbers of data the approximate maximum likelihood method
can be used instead of the complete maximum likelihood; a case study illustrating
this methodology is given in Pardo-Iguzquiza and Dowd (1998a).
Acknowledgement
This work was supported by EPSRC (Engineering and Physical Sciences Research
Council) grant number GR/M72944.
References
Akaike, H. (1974) A new look at the statistical model identification. IEEE Transactions on Automatic
Control, AC-19(6), 716-723.
David, M., Dowd, P.A. & Korobov, S. (1974) Forecasting departure from planning in open pit design and
grade control. In: 12th Symposium on the Application of Computers and Operations Research in the
Mineral Industries (APCOM), Vol. 2. Golden, CO: Colorado School of Mines, pp. F131-F142.
Dietrich, C.R. & Osborne, M.R. (1991) Estimation of covariance parameters in Kriging via restricted
maximum likelihood. Mathematical Geology, 23(7), 655-672.
Dowd, P.A. (1992) A review of recent developments in geostatistics. Computers and Geosciences, 17(10),
1481-1500.
Dowd, P.A. (1994a) Risk assessment in reserve estimation and open-pit planning. Transactions of the
Institution of Mining and Metallurgy (Section A: Mining Industry), 103, A148-A154.
Dowd, P.A. (1994b) Geological controls in the geostatistical simulation of hydrocarbon reservoirs. The
Arabian Journal for Science and Engineering, 19(2B), 237-247.
Dowd, P.A. (1997) Risk in minerals projects: analysis, perception and management. Transactions of the
Institution of Mining and Metallurgy (Section A: Mining Industry), 106, A9-A18.
Dowd, P.A. & David, M. (1976) Planning from estimates: sensitivity of mine production schedules to
estimation methods. In: M. Guarascio, M. David & C. Huijbregts, Eds, Advanced Geostatistics in the
Mining Industry, NATO ASI Series C: Mathematical and Physical Sciences, Vol. 24. Dordrecht: Reidel,
pp. 163-183.
Draper, N.R. & Smith, H. (1981) Applied Regression Analysis, 2nd edn. New York: John Wiley.
Gotway, C.A. (1994) The use of conditional simulation in nuclear waste-site performance assessment.
Technometrics, 36(2), 129-141.
Graybill, F.A. (1976) Theory and Application of the Linear Model. North Scituate, MA: Duxbury
Press, 704 pp.
Harville, D.A. (1977) Maximum likelihood approaches to variance component estimation and to related
problems. Journal of the American Statistical Association, 72(358), 320-338.
Hoeksema, R.J. & Kitanidis, P.K. (1985) Analysis of the spatial structure of properties of selected aquifers.
Water Resources Research, 21(4), 563-572.
Journel, A.G. & Alabert, F. (1989) Non-gaussian data expansion in the earth sciences. Terra Nova, 1,
123-134.
Journel, A.G. & Alabert, F. (1990) New method for reservoir mapping. Journal of Petroleum Technology,
February, 212-218.
Journel, A.G. & Huijbregts, C. (1978) Mining Geostatistics. New York: Academic Press.
Journel, A.G. & Isaaks, E.H. (1984) Conditional indicator simulation: application to a Saskatchewan
uranium deposit. Mathematical Geology, 16(7), 685-718.
Kendall, M. & Stuart, A. (1979) The Advanced Theory of Statistics. Vol. 2: Inference and Relationship,
4th edn. London: Charles Griffin.
Kitanidis, P.K. (1983) Statistical estimation of polynomial covariance functions and hydrologic
applications. Water Resources Research, 19(2), 909-921.
Kitanidis, P.K. (1987) Parametric estimation of regionalised variables. Water Resources Bulletin, 24(4),
671-680.
Kitanidis, P.K. & Lane, R.W. (1985) Maximum likelihood parameter estimation of hydrologic spatial
processes by the Gauss-Newton method. Journal of Hydrology, 79(1-2), 53-71.
Mardia, K.V. & Marshall, R.J. (1984) Maximum likelihood estimation of models for residual covariance
in spatial regression. Biometrika, 71(1), 135-146.
Mardia, K.V. & Watkins (1989) On multimodality of the likelihood in the spatial linear model. Biometrika,
76(2), 289-295.
Matheron, G. (1971) The theory of regionalized variables and its applications. Ecole Superieure des Mines
de Paris, Fontainebleau, France.
Matheron, G. (1973) The intrinsic random functions and their applications. Advances in Applied Probability,
5, 439-468.
Norden, R.H. (1972) A survey of maximum likelihood estimation. International Statistical Review, 40(3),
329-354.
Pardo-Iguzquiza, E. (1996) MLREML: a computer program for the inference of covariance parameters
by maximum likelihood and restricted maximum likelihood. Computers and Geosciences, 23(2), 153-162.
Pardo-Iguzquiza, E. (1998) Maximum likelihood estimation of spatial covariance parameters.
Mathematical Geology, 30(1), 95-108.
Pardo-Iguzquiza, E. & Dowd, P.A. (1997) AMLE3D: a computer program for the statistical inference of
covariance parameters by approximate maximum likelihood estimation. Computers and Geosciences,
23(7), 793-805.
Pardo-Iguzquiza, E. & Dowd, P.A. (1998a) Maximum likelihood inference of spatial covariance parameters
of soil properties. Soil Science, 163(3), 212-219.
Pardo-Iguzquiza, E. & Dowd, P.A. (1998b) A case study of model selection and parameter inference by
maximum likelihood with application to uncertainty analysis. Nonrenewable Resources, 7(1), 63-73.
Pardo-Iguzquiza, E. & Dowd, P.A. (1998c) The second order stationary Universal Kriging model
revisited. Mathematical Geology, 30(4), 347-378.
Patterson, H.D. & Thompson, R. (1971) Recovery of interblock information when the block sizes are
unequal. Biometrika, 58, 545-554.
Rao, J.N.K. (1977) Comment to D.A. Harville (1977), Maximum likelihood approaches to variance
component estimation and related problems. Journal of the American Statistical Association, 72(358),
338-339.
Ripley, B.D. (1988) Statistical Inference for Spatial Processes. Cambridge: Cambridge University Press.
Ripley, B.D. (1992) Stochastic models of the distribution of rock types in petroleum reservoirs. In: A.T.
Walden & P. Guttorp, Eds, Statistics in the Environmental and Earth Sciences, Ch. 12. New York: John
Wiley, pp. 247-282.
Searle, S.R. (1971) Linear Models. New York: John Wiley.
Vecchia, A.V. (1988) Estimation and model identification for continuous spatial processes. Journal of the
Royal Statistical Society (B), 50, 297-312.
Warnes, J.J. & Ripley, B.D. (1987) Problems with the likelihood estimation of covariance functions of
spatial Gaussian processes. Biometrika, 74(3), 640-642.
Zimmerman, D.L. (1989) Computationally efficient restricted maximum likelihood estimation of
generalised covariance functions. Mathematical Geology, 21(7), 655-672.