
A Theoretical Analysis of Model Simplification
John Doherty and Catherine Moore
March, 2017
Contents
1. Introduction ........................................................................................................................................ 1
2. Using Models to Make Decisions ........................................................................................................ 2
2.1 Risk Analysis .................................................................................................................................. 2
2.2 Model Predictive Uncertainty ....................................................................................................... 3
2.3 Complex Models and Simple Models............................................................................................ 4
3. History-Matching ................................................................................................................................ 8
3.1 Introduction .................................................................................................................................. 8
3.2 Linearized Bayes Equation ............................................................................................................ 9
3.3 Model Calibration ....................................................................................................................... 10
3.3.1 Some Concepts..................................................................................................................... 10
3.3.2 Regularisation ...................................................................................................................... 11
3.4 Calibration through Singular Value Decomposition.................................................................... 12
3.4.1 Preconditions ....................................................................................................................... 12
3.4.2 Singular Value Decomposition ............................................................................................. 13
3.4.3 Quantification of Parameter and Predictive Error ............................................................... 15
3.4.4 Model-to-Measurement Misfit ............................................................................................ 18
4. Accommodating Model Defects........................................................................................................ 19
4.1 Introduction ................................................................................................................................ 19
4.2 Mathematical Formulation ......................................................................................................... 19
4.2.1 The Problem Defining Equation ........................................................................................... 19
4.2.2 Parameter Estimation .......................................................................................................... 20
4.2.3 Model-to-Measurement Fit ................................................................................................. 20
4.2.4 Predictive Error .................................................................................................................... 21
4.3 A Pictorial Representation .......................................................................................................... 24
4.4 Regularisation and Simplification ............................................................................................... 28
4.5 Non-Linear Analysis..................................................................................................................... 29
5. Repercussions for Construction, Calibration and Deployment of Simple Models............................ 32
5.1 Introduction ................................................................................................................................ 32
5.2 Benefits and Drawbacks of Complex Models ............................................................................. 32
5.3 Notation ...................................................................................................................................... 34
5.4 Prediction-Specific Modelling ..................................................................................................... 34
5.5 Playing to a Model’s Strengths ................................................................................................... 35
5.6 Dispensing with the Need for Model Calibration ....................................................................... 37
5.7 Tuning the Calibration Dataset to Match the Prediction ............................................................ 38
5.8 Model Optimality and Goodness of Fit ....................................................................................... 40
5.9 Detecting Non-Optimality of Simplification ................................................................................ 41
6. Avoiding Underestimation of Predictive Uncertainty when Using Simple Models .......................... 44
6.1 Introduction ................................................................................................................................ 44
6.2 Solution Space Dependent Predictions....................................................................................... 45
6.3 Null Space Dependent Predictions.............................................................................................. 46
6.4 Predictions which are Dependent on Both Spaces ..................................................................... 48
6.5 Direct Hypothesis-Testing ........................................................................................................... 49
7. Joint Usage of a Simple and Complex Model .................................................................................... 51
7.1 Introduction ................................................................................................................................ 51
7.2 Linear Analysis............................................................................................................................. 51
7.3 Predictive Scatterplots ................................................................................................................ 52
7.4 Surrogate and Proxy Models ....................................................................................................... 53
7.5 Suggested Improvements to Paired Simple/Complex Model Usage .......................................... 54
7.5.1 General ................................................................................................................................. 54
7.5.2 Option 1: Using the Simple Model for Derivatives Calculation ........................................... 54
7.5.3 Option 2: Modifications to Accommodate Complex Model Numerical Problems .............. 55
7.5.4 Option 3: Direct Adjustment of Random Parameter Fields ................................................. 55
8. Conclusions ....................................................................................................................................... 57
9. References ........................................................................................................................................ 60
1. Introduction
The discussion presented herein is focussed on modelling in the context of decision-support. First it
briefly examines the role that numerical modelling should play in this process, and then provides some
metrics for judging its success or otherwise in supporting it. Having defined these metrics, the
discussion then turns to the level of complexity that a model must possess in order to achieve these
metrics. In doing so, it attempts to provide an intellectual structure through which an appropriate
level of model complexity can be selected for use in a particular decision-making context, at the same
time as it attempts to define how models can be simplified in a way that does not erode the benefits
that modelling can bring to the decision-making process. It also examines whether a modeller actually
needs to make a choice between “simple” and “complex”, or whether both types of models can be
used in partnership in order to gain access to the benefits of both while avoiding the detriments of
either.
To the authors’ knowledge, an analysis of the type presented herein has not been previously
presented. Because of this, there is a tendency for models that are built to support environmental
decision-making to be more complex than they need to be. Rarely is this bias toward complexity the
outcome of deliberations that conclude that a high level of complexity is warranted in a particular
decision-support context. More often than not, a modeller’s bias towards complexity arises from a
fear that his/her work will be criticised by reviewers for omitting the complexity required for
his/her model to be considered as a faithful simulator of environmental processes at a particular study
site.
In fact, no model is a faithful simulator of environmental processes, regardless of its complexity. All
models are gross simplifications of the myriad of environmental processes that take place within their
domain at every scale. In spite of this, they can still form a solid basis for environmental decision-support. It is argued herein that this support is not an outcome of their ability to simulate what will
happen if management of a system changes, for no model can do this. Instead, it is an outcome of the
fact that models are unique in their ability to provide receptacles for two types of information, namely
that arising from expert knowledge and direct measurements of system properties on the one hand,
and that which is resident in the historical behaviour of an environmental system on the other hand.
These receptacles are far from perfect, and may indeed be improved with more faithful reproduction
of system dynamics. However it will be argued herein that while some extra receptacles can be opened
with increased model complexity, others are simultaneously closed.
Conceptualization of models as receptacles for information implies that there is an optimal level of
complexity that is appropriate to any particular decision-making context. Furthermore this level is
“fluid” in the sense that it may change with advances in the design of simulation and inversion
algorithms, and with advances in computing technology. At the same time, it may depend on the size
and talent of human resources on which an institution can draw for model construction and
deployment. Hence the decision for adoption of a certain level of model complexity must be made
anew for each new decision context, and possibly for each decision that pertains to that context.
Moreover, it may need to undergo periodic revision as the modelling process progresses.
Arguments for and against complexity should take all necessary factors into account if they are to
support decisions on how models can best support the decision-making process. This requires a clear
conceptualization of what complexity can provide and what simplicity can provide in any decision-making context. The purpose of the present document is to provide these conceptualizations.
2. Using Models to Make Decisions
2.1 Risk Analysis
In a seminal paper on the role of models in decision-support, Freeze et al (1990) characterize this role
as quantifying the level of risk associated with a particular course of management action. They argue
that for any of the choices that face a decision-maker, an objective function Φ can be roughly calculated
as follows:
Φ = B – C – R
(2.1.1)
In this equation B represents the benefits accruing from this particular choice, C represents the costs
associated with this choice, while R represents the risk of failure. R can be loosely defined as the
probability of failure times the cost of failure. Freeze et al argue that for any particular management
option B and C are known. It is the role of modelling to evaluate R as it pertains to different
management choices. The option with the highest objective function represents the best course of
action.
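As a concrete illustration of equation 2.1.1, the short Python sketch below scores two management options; all of the benefit, cost, failure-probability and failure-cost numbers are invented purely for the example.

```python
# A minimal sketch of equation 2.1.1: phi = B - C - R, where R is the probability of
# failure multiplied by the cost of failure. All numbers are hypothetical.

def objective(benefit, cost, p_failure, failure_cost):
    """Return the risk-adjusted objective function phi = B - C - R."""
    risk = p_failure * failure_cost
    return benefit - cost - risk

options = {
    "option_A": objective(benefit=10.0, cost=4.0, p_failure=0.05, failure_cost=50.0),
    "option_B": objective(benefit=8.0, cost=2.0, p_failure=0.20, failure_cost=50.0),
}
best = max(options, key=options.get)
print(options, "-> preferred option:", best)
```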
It follows immediately that if modelling is to properly serve the decision-making process, its outputs
must be probabilistic. As will be discussed below, it is the very nature of environmental modelling that
expectations of anything other than probabilistic outputs are inconsistent with what modelling can
provide.
The concepts embodied in equation 2.1.1 can be expressed as follows. Associated with any
management course of action is the possibility that an unwanted event, or “bad thing”, may happen.
In the water management context this “bad thing” may, for example, be the occurrence of unduly high
or unduly low groundwater levels at a particular location at a particular time, durations of lower-than-threshold stream flows that exceed those required for maintenance of stream health, or
concentrations of contaminants that exceed human or biotic health limits. Any of these may constitute
failure of a selected environmental management plan. Tolerance of failure is related to the cost of
failure. If the cost is relatively low, then a decision-maker can tolerate a moderate possibility of failure
if this reduces implementation costs for a particular management option. On the other hand, if the
price of failure is high, the probability of its occurrence must be low for a management option to be
acceptable.
Seen in another light, failure of a particular management plan constitutes a type of “null hypothesis”.
The purpose of environmental modelling is to test that hypothesis. Ideally, for a particular
management option to be viable, the modelling process must demonstrate that the failure hypothesis
can be rejected at a level of confidence that is appropriate for the cost associated with such failure.
This concept of the role of modelling in decision-making allows us to define failure of the modelling
process as it supports the decision-making process. Suppose that a modeller concludes that the
probability of occurrence of a bad thing or unwanted event is low. Suppose, however, that the actual
probability of occurrence of that event is greater than quantified by the modeller. This constitutes a
so-called type II statistical error, this being defined as false rejection of a hypothesis. In the present
discussion we declare the occurrence of such an error to constitute failure of the modelling process,
for that process has provided an overly optimistic view of the outcomes of a particular management
option.
It can be argued that the occurrence of a type I statistical error should also be construed as failure of
a modelling exercise. A type I error occurs through failure to reject a hypothesis that is, in fact, unlikely.
It is argued herein that this does not provide a useful definition of modelling failure. As will become
apparent later in this document where model simplification is considered, an inevitable cost of
simplification is (or should be) that model-calculated predictive uncertainty margins may be broader
than would have been calculated by a more complex model. However, as will also be discussed,
excessive run times and numerical instability associated with use of a complex model may preclude
quantification of predictive uncertainty at all. The possible occurrence of a type I statistical error is
thus a price that may need to be paid for quantification of uncertainty. Of course, if quantified
uncertainty bounds are too broad, then the benefits of modelling are lost, for hypotheses of bad things
happening can never be rejected. While we do not classify this as failure herein, we do characterize it
as “unhelpful”.
2.2 Model Predictive Uncertainty
Conceptually, the uncertainties associated with predictions made by an environmental model can be
quantified using Bayes equation. For the sake of brevity we assume that model predictive uncertainty
is an outcome of model parametric uncertainty. “Parameters” in this context can include system
properties that are represented in the model, as well as system stresses and boundary conditions. In
fact they can include any aspect of a model’s construction and deployment of which a modeller is not
entirely certain. The analysis could be extended to include conceptual uncertainties that underpin
design of a model; however, though important, these are not addressed until later in this document
where the concept of model defects is introduced.
Let the vector k denote parameters employed by a model. At this stage we assume that the model is
complex enough to represent all aspects of an environmental system that are salient to a prediction.
The vector k may therefore possess many elements – far more than can be estimated uniquely. Let
P(k) characterize the prior probability distribution of k. As such, P(k) expresses expert knowledge as it
pertains to system properties; it also reflects direct measurements of those properties that may have
been made at a number of locations throughout the domain of the system of interest.
Let the vector h represent measurements of system state that collectively comprise a “calibration
dataset” for a particular model. (The meaning of the term “calibration” will be addressed later in this
document.) Prior estimates of k must be “conditioned” by these measurements if model-generated
counterparts to these measurements are to replicate them to a level that is commensurate with
measurement noise. That is to say, the range of possibilities for k that is expressed by P(k) must be
narrowed in order for the model to be capable of reproducing historical measurements of system
state (i.e. h) when supplied with historical stresses. The outcome of this conditioning process is the
so-called posterior probability distribution of k, denoted herein as P(k|h) (i.e. the probability
distribution of k as conditioned by h). The relationship between the prior and posterior parameter
probability distributions is expressed by Bayes equation, that is
P(k|h) ∝ P(h|k)P(k)
(2.2.1)
In equation 2.2.1 P(h|k) is called the “likelihood function”. It increases with the extent to which model
outputs approach measurements of system state. Thus parameters which give rise to better
reproduction by the model of historical system behaviour are more likely to be representative of those
which exist in reality than those which do not.
Predictions of the future state of a system made by a model are also dependent on the parameter set
k that is used to make them. Naturally, predictions grow in probability as the parameters that the
model employs to make these predictions themselves grow in probability. Notionally, the posterior
probability distribution of a prediction can be constructed by sampling the posterior probability
distribution of parameters and making a model run using each such sample. Sampling of the posterior
parameter distribution can be effected using methodologies such as Markov chain Monte Carlo.
However while high sampling efficiencies can be achieved using Markov chain Monte Carlo where
parameters are few in number, this is not the case where parameters number in the hundreds or
thousands. Sampling efficiencies then become low. Furthermore, where a model takes a long time to
run, the computational burden of sampling P(k|h) becomes impossibly high. Fortunately, however,
alternative, albeit approximate, methods for sampling the posterior parameter probability
distribution are available for use in highly parameterized contexts where model run times are long.
This is further discussed below.
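As a purely conceptual illustration of the conditioning described by equation 2.2.1, the sketch below approximates the posterior distribution of a prediction for a deliberately tiny, two-parameter linear "model" by weighting samples of the prior according to a Gaussian likelihood. Every number in it is invented, and it is not a substitute for the Markov chain Monte Carlo or approximate methods referred to above; it merely shows how history-matching narrows the range of predictive possibilities.

```python
# Toy illustration of Bayes equation: weight prior samples of k by the likelihood P(h|k)
# and observe how the spread of the prediction s = y.k narrows. All values are invented.
import numpy as np

rng = np.random.default_rng(0)
Z = np.array([[1.0, 0.5]])                 # action of the "model" under calibration conditions
y = np.array([0.2, 1.0])                   # sensitivities of the prediction s to the two parameters
k_true = np.array([1.0, -0.5])
noise_sd = 0.05
h = Z @ k_true + rng.normal(0.0, noise_sd, Z.shape[0])   # the "calibration dataset"

prior_sd = 1.0
K = rng.normal(0.0, prior_sd, size=(200_000, 2))         # samples of the prior P(k)
resid = h - K @ Z.T                                      # model-to-measurement misfit
log_like = -0.5 * np.sum((resid / noise_sd) ** 2, axis=1)
w = np.exp(log_like - log_like.max())
w /= w.sum()                                             # normalised likelihood weights

s = K @ y
post_mean = np.sum(w * s)
post_sd = np.sqrt(np.sum(w * (s - post_mean) ** 2))
print("prior mean/sd of s    :", s.mean(), s.std())
print("posterior mean/sd of s:", post_mean, post_sd)
```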
While the two terms on the right of equation 2.2.1 denote probability distributions, conceptually they
can also be viewed as receptacles for information. P(k) expresses information contained in expert
knowledge, including the inherently probabilistic nature of such knowledge as it pertains to complex
environmental systems. On the other hand, P(h|k) expresses information contained in historical
measurements of system state. Where a model employs parameters that are, on the one hand,
informed by expert knowledge and that, on the other hand, induce pertinent model-calculated quantities
to replicate historical observations of system state, it can be considered to be a repository for these
two types of information. As will be further discussed below, this does not mean that its parameters
are unique; in fact parameter nonuniqueness is expressed by the fact that the history-matching
process yields a probability distribution, denoted in equation 2.2.1 as P(k|h). What it does mean is
that, in respecting this posterior parameter probability distribution, the model provides receptacles
for the two types of information that are required for specification of the posterior probability
distribution of a prediction of management interest.
The concept of a model as providing receptacles for information can be extended to better understand
the role of models in the decision-making process. Recall from section 2.1 that a model supports the
decision-making process through the ability it provides to test whether the hypothesis that a particular
bad thing will eventuate can be rejected (at a certain level of confidence). Rejection of the “bad thing
hypothesis” can be made if the occurrence of the bad thing is demonstrably incompatible with either
expert knowledge of system properties and processes, the historical behaviour of the system, or both.
It is the task of the model to hold the information through which the management-related hypothesis
can be tested, and possibly rejected. Conceptually, the hypothesis can be rejected if it is not possible
to find a reasonable (in terms of expert knowledge) set of parameters that allows the model to fit the
calibration dataset. Methods are available through which a model can be deployed directly in this
hypothesis-testing capacity. See Moore et al (2010), Doherty (2015) and section 6.5 of this document.
The idea that an environmental model is more usefully viewed as a repository for information than as
a simulator of environmental processes is eloquently expressed by Kitanidis (2015). This is not to say
that a model’s ability to simulate environmental processes is of no use. Rather it expresses the concept
that a model’s simulation capabilities should serve its primary role as a repository for decision-critical
information rather than the other way round. Hence where a model that is built for use in the decision-making process is reviewed by a modeller’s peers, it is its ability to hold and use information that is
salient to available decision options that should be most carefully scrutinized, and not its (imagined)
ability to “realistically” replicate environmental processes.
2.3 Complex Models and Simple Models
The above discussion has attempted to define the role that models should play in the decision-making
process. At the same time it has presented a metric for failure of a modelling exercise when conducted
in support of that process. This provides an intellectual structure for choosing modelling options. The
choice of what type of model to build in a particular decision context, and how that model should be
deployed to support the decision-making process, should be accompanied by a guarantee that
modelling will not fail in attempting to play this role, using the metric for failure provided above.
From a Bayesian perspective, the task of modelling in decision-support can be viewed as quantifying
the uncertainties of predictions of management interest, while reducing those uncertainties as much
as possible through giving voice to all pertinent information that pertains to the system under study.
This information is comprised of expert knowledge (including direct measurements of system
properties), as well as the historical behaviour of the system. However, because it is more useful in
exploring concepts related to model simplification, the present document adopts a frequentist
perspective, in which the role of the modelling process is depicted as testing the hypothesis that an
unwanted event will occur following implementation of an environmental management strategy that
may precipitate that event. It then attempts to reject the hypothesis of its occurrence through
demonstrable incompatibility of the event with information for which a model provides receptacles.
Conceptually, the more complex a model is, the more receptacles it can provide for information.
Presumably a complex model is endowed with many parameters. It can thus express the
heterogeneity of system properties that govern the processes that are operative within that system.
Expressions of heterogeneity are expressions of expert knowledge. Furthermore, because a complex
model can represent the hydraulic properties of a spatial system at a scale that is commensurate with
field measurements of those properties, these measurements can be used to constrain the expression
of hydraulic property detail in the model at field measurement locations. However, it is important to
realize that while complex models have the ability to represent system detail, they must recognize the
probabilistic nature of such detail, and the fact that direct measurements of system properties can
condition that detail at only a discrete number of points. In general, the greater the detail that is
expressed by a model, the greater is the uncertainty associated with that detail. Stochasticity
(normally involving a large number of model runs) thus becomes integral to expressing the expert
knowledge of which a complex model is the repository.
Because complex models can be endowed with many parameters, adjustment of these parameters
should promulgate a good fit between model outputs and measurements of system state which
comprise a calibration dataset. In theory, complex models can therefore provide receptacles for the
information contained in these measurements. However the transfer of this information to the model
requires that a satisfactory (in terms of measurement noise) level of fit be attained between pertinent
model outputs and field measurements. In practice, this requires use of the model in conjunction with
software such as PEST (Doherty, 2016) that is capable of implementing highly parameterized inversion.
This, in turn, requires that the model be run many times, and that the model’s numerical performance
be good. Unfortunately, complex models are often burdened with long run times; moreover, their
numerical performance is often questionable. History-matching of a complex model to produce a
parameter field of minimum error variance which can be deemed to “calibrate” the model can
therefore be an extremely difficult undertaking. Generating many other parameter fields which also
satisfy calibration constraints in an attempt to sample the posterior parameter probability distribution
of the model can be an impossibly difficult undertaking.
From the above considerations it is apparent that it is difficult, if not impossible, for a complex model
to live up to its decision-support potential. Even in modelling circumstances where complexity is
embraced because of the possibility that it offers to represent detail, rarely is the information content
of expert knowledge given proper expression through using the model in a stochastic framework.
Simple models generally employ fewer parameters than complex models. Generally these parameters
express abstractions of system properties such as large zones of assumed piecewise constancy in a
groundwater model, or lumped storage elements in a land use or surface water model. Expert
knowledge can be difficult to apply to such parameters. Hence the ability of a simple model to provide
receptacles for expert knowledge is limited.
On the other hand, simple models often run fast and are numerically stable. If they are endowed with
enough parameters, it may be possible to provide values for these parameters which support a good
fit between pertinent model outputs and historical measurements of system state. Furthermore,
achievement of this fit can often be accomplished very quickly using inversion software such as PEST.
Simple models therefore constitute good receptacles for information contained in historical
measurements of system state. As will be discussed later in this document, for some predictions this
is all the information that a model needs to carry. However where a prediction is partly sensitive to
combinations of parameters which occupy the calibration null space (see below), the parameters
which comprise these combinations must be represented in the model, even if they cannot be
uniquely estimated, if the uncertainty of the prediction is to be properly quantified. Furthermore, the
simple model’s parameters must be capable of adjustment through a range of values that is
compatible with their prior uncertainties as posterior predictive uncertainty is explored. A problem
with many simple models is that these parameters may not be represented at all in the model, as they
are not required to achieve a good fit with the calibration dataset. Furthermore, even if they are
represented, their prior uncertainties may be difficult to establish because of the abstract nature of
parameters that the simple model employs.
The above discussion attempts to illuminate some fundamental differences between complex and
simple models, as well as the ramifications of these differences for model-based decision-making.
However the discussion omits some important nuances which will be covered later in this document.
The choice of appropriate complexity will always be site-specific, and indeed prediction-specific.
Furthermore “appropriate” must be judged in the context of a given modelling budget. As a
practitioner of the “art of modelling”, the modeller must choose a level of complexity that is best
tuned to his/her decision context and to his/her modelling budget. In doing so he/she must guarantee
that his/her choice is accompanied by model specifications that prevent occurrence of a type II
statistical error in which the hypothesis of a bad thing happening is wrongly rejected.
In some decision contexts it may be possible to build a model that is complex enough to express expert
knowledge yet simple enough to be employed with inversion software that enables a good fit between
model outputs and field data to be obtained. In another context, a modeller may decide to build a
simple model; however the model may be too simple to support a parameterization scheme that
promulgates a good fit between model outputs and historical measurements of system state. The
modeller may then increase parameterization complexity of the model in order to achieve such a fit.
However, as will be shown in following chapters, even if a good fit between model outputs and field
measurements can be obtained, the calibration process may induce bias in some predictions. The
attainment of a “well calibrated model” may therefore actually compromise the utility of the model
in that particular decision context.
It is an inconvenient truth that for all but the simplest environmental systems, it is difficult with current
modelling technology to build a model which simultaneously provides receptacles for the two
information types that are expressed by the two terms on the right hand side of Bayes equation. A
modeller must therefore ask him/herself which type of information should be better expressed by
his/her model, given his/her current decision-making context. Where a prediction is sensitive to
parameters (or parameter combinations) whose uncertainties cannot be reduced much through
history-matching, it is more important that the model provide receptacles for expert knowledge than
for information contained in measurements of system state. In contrast, where a prediction is
sensitive to parameters whose uncertainties can be significantly reduced through history-matching,
then the model must run fast enough, and be numerically stable enough, to be used in concert with
inversion software which provides a good fit between model outputs and historical measurements of
system state; the uncertainty of that prediction can consequently be reduced.
As will be discussed in following chapters, the choice of modelling approach becomes most difficult
where a prediction of management interest is partially sensitive to parameters whose uncertainties
can be reduced through history-matching, and partially sensitive to parameters whose uncertainties
cannot be thus reduced. Unfortunately many predictions of management interest fall into this
category. This is because the modelling imperative often arises from a proposal that management of
a system be altered from its historical state. The system will therefore be exposed to stresses to which
it has not hitherto been exposed. Historical measurements of system state may not inform all
(combinations of) parameters to which model predictions of future system behaviour are sensitive;
however it may inform some of these (combinations of) parameters. In these circumstances, model
design for decision-support becomes very difficult. These difficulties are compounded by the fact that
the requirements of a simple model in this context are more stringent than in contexts where
predictions are very similar in nature to measurements that comprise a calibration dataset. As will be
shown, while a simple model may indeed support a sufficient number of parameters for a good fit to
be obtained between model outputs and field data, the information which is thereby transferred from
these measurements to the model’s parameters may be placed into receptacles that are imperfect
and distorted. The repercussions for some predictions may be small; the repercussions for other
predictions may be dire.
Theory through which an understanding of model simplification can be gained is presented in the
following chapters of this document.
3. History-Matching
3.1 Introduction
This chapter, and the following chapter, introduce theory which, it is hoped, provides insights into
both history-matching and model simplification. These are seen to be closely related. The theory is
descriptive rather than exact, as it is based on an assumption of model linearity, this implying that the
relationship between a model’s outputs and its parameters can be described by a matrix. Most models
are, of course, nonlinear. Nevertheless, their behaviour can be considered to be “locally linear” as long
as parameters are not varied too much from specified starting values. Of more importance to the
present context, however, is the fact that linear analysis supports the use of subspace concepts in
examining the roles played by model calibration and model simplification. As will be seen, the light
that these concepts shed on calibration and simplification is profound.
The presentation of theory in this and the following chapter is relatively brief. Full details can be found
in Doherty (2015). Many of the equations derived below can be evaluated for real-world models using
members of the PEST and PyEMU suites; see Doherty (2016) and White et al (2016) for details. See, in
particular, members of the PEST PREDUNC* and PREDVAR* suite of programs. PREDVAR1C is of
particular importance as it accommodates model defects (see the following chapter), albeit in
restricted situations where so-called “defect parameters” can be identified.
As in the previous chapter, we use the vector k to denote parameters used by a model. To begin with
we consider models which are “complex”, in that they employ enough parameters to represent real
world heterogeneity, and simulate enough processes for their outputs to be uncompromised by any
defects. Let the vector h, once again, denote the calibration dataset. The number of elements of h
may or may not exceed that of k. In a numerical groundwater model, for example, different
properties can be ascribed to every cell of the model grid or mesh (which is indeed representative of
the complexity associated with many groundwater systems). Let the vector ε denote measurement
noise associated with the elements of h. Finally, let the matrix Z denote the action of the model under
calibration conditions. Then
h = Zk + ε
(3.1.1)
The symbol C(k) is used herein to denote the prior covariance matrix of k, this being associated with
the prior probability density function P(k) featured in equation 2.2.1. Let C(ε) denote the covariance
matrix of measurement noise; this is used in calculation of the likelihood function P(h|k) of equation
2.2.1. We express the interrelationship between the k and ε vectors and their respective covariance
matrices using the expressions
k ~ C(k)
(3.1.2a)
ε ~ C(ε)
(3.1.2b)
In most cases of interest C(ε) is diagonal as the noise associated with any one measurement is
considered to be independent of that associated with any other. However this does not have to be
the case. Indeed it is not the case if noise is “structural” in origin – a matter which will be discussed
later in this document. Let the scalar s denote a prediction of interest made by the model, and let the
sensitivities of this prediction to model parameters be encapsulated in the vector y. Then
s = yᵗk
(3.1.3)
where the superscript “t” denotes matrix/vector transpose.
Before proceeding, we remind the reader of a matrix relationship used to express propagation of
variance. Let x be a random vector with covariance matrix C(x). Let y be calculable from x through a
linear relationship involving the matrix A. That is
y = Ax
(3.1.4)
It is easily shown (see, for example, Koch, 1999) that the covariance matrix C(y) of y is calculated from
that of x through the relationship
C(y) = AC(x)Aᵗ
(3.1.5)
Applying this to equation 3.1.3, the prior covariance matrix of s (which is the variance of s, as s is a
scalar) becomes
σ²ₛ = yᵗC(k)y
(3.1.6)
(Recall that variance is the square of standard deviation.) For a real (nonlinear) model, the variance of
predictive uncertainty would be calculated by drawing samples from P(k), running the model for each
sample, and building an empirical probability density function for s.
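A minimal numpy sketch of equation 3.1.6 follows; C(k) and y are arbitrary illustrative values, and the Monte Carlo calculation simply confirms that yᵗC(k)y reproduces the variance of s computed from samples of the prior.

```python
# Sketch of equation 3.1.6: prior predictive variance sigma_s^2 = y^t C(k) y.
# The covariance matrix and sensitivity vector are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
C_k = np.array([[1.0, 0.3, 0.0],
                [0.3, 1.0, 0.2],
                [0.0, 0.2, 0.5]])          # prior parameter covariance matrix C(k)
y = np.array([0.5, -1.0, 2.0])             # sensitivities of the prediction to the parameters

var_linear = y @ C_k @ y                   # equation 3.1.6

# Monte Carlo confirmation: draw k from P(k) and compute s = y^t k for each sample.
k_samples = rng.multivariate_normal(np.zeros(3), C_k, size=100_000)
var_mc = (k_samples @ y).var()

print(var_linear, var_mc)                  # the two estimates agree to sampling precision
```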
3.2 Linearized Bayes Equation
Equation 3.1.6 depicts the uncertainty of a prediction made by a model which has not been calibrated;
this is the so-called “prior uncertainty” of the prediction. As was discussed in the preceding chapter,
application of Bayes equation to the history-matching process restricts the range of predictive
possibilities to those that are calculated using parameter sets that match the calibration dataset h to
within a tolerance set by the noise ε that is associated with h. It can be shown (Doherty, 2015), that
either of the following, mathematically equivalent, equations can be used to calculate the posterior
covariance matrix Cʹ(k) of parameters k used by the model. Cʹ(k) is thus the covariance matrix of these
parameters subject to the conditioning effects of history-matching.
Cʹ(k) = C(k) – C(k)Zᵗ[ZC(k)Zᵗ + C(ε)]⁻¹ZC(k)
(3.2.1a)
Cʹ(k) = [ZᵗC⁻¹(ε)Z + C⁻¹(k)]⁻¹
(3.2.1b)
The posterior uncertainty variance of a prediction s made by the model, which we denote as σₛʹ², is
then readily calculated using either of the following equations.
σₛʹ² = yᵗC(k)y – yᵗC(k)Zᵗ[ZC(k)Zᵗ + C(ε)]⁻¹ZC(k)y
(3.2.2a)
σₛʹ² = yᵗ[ZᵗC⁻¹(ε)Z + C⁻¹(k)]⁻¹y
(3.2.2b)
Note that all of equations 3.2.1a to 3.2.2b depend on an assumption of model linearity; they also
assume that prior parameter probabilities and measurement noise have multi-Gaussian probability
distributions. A comparison of equation 3.2.2a with equation 3.1.6 shows that history-matching
reduces the range of predictive possibilities, thereby ensuring that σₛʹ² is no greater than σ²ₛ.
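For a small invented linear problem, the sketch below evaluates equations 3.2.1a/b and 3.2.2a/b and confirms numerically that the two mathematically equivalent forms agree, and that posterior predictive variance does not exceed the prior variance of equation 3.1.6.

```python
# Sketch of equations 3.2.1a/b (posterior parameter covariance) and 3.2.2a/b (posterior
# predictive variance) for an arbitrary small linear problem. All matrices are invented.
import numpy as np

rng = np.random.default_rng(2)
n_par, n_obs = 6, 4
Z = rng.normal(size=(n_obs, n_par))        # sensitivities of model outputs to parameters
C_k = np.eye(n_par) * 2.0                  # prior parameter covariance C(k)
C_e = np.eye(n_obs) * 0.1                  # measurement noise covariance C(eps)
y = rng.normal(size=n_par)                 # prediction sensitivities

G = Z @ C_k @ Z.T + C_e
Ck_post_a = C_k - C_k @ Z.T @ np.linalg.solve(G, Z @ C_k)                     # eq 3.2.1a
Ck_post_b = np.linalg.inv(Z.T @ np.linalg.inv(C_e) @ Z + np.linalg.inv(C_k))  # eq 3.2.1b

var_prior = y @ C_k @ y                    # equation 3.1.6
var_post_a = y @ Ck_post_a @ y             # equation 3.2.2a
var_post_b = y @ Ck_post_b @ y             # equation 3.2.2b
print(np.allclose(Ck_post_a, Ck_post_b), var_prior, var_post_a, var_post_b)
```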
Where a model is nonlinear, calculation of posterior parameter and predictive probabilities is far more
difficult than implementing equations 3.2.1 and 3.2.2. Methods such as Markov chain Monte Carlo
must be used. As has already been stated, these become numerically intractable where model run
times are large and/or where parameter numbers are high. Nevertheless, provided that sensitivities
encapsulated in the Z matrix and y vector are calculable, equations 3.2.1 and 3.2.2 can provide useable
approximations to posterior parameter and predictive uncertainty. They can also be used to calculate
value-added quantities of modelling interest such as
• contributions to predictive uncertainty by different parameters or parameter types;
• worth of various types of existing or posited data in reducing uncertainties of predictions of
management interest.
Calculations such as these and others are made easier by the fact that only sensitivities, and not the
actual values of parameters k or measurements h, appear in equations 3.2.1 and 3.2.2. Hence the
worth of data in reducing uncertainty can be calculated prior to actual acquisition of that data. This
provides a powerful basis for choosing between different data acquisition strategies. See Dausman et
al (2010) and Wallis et al (2014) for examples of application of these equations.
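A sketch of such a data worth calculation follows: the posterior predictive variance of equation 3.2.2b is computed with and without a posited new measurement, whose sensitivities simply form an extra row of Z and whose noise variance extends C(ε). All matrices are invented for the example.

```python
# Sketch of a data worth calculation: the reduction in posterior predictive variance
# (equation 3.2.2b) achieved by adding a posited new observation. Values are invented.
import numpy as np

def predictive_variance(Z, C_k, C_e, y):
    """Posterior predictive variance computed using equation 3.2.2b."""
    Ck_post = np.linalg.inv(Z.T @ np.linalg.inv(C_e) @ Z + np.linalg.inv(C_k))
    return y @ Ck_post @ y

rng = np.random.default_rng(3)
Z = rng.normal(size=(4, 6))                          # existing observation sensitivities
C_k = np.eye(6) * 2.0
C_e = np.eye(4) * 0.1
y = rng.normal(size=6)

base = predictive_variance(Z, C_k, C_e, y)

new_row = rng.normal(size=(1, 6))                    # sensitivities of a posited new measurement
Z_plus = np.vstack([Z, new_row])
C_e_plus = np.diag(np.append(np.diag(C_e), 0.1))     # noise variance of the new measurement
with_new = predictive_variance(Z_plus, C_k, C_e_plus, y)

print("predictive variance without / with the posited observation:", base, with_new)
```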
3.3 Model Calibration
3.3.1 Some Concepts
In most contexts of real-world model deployment, history-matching is achieved through model
calibration rather than through conditioning using Bayesian methodologies. Model calibration seeks
a unique solution to a normally ill-posed inverse problem. Obviously, uniqueness is an artificiality;
basic Bayesian considerations expose the fact that parameters and predictions are uncertain both
before and after history-matching. However, numerically, solution of an ill-posed inverse problem is
easier to undertake than calculation of a posterior probability distribution.
(It is worth noting that the pivotal role played by the “calibrated model” in model-based decision-support, as it is undertaken at the present time, is rooted more in an age-old quest for insights into an
unknown, but possibly dark, future than in a theory-based understanding of what history-matching
can actually deliver. Theory presented below demonstrates that the commonly-held view that
predictions made by a calibrated model cannot be too wrong has no mathematical basis.)
Environmental systems are complex. Hence the model parameter vector k has many elements. Rarely
is there a unique solution to the inverse problem of model calibration. Thus a model can be
“calibrated” in many ways; that is to say, there are infinitely many parameter sets which allow a model
to fit the calibration dataset h. Ideally, however, the parameter set k̲ which is accepted as the
“calibrated parameter set” (an underbar is used herein to distinguish an estimated parameter set from
the real but unknown parameter set k) should have a certain property which guarantees its uniqueness
and its usefulness. This property is that of minimized error variance. In other words, the parameter set
k̲ which is deemed to calibrate a model should achieve uniqueness not through any claim to correctness
in the sense that it resembles the real parameter set k. Its claim to uniqueness should be based on the
notion that its error as an estimate of k has been minimized. This error may still be large, for it is set by
the information content of the calibration dataset (which may be small).
Notionally, minimization of the variance of parameter error k̲ – k is achieved through trying to estimate
a k̲ that lies somewhere near the centre of the posterior probability distribution of k. The potential for
error of k̲ is therefore minimized through making this potential symmetrical with respect to k̲. In doing
so, both k̲, and predictions s̲ that are made on the basis of k̲, can be considered to be unbiased.
In mathematical parlance, the means through which uniqueness of solution of an ill-posed inverse
problem is achieved is termed “regularisation”. Calibration methodologies differ in the way that
regularisation is achieved. However the metric by which one regularisation method can be judged to
be superior to another regularisation method is the guarantee that it provides that the k̲ which it
calculates is indeed of minimized error variance. In practice, claims to minimized error variance cannot
be verified, especially in complex parameterization contexts where prior parameter probability
distributions have no analytical description and must be sampled through geostatistical means.
Nevertheless, most methods of numerical regularisation have mathematical foundations which seek
to provide a practical guarantee that the parameter set which is calculated through their use is “unbiased
enough” for predictions made using that parameter set to be “close enough” to minimum error
variance.
It is obvious from the above discussion that a “calibrated model” does not, of itself, constitute a suitable basis
for decision-support, for it provides no means to assess the risk of unwanted environmental events,
this being central to a model’s role in decision-support. See the discussion in chapter 2 of this
document. Calculation of a parameter set k̲ of minimized error variance supports calculation of a
prediction s̲ of minimized error variance. This should be considered as the first step in a two-step
process of quantifying that error variance as a substitute for quantifying posterior predictive
uncertainty. Doherty (2015) shows that post-calibration predictive error variance is larger than
posterior predictive uncertainty variance. However it is generally not too much larger and, provided
certain conditions are met, is much easier to evaluate.
3.3.2 Regularisation
As was stated in the above section, the means through which a unique solution is attained to a
fundamentally non-unique inverse problem is termed “regularisation”. In many instances of model
usage, regularisation is implemented manually. A modeller first defines a parsimonious parameter set
p by combining or lumping elements of k. In the groundwater modelling context this can be achieved
by defining a suite of zones of assumed piecewise constancy that collectively span the model domain.
In other modelling contexts regularisation may be achieved by fixing many of the elements of k so that
only a few are exposed to adjustment through the calibration process. Alternatively, groups of
parameters may be linked so that relativity of their values is preserved while grouped values are
adjusted. Regardless of how manual regularisation is implemented, the elements of p should be few
enough in number to be uniquely estimable. In achieving their estimation the model is deemed to be
calibrated.
Despite the ubiquitous use of manual regularisation in model calibration, it should be used with
caution. It offers no mathematical guarantee that the parameter field achieved through solution of
the inverse problem is indeed of minimum error variance. Hence the calibration process may actually
engender bias in some model predictions, a matter that will be discussed later in this document.
Secondly, to the extent that a model prediction is sensitive to elements of k which cannot be
estimated, and are therefore removed from the model in order to achieve parameter uniqueness, the
error variance of that prediction will be under-estimated. If this is a decision-critical prediction, a type
II statistical error may therefore follow.
Model simplification can also be considered as a form of regularisation – a matter which will be
discussed extensively in the next chapter of this document. A simplified model generally employs far
fewer parameters than are required to depict the potential for spatial heterogeneity of hydraulic and
other properties throughout a study area. The same problems may thus be incurred through
calibration and deployment of a simplified model as may be incurred through suboptimal manual
regularisation of a complex model parameter field, namely induction of predictive bias and failure to
fully quantify predictive uncertainty. However, as will be shown below, this depends on the prediction;
some predictions made by a simplified model may be afflicted by these problems, while other
predictions made by the same model may be free of them.
Non-manual regularisation is achieved mathematically, and implemented numerically, as part of the
calibration process itself. There are two broad approaches to mathematical regularisation. These can
be broadly labelled as Tikhonov and subspace methods, singular value decomposition (SVD) being the
best-known means to implement the latter approach.
Tikhonov regularisation achieves uniqueness by supplementing the information content of the
calibration dataset with expert knowledge as it applies to parameter values and spatial/temporal
relationships between parameter values. Expert knowledge is therefore given mathematical voice,
and included in the inverse problem solution process, in a way that attempts to guarantee parameter
uniqueness at the same time as it attempts to promulgate numerical stability in solution of the ill-posed inverse problem. Claims to minimum error variance solution of that problem are based on the
role awarded to expert knowledge in defining a prior minimum error variance parameter set, and on
the use of inversion algorithms that seek a parameter set that departs as little as possible from that
prior parameter set in order to support a good fit of pertinent model outputs with the calibration
dataset. Thus any unsupported heterogeneity is suppressed. Of course the real world is
heterogeneous. But expressions of heterogeneity in a calibrated parameter field that are not
supported by the calibration dataset run the risk of being incorrect. Such a parameter field therefore
loses its claim to minimized parameter error variance.
To put it another way, Tikhonov regularisation seeks parameter heterogeneity that MUST exist to
explain a measurement dataset. Meanwhile, exploration of the heterogeneity that MAY exist, and that
is consistent with the measurement dataset, is the task of post-calibration error/uncertainty analysis.
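As a deliberately minimal illustration of the Tikhonov idea described above, the numpy sketch below estimates parameters that fit an invented dataset while departing as little as possible from a preferred (prior) parameter set; the matrices, prior values and regularisation weight are assumptions made purely for the example, and practical implementations (such as that provided by PEST) are considerably more sophisticated.

```python
# Sketch of Tikhonov-style regularised inversion: fit the data while penalising departures
# of the estimated parameters from a preferred parameter set. All values are illustrative.
import numpy as np

rng = np.random.default_rng(7)
n_obs, n_par = 5, 12                       # ill-posed: more parameters than observations
Z = rng.normal(size=(n_obs, n_par))
k_prior = np.ones(n_par)                   # preferred values supplied by expert knowledge
h = Z @ rng.normal(1.0, 0.3, n_par)        # "observed" data generated by a heterogeneous k

alpha = 0.1                                # strength of the preference for the prior values
# Minimise ||h - Z k||^2 + alpha ||k - k_prior||^2
A = Z.T @ Z + alpha * np.eye(n_par)
k_est = k_prior + np.linalg.solve(A, Z.T @ (h - Z @ k_prior))

print(np.linalg.norm(h - Z @ k_est))       # misfit is small ...
print(np.linalg.norm(k_est - k_prior))     # ... while unsupported heterogeneity is suppressed
```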
In contrast to Tikhonov regularisation, subspace methods seek parameter uniqueness through
parameter simplification. These methods are discussed in detail in the present document as model
simplification can be considered to be a form of parameter simplification. Exposition of the theory
which underpins use of subspace methods in model calibration therefore provides insights into the
positive and negative outcomes of model simplification. It is to exposition of this theory that we now
turn.
3.4 Calibration through Singular Value Decomposition
3.4.1 Preconditions
In order to simplify the following discussion, we will assume that the matrix Z which features in
equation 3.1.1 has more columns than rows. We thus assume that parameters comprising the vector
k outnumber observations comprising the vector h. This guarantees that the inverse problem of
estimation of k is ill-posed. However the discussion that follows is just as pertinent to cases where the
elements of h outnumber those of k. Most such problems are also ill posed. However if an inverse
problem is well-posed, the methods outlined below can still be used for solution of that problem. In
all cases a solution of minimum error variance is found for the inverse problem – provided certain
conditions are met.
To further simplify the mathematical presentation below, the following assumptions are made. Both
of these are crucial to achievement of a solution of minimum error variance to the inverse problem
posed by equation 3.1.1 where that problem is solved using singular value decomposition.
C(k) = σ²ₖI
(3.4.1)
C(ε) = σ²εI
(3.4.2)
Equation 3.4.1 states that elements of k are statistically independent from an expert knowledge point
of view. In practice this assumption is often violated – especially in groundwater model domains where
hydraulic properties are expected to show a high degree of spatial correlation, so that C(k) has off-diagonal elements. In principle this problem can be overcome through estimating a set of transformed
parameters j, and then back-transforming to k after a minimum error variance solution for j has been
obtained. The transformation from k to j must be such as to provide j with the properties defined in
equation 3.4.1; such a transformation is known as a Karhunen-Loève transformation. Where spatial
variability of k can be specified through a covariance matrix C(k), this transformation is easily
formulated through undertaking singular value decomposition of this matrix; see Doherty (2015) for
details.
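The sketch below illustrates the Karhunen-Loève idea for an invented five-parameter covariance matrix: decomposition of C(k) yields a transformation between statistically independent parameters j and spatially correlated parameters k.

```python
# Sketch of a Karhunen-Loeve transformation: decompose an illustrative correlated C(k)
# so that the transformed parameters j satisfy equation 3.4.1 (identity covariance).
import numpy as np

rng = np.random.default_rng(4)
x = np.arange(5.0)                                     # five points along a line
C_k = np.exp(-np.abs(x[:, None] - x[None, :]) / 2.0)   # illustrative spatial covariance

W, S, _ = np.linalg.svd(C_k)             # for symmetric C(k): C(k) = W diag(S) W^t
T = W @ np.diag(np.sqrt(S))              # k = T j maps independent j to correlated k

j = rng.normal(size=(100_000, 5))        # j ~ N(0, I)
k = j @ T.T                              # back-transformed samples of k
print(np.abs(np.cov(k, rowvar=False) - C_k).max())   # small: sampling error only
```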
Equation 3.4.2 states that measurement noise is homoscedastic and uncorrelated. In practice, an
observation dataset will normally be comprised of different data types, with different levels of noise
associated with different types of measurements. Also, even for the same data type, some
measurements will be more reliable than others. Fortunately, this situation is easily accommodated
by applying weights to different measurements and slightly re-formulating the inverse problem in
accordance with this weighting scheme. Ideally, minimum error variance of the estimated parameters k̲ is
achieved through use of a weight matrix Q that is designed according to the specification
Q = σ²ᵣC⁻¹(ε)
(3.4.3)
The inverse problem is then reformulated in terms of Q½h instead of h. However, in order to
maintain simplicity of the following equations, (3.4.2) will be assumed.
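A sketch of this weighting strategy follows; the noise variances are invented, and the weight matrix is taken as proportional to C⁻¹(ε) in accordance with equation 3.4.3 (with σ²ᵣ set to 1).

```python
# Sketch of observation weighting: transform a problem with heteroscedastic (but
# uncorrelated) noise into one that satisfies equation 3.4.2. Values are illustrative.
import numpy as np

noise_var = np.array([0.01, 0.04, 0.25])      # differing noise variances for three observations
C_e = np.diag(noise_var)
Q = np.linalg.inv(C_e)                        # weight matrix Q, proportional to C^-1(eps)
Q_half = np.sqrt(Q)                           # Q^(1/2) for a diagonal Q

Z = np.array([[1.0, 0.2],
              [0.3, 1.0],
              [0.5, 0.5]])
h = np.array([1.1, 0.7, 0.9])

Z_w = Q_half @ Z                              # weighted sensitivities
h_w = Q_half @ h                              # weighted observations: their noise is now ~ N(0, I)
print(Z_w, h_w)
```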
3.4.2 Singular Value Decomposition
Singular value decomposition (SVD) can be applied to any matrix. Singular value decomposition of the
matrix Z can be formulated as follows.
Z = USVᵗ
(3.4.4)
In equation 3.4.4, U is an orthonormal matrix whose columns span the range space of Z. V is an
orthonormal matrix whose columns span parameter space. S is a rectangular matrix whose diagonal
elements are comprised of singular values. These are either positive or zero; normally they are
arranged down the diagonal from highest to lowest. Off-diagonal elements of S are zero.
An orthonormal matrix has columns which are unit vectors which are orthogonal to each other. It has
the very useful property that its transpose is its inverse. Hence, for the orthonormal matrix V,
VᵗV = VVᵗ = I
(3.4.5)
Suppose, as will be done shortly, V is partitioned as
V = [V₁ V₂]
(3.4.6)
Then
V₁ᵗV₁ = I
(3.4.7)
However V₁V₁ᵗ is not equal to I. Rather, it is an orthogonal projection operator into a subspace spanned
by the columns of V₁. See texts such as Menke (2012) and Aster et al (2013) for further details.
If partitioning of V according to equation 3.4.6 is performed in such a way that the number of columns
of V₁ is equal to the number of non-zero singular values in S (see equation 3.4.4), then the columns of
V₂ span the null space of the matrix Z. The null space of this matrix is comprised of vectors δk for which
Zδk = 0
(3.4.8)
Meanwhile the minimum error variance solution k̲ to the inverse problem defined by equation 3.1.1
can be calculated as
k̲ = V₁S₁⁻¹U₁ᵗh
(3.4.9)
The use of U₁ instead of U in equation 3.4.9 follows from the fact that partitioning of V into [V₁ V₂]
implies a complementary partitioning of U into [U₁ U₂] and of S into S₁ and S₂, with S₁ containing the
singular values associated with the columns of V₁.
Equation 3.4.9 states that the solution k̲ to the inverse problem of model calibration lies within the
subspace spanned by the columns of V₁. This subspace, the orthogonal complement to the null space,
is referred to herein as the “solution space” of Z. k̲ is, in fact, the orthogonal projection of the unknown
parameter set k onto the solution space of Z. It is the minimum error variance solution to the inverse
problem defined by (3.1.1). It gains this status from the fact that k̲ includes no null space components
of k. Hence it includes no features that are not supported by the calibration dataset h. By confining
itself to the solution space, and admitting no null space components, calculation of k̲ using equation
3.4.9 avoids the risk of “wandering into the null space in the wrong direction”. k̲ therefore constitutes
the safest, and therefore minimum error variance, parameter vector which allows the model to fit the
calibration dataset. See Moore and Doherty (2005) for further details.
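The numpy sketch below applies equations 3.4.4 to 3.4.9 to a small invented ill-posed problem: Z is decomposed, V is partitioned into V₁ and V₂, a null-space vector is verified to satisfy equation 3.4.8, and the estimated parameter set k̲ is compared with the projection of the "true" parameter set onto the solution space.

```python
# Sketch of equations 3.4.4 to 3.4.9: SVD of Z, partitioning of V, and the truncated-SVD
# (minimum error variance) solution. Z, k and the noise level are arbitrary choices.
import numpy as np

rng = np.random.default_rng(5)
n_obs, n_par = 4, 8                                # more parameters than observations: ill-posed
Z = rng.normal(size=(n_obs, n_par))
k_true = rng.normal(size=n_par)
h = Z @ k_true + rng.normal(0.0, 0.001, n_obs)     # equation 3.1.1 with a little noise

U, s, Vt = np.linalg.svd(Z, full_matrices=True)    # Z = U S V^t (equation 3.4.4)
n_sol = np.sum(s > 1e-10)                          # number of non-zero singular values
V1, V2 = Vt[:n_sol].T, Vt[n_sol:].T                # V = [V1 V2] (equation 3.4.6)
U1, S1_inv = U[:, :n_sol], np.diag(1.0 / s[:n_sol])

print(np.abs(Z @ V2[:, 0]).max())                  # a null-space vector: Z dk = 0 (equation 3.4.8)

k_est = V1 @ S1_inv @ U1.T @ h                     # equation 3.4.9
k_proj = V1 @ V1.T @ k_true                        # projection of k onto the solution space
print(np.abs(k_est - k_proj).max())                # small: differs only by the noise term of 3.4.10
```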
The situation can be visualized using figure 3.1. Suppose that parameter space has three dimensions;
that is, the number of elements comprising the vector k is 3. Singular value decomposition of Z yields
the orthogonal unit vectors v₁, v₂ and v₃ which collectively span parameter space. These do not, in
general, point in the same direction as the vectors [1 0 0]ᵗ, [0 1 0]ᵗ and [0 0 1]ᵗ. Hence each of the
vectors v₁, v₂ and v₃ is effectively an orthogonal combination of the parameters k₁, k₂ and k₃
comprising the vector k. v₁, v₂ and v₃ define new axes in parameter space.
Suppose that, for this particular example, the dimensionality of the solution space is 2 while that of
the null space is 1. The relationship between the calculated k and the unknown k is depicted in the
figure.
Figure 3.1 Relationship between the estimated parameter set $\bar{k}$ and the real-world parameter set k. From Doherty (2015).
3.4.3 Quantification of Parameter and Predictive Error
In practice, the optimal dimensionality of the solution space is not defined by the number of non-zero
singular values in S, for this takes no account of the presence of noise in the calibration dataset. This
will now be shown.
Through substitution of equation 3.1.1 into equation 3.4.9 we obtain
$\bar{k} = V_1S_1^{-1}U_1^tZk + V_1S_1^{-1}U_1^t\varepsilon$
(3.4.10)
From (3.4.4) and the orthogonality of $V_1$ and $V_2$ this becomes
$\bar{k} = V_1V_1^tk + V_1S_1^{-1}U_1^t\varepsilon$
(3.4.11)
Parameter error can therefore be calculated as
$\bar{k} - k = -(I - V_1V_1^t)k + V_1S_1^{-1}U_1^t\varepsilon$
(3.4.12)
Through use of the following relationship
$V_1V_1^t + V_2V_2^t = I$
(3.4.13)
equation 3.4.12 can be re-written as
$\bar{k} - k = -V_2V_2^tk + V_1S_1^{-1}U_1^t\varepsilon$
(3.4.14)
Equation 3.4.14 makes it clear that the error in the parameter set $\bar{k}$ obtained through model calibration has two orthogonal components; see figure 3.2. The first contribution is expressed by the first term on the right of equation 3.4.14. This is the "cost of uniqueness" term, the error accrued through eschewing the null space when seeking a minimum error variance solution to the inverse problem. The second term arises from noise in the calibration dataset. Where an inverse problem is well posed, this is the only source of error in the calibrated parameter field. Where manual regularisation is employed to formulate a well-posed inverse problem from an ill-posed inverse problem, this is the only "visible" source of parameter error. However, parameter (and predictive) error may be seriously underestimated if the first term is ignored. An advantage of undertaking regularisation numerically instead of manually is that this term is, in theory, calculable, because parameter complexity is retained in solving the inverse problem. Where regularisation is undertaken manually, it is not.
Figure 3.2. The two contributions to error in the estimated parameter set $\bar{k}$. (From Doherty, 2015).
Parameter error $\bar{k} - k$ cannot be calculated (and therefore corrected), as both k and ε are unknown. However the potential for parameter error can be expressed through the covariance matrix of parameter error. If equation 3.1.5 is applied to equation 3.4.14 we obtain
$C(\bar{k} - k) = V_2V_2^tC(k)V_2V_2^t + V_1S_1^{-1}U_1^tC(\varepsilon)U_1S_1^{-1}V_1^t$
(3.4.15)
From (3.4.1) and (3.4.2) this becomes
$C(\bar{k} - k) = \sigma_k^2V_2V_2^t + \sigma_\varepsilon^2V_1S_1^{-2}V_1^t$
(3.4.16)
Suppose now that we wish to make a prediction s using the calibrated model. The correct value of this prediction is given by equation 3.1.3. On the other hand, the prediction $\bar{s}$ made by the calibrated model is given by
$\bar{s} = y^t\bar{k}$
(3.4.17)
Predictive error is thus calculated as
$\bar{s} - s = y^t(\bar{k} - k)$
(3.4.18)
Using equation 3.1.5, predictive error variance can be calculated as
$\sigma^2_{\bar{s}-s} = y^tC(\bar{k} - k)y$
(3.4.19)
If (3.4.16) is substituted into (3.4.19) we obtain
$\sigma^2_{\bar{s}-s} = \sigma_k^2y^tV_2V_2^ty + \sigma_\varepsilon^2y^tV_1S_1^{-2}V_1^ty$
(3.4.20)
Moore and Doherty (2005) discuss equation 3.4.20 in detail. This equation can be used to define
optimal partitioning of parameter space into solution and null subspaces. If the error variance of a
prediction is plotted against the number of pre-truncation singular values (with “truncation” referring
to the location at which partitioning of V into V1 and V2 occurs), a graph such as that shown in figure
3.3 is obtained.
[Figure 3.3 plots total predictive error variance $\sigma^2_{\bar{s}-s}$ against the number of singular values, showing the falling null space term, the rising solution space term, and the benefit of calibration.]
Figure 3.3. Predictive error variance as a function of number of singular values used in the inversion
process. (From Doherty, 2015.)
(Note that, in general, the goodness of fit achieved with the calibration dataset increases with the number of singular values assigned to the solution space. The horizontal axis of the graph shown in figure 3.3 can therefore be labelled "goodness of fit". With this as the independent variable, the graph can also be drawn for other forms of regularisation, including manual regularisation.)
The use of zero singular values in figure 3.3 is equivalent to not calibrating the model at all. Predictive error variance under these circumstances is equivalent to the variance of prior predictive uncertainty; see equation 3.1.6. The first term on the right of equation 3.4.20 falls monotonically as the number of pre-truncation singular values increases; this is referred to as the "null space term" in figure 3.3. At the same time, the second term rises monotonically, slowly at first and then very rapidly because of the presence of the $S_1^{-2}$ component of this term; this is referred to as the "solution space term" in figure 3.3. At some point predictive error variance is minimized. The difference between the error variance at this point and that calculated for truncation at zero singular values records the reduction in predictive error variance, relative to its pre-calibration value, that is accrued through the calibration process. Where a prediction is sensitive to solution space parameter components this reduction, and hence the benefit of history-matching, can be large. In contrast, where a prediction is sensitive predominantly to null-space parameter components, the benefits of calibration may be very small. For all predictions made by the model, the minimum of the predictive error variance curve, no matter how shallow, will occur at about the same number of singular values. Doherty (2015) shows that the minimized predictive error variance will be slightly greater than the predictive uncertainty variance calculated using the linearized form of Bayes equation (see equations 3.2.2).
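The behaviour depicted in figure 3.3 can be explored numerically. The sketch below (with invented values of Z, y, $\sigma_k^2$ and $\sigma_\varepsilon^2$) evaluates the two terms of equation 3.4.20 at every possible truncation point and reports the truncation level at which their sum is minimized.

```python
# Sketch: the two terms of equation 3.4.20 as a function of truncation point.
import numpy as np

def error_variance_terms(Z, y, sigma2_k, sigma2_eps):
    """Null-space and solution-space terms of equation 3.4.20 for each truncation level."""
    U, s, Vt = np.linalg.svd(Z)              # full SVD so that V spans all of parameter space
    V = Vt.T
    null_term, soln_term = [], []
    for n in range(len(s) + 1):              # n = number of pre-truncation singular values
        V1, V2, s1 = V[:, :n], V[:, n:], s[:n]
        p1 = V1.T @ y
        null_term.append(sigma2_k * float(y @ V2 @ V2.T @ y))        # sigma2_k * y^t V2 V2^t y
        soln_term.append(sigma2_eps * float((p1 / s1) @ (p1 / s1)))  # sigma2_eps * y^t V1 S1^-2 V1^t y
    return np.array(null_term), np.array(soln_term)

rng = np.random.default_rng(2)
Z = rng.standard_normal((12, 20))            # 12 observations, 20 parameters (invented)
y = rng.standard_normal(20)                  # prediction sensitivities (invented)
null_t, soln_t = error_variance_terms(Z, y, sigma2_k=1.0, sigma2_eps=0.01)
total = null_t + soln_t
print("error variance minimized at", int(np.argmin(total)), "singular values")
```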
The above theory illustrates why "over-fitting" should be avoided. A fit with the calibration dataset should be sought which is no better than that dictated by the potential for error in the measurements which comprise it. That is, the level of model-to-measurement misfit should respect measurement noise. Equation 3.4.20 shows that the information contained in a calibration dataset becomes more and more contaminated by associated measurement noise as a modeller tries to expand the dimensionality of the solution space. At some point, the potential for parameter and predictive error incurred by measurement noise as the solution space is expanded becomes greater than the reduction in parameter and predictive error incurred through reducing the dimensionality of the null space.
3.4.4 Model-to-Measurement Misfit
A residual is the difference between a measurement and the corresponding quantity calculated by a
model. For a calibrated model, the vector of residuals r is calculated as
$r = h - Z\bar{k}$
(3.4.21)
Substitution of (3.4.11) into (3.4.21) yields
$r = h - ZV_1V_1^tk - ZV_1S_1^{-1}U_1^t\varepsilon$
(3.4.22)
Following substitution of equation 3.1.1 for h this becomes
$r = Zk + \varepsilon - ZV_1V_1^tk - ZV_1S_1^{-1}U_1^t\varepsilon$
(3.4.23)
Subjecting Z to singular value decomposition, and noting that
$Z = U_1S_1V_1^t + U_2S_2V_2^t$
(3.4.24)
equation 3.4.23 becomes
$r = U_2S_2V_2^tk + (I - U_1U_1^t)\varepsilon$
(3.4.25)
Through implementation of the relationship
$U_1U_1^t + U_2U_2^t = I$
(3.4.26)
we finally obtain
$r = U_2S_2V_2^tk + U_2U_2^t\varepsilon$
(3.4.27)
so that (using equations 3.1.5, 3.4.1 and 3.4.2) the covariance matrix of residuals is calculated as
$C(r) = \sigma_k^2U_2S_2^2U_2^t + \sigma_\varepsilon^2U_2U_2^t$
(3.4.28)
Both terms of equation 3.4.28 fall monotonically as the number of singular values employed for
solution of the inverse problem (i.e. the number of pre-truncation singular values) increases. Doherty
(2015) shows that the diagonal elements of C(r) should all be less than twice $\sigma_\varepsilon^2$ if truncation takes
place at that point where predictive error variance is minimized. If weights are chosen in accordance
with equation 3.4.3, the calibration objective function (sum of weighted squared residuals) should
thus be between one and two times the number of observations comprising the calibration dataset.
4. Accommodating Model Defects
4.1 Introduction
Implied in the theoretical developments of the previous chapter is the notion that a numerical model
contains no defects, and that all predictive error arises from the quest for uniqueness and from the
presence of noise in the measurement dataset. It is also implied that the potential for predictive error
(that is predictive error variance) can be quantified, and that this provides a conservative estimate of
posterior predictive uncertainty. The latter is assumed to be a quantity that depends only on the
information content of expert knowledge and of historical measurements of system state. Both of these are contaminated by "noise".
In fact, all models have defects because all models are simplifications of reality. Simpler models have
greater defects. Simple models may not provide receptacles for all of the information that is resident
in expert knowledge and in historical measurements of system state. At the same time, the receptacles
that they do provide for this information may be imperfect, so that when the model is used to make
a prediction, that prediction is accompanied by bias. Worse still, the extent to which a prediction may
be corrupted through having been calculated using a defective model is inherently unquantifiable
using that model.
Notwithstanding the problems that use of a simple model may incur, simple models are attractive for
many reasons. They are less expensive to build than complex models. They run much faster, and are
generally more numerically stable than complex models. If they are endowed with enough
parameters, they can often be calibrated to yield a good fit with a measurement dataset. It is tempting
to believe that if this is the case then their defects have been effectively “calibrated out”. As it turns
out, this may indeed be the case for some predictions; however if the same simple model is used to
make other predictions, the potential for error associated with those predictions may have actually
been amplified through the history-matching process. At the same time, because of the parsimonious
parameter set that normally accompanies simple model design, the calibration null space may not be
well represented in a simple model. The first term of equation 3.4.16 is therefore diminished;
predictive error variance is thereby underestimated.
The present chapter expands the theoretical developments of the previous chapter to accommodate
model defects. The subspace theme is retained because of the light that this sheds on the
simplification process. In fact, singular value decomposition can be considered a simplification process
itself, for in separating the null space from the solution space, and restricting solution of the inverse
problem to the latter space, it dispenses with parameter combinations that are non-essential to
attaining a good fit with the calibration dataset. Furthermore, simplification achieved in this manner
is “optimal” in the sense that it incurs no bias in estimated parameters and in predictions required of
the model. As previously stated however, elimination of the null space comes at a cost, for it
compromises the ability of a simple model to quantify the uncertainties of predictions that have any
null space dependency.
4.2 Mathematical Formulation
4.2.1 The Problem Defining Equation
To recognize the presence of defects in a model, equation 3.1.1, which describes the action of a model under calibration conditions, is replaced by the following equation
$h = Zk + Z_dk_d + \varepsilon$
(4.2.1)
Equation 4.2.1 introduces a new set of parameters to the model, these being encapsulated in the kd
vector. These can be thought of as “defect parameters”. They represent differences between a simple
model and the real world. These differences can arise from parameter simplification. However they
can also arise from approximations incurred by the nature of the model itself. Thus they can represent
the effects of gridding, meshing and other forms of spatial and temporal discretisation,
misrepresentation and simplification of boundary conditions, use of lumped water storage elements
in place of spatially distributed storages, incorrect and simplified forcing functions, etc. The matrix Zd
represents the processes that operate on these defect parameters.
4.2.2 Parameter Estimation
The user of a simplified model is unaware of the $Z_dk_d$ term of equation 4.2.1. When he/she calibrates the model, he/she assumes that equation 3.1.1 prevails. Calibration of the simple model, achieved through regularised inversion, calculates a calibrated parameter set $\bar{k}$ using equation 3.4.9. If equation 4.2.1 instead of equation 3.1.1 is then substituted for h in equation 3.4.9 we obtain
$\bar{k} = V_1S_1^{-1}U_1^tZk + V_1S_1^{-1}U_1^tZ_dk_d + V_1S_1^{-1}U_1^t\varepsilon$
(4.2.2)
from which parameter error is readily calculated as
$\bar{k} - k = -V_2V_2^tk + V_1S_1^{-1}U_1^tZ_dk_d + V_1S_1^{-1}U_1^t\varepsilon$
(4.2.3)
From the above equation it is apparent that unless all of the columns of U1 are orthogonal to those of
Zd, model defects incur calibrated parameter error. This does not mean that these defects are
necessarily exposed through the calibration process as irreducible model-to-measurement misfit,
though this may be the case for some simple models; see the discussion on model-to-measurement
misfit below. In many cases of model simplification however, it may mean that adjustable parameters
k adopt roles that compensate for model defect parameters kd to such an extent that these defects
are effectively concealed from the calibration process altogether. A good fit is therefore obtained with
the calibration dataset. The greater the extent to which they are concealed, the greater is the extent
to which adjustable model parameters k thereby “absorb” the misinformation that is encased in
model defects. Given that even simple models are generally not so simple that they seriously compromise model-to-measurement fit (the attainment of a good fit is normally an important design criterion for a simple model), the potential for parameters employed by a simple model to adopt surrogate roles in compensation for its defects is high.
A modeller may be able to mitigate the adverse effects of parameter surrogacy through appropriate re-formulation of the inverse problem. Suppose that observations used in the calibration process, together with their model-generated counterparts, are transformed prior to undertaking this process so that equation 4.2.1 becomes
$Th = TZk + TZ_dk_d + T\varepsilon$
(4.2.4)
The transformation matrix T should be chosen so that $TZ_d$ is as close to zero as possible; that is, its rows should be as orthogonal as possible to the columns of $Z_d$. Such transformation will necessarily require that the number of elements in Th is smaller than the number of elements in h. Thus data which holds information that the simple model is not capable of accommodating is eliminated, or "orthogonalized out" of the calibration process. Doherty and Welter (2010) and White et al (2014) show that such transformation will often involve spatial and temporal differencing. This is not unexpected, as a model is often better at calculating differences than absolutes. Thus the calibration process is formulated in such a way that it "plays to the model's strengths".
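One simple way to construct such a T, under the idealized assumption that $Z_d$ is actually known (in practice it is not, so any real transformation can only be approximately orthogonal to it), is to assemble its rows from an orthonormal basis for the left null space of $Z_d$, as in the sketch below. All quantities are invented for illustration.

```python
# Sketch (assumed setup): a transformation T whose rows are orthogonal to the columns of Zd.
import numpy as np

def defect_filtering_transform(Zd):
    """Rows of T form an orthonormal basis for the left null space of Zd, so T @ Zd = 0."""
    U, s, Vt = np.linalg.svd(Zd)
    rank = int(np.sum(s > 1e-10 * s.max()))
    return U[:, rank:].T                       # (n_obs - rank) x n_obs

rng = np.random.default_rng(4)
n_obs = 10
Zd = rng.standard_normal((n_obs, 2))           # two hypothetical "defect" directions
T = defect_filtering_transform(Zd)
print(T.shape)                                 # (8, 10): Th has fewer elements than h
print(np.allclose(T @ Zd, 0.0))                # the defect signal is filtered from the data
```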
4.2.3 Model-to-Measurement Fit
With model defects taken into account (that is, with h given by equation 4.2.1 rather than by equation 3.1.1), the residual vector of equation 3.4.21 becomes
$r = h - Z\bar{k} = Zk + Z_dk_d + \varepsilon - Z\bar{k}$
(4.2.5)
With a little matrix manipulation, similar to that undertaken previously, this can be expressed as
$r = U_2S_2V_2^tk + U_2U_2^t\varepsilon + U_2U_2^tZ_dk_d$
(4.2.6)
Equation 4.2.6 shows that model defects have a minimal effect on model-to-measurement misfit if
the columns of Zd lie within the range space of U1, and hence are orthogonal to the columns of U2. This
means that errors in model-calculated quantities that emerge from its defects can be “absorbed” by
adjustable parameters during the process of model calibration. As stated above, parameters adopt
surrogate roles in doing so. As will be shown below, this opens the possibility of predictive bias. At the
same time, the defects themselves may be invisible to the calibration process as their presence is not
expressed as model-to-measurement misfit. Nevertheless, their presence may be expressed by lack
of the credibility of values assigned to some parameters through the calibration process. It is important to note, however, that this manner of detecting parameter surrogacy provides a safeguard mainly for physically-based models, for which the link between model parameters and real-world hydraulic properties is explicit. It provides less of a safeguard for simple models which employ lumped storage elements and other abstract numerical devices, for which the links to directly observable quantities that are informed by expert knowledge of the real world are less direct.
Suppose that the elements of $k_d$ are statistically independent of those of k, and that (for the sake of simplicity) the covariance matrix of defect parameters $C(k_d)$ can be expressed as
$C(k_d) = \sigma_{k_d}^2I$
(4.2.7)
Then from (4.2.6) and (3.1.5) the covariance matrix of residuals C(r) is readily expressed as
$C(r) = \sigma_k^2U_2S_2^2U_2^t + \sigma_\varepsilon^2U_2U_2^t + \sigma_{k_d}^2U_2U_2^tZ_dZ_d^tU_2U_2^t$
(4.2.8)
The first term of equation 4.2.8 decreases, ultimately to zero, with increasing singular value truncation
point. The second term decreases more slowly, but does not disappear completely as measurement
noise finds expression in model-to-measurement misfit. The behaviour of the third term depends on
the nature of model defects expressed by kd and Zd. If some columns of Zd do not lie within the range
space of Z (and hence are non-orthogonal to columns of U2 which correspond to singular values of
zero) these defects will find expression in model-to-measurement misfit regardless of values assigned
to parameters k adjusted through the calibration process. In this case the inadequacies of the
simplified model will indeed be exposed through calibrating that model. This does not mean, however,
that all of its adjustable parameters, and some of its predictions, will be immune from calibration-induced bias, for some other columns of Zd may indeed lie within the range space of Z.
4.2.4 Predictive Error
Suppose that the calibrated, simple model is used to make a prediction. For the linear model which is
the subject of the present discussion, a prediction is made using equation 3.4.17. This, of course, is
the same equation as that employed when making a prediction with a non-defective model; after all,
when the owner of a simple model uses that model to make a prediction, he/she is oblivious to its
defects. The real world prediction, however, is calculated using the equation
$s = y^tk + y_d^tk_d$
(4.2.9)
In this equation the vector y encapsulates sensitivities of the prediction to parameters employed by
the simplified model. However the vector yd holds sensitivities of the prediction to defect parameters
which are an integral, though inaccessible, aspect of the simple model’s design. The error in the
prediction made by the simple model is therefore
$\bar{s} - s = y^t(\bar{k} - k) - y_d^tk_d$
(4.2.10)
Through substitution of equation 4.2.3 into equation 4.2.10, followed by a small amount of matrix manipulation, this becomes
$\bar{s} - s = -y^tV_2V_2^tk + y^tV_1S_1^{-1}U_1^t\varepsilon + (y^tV_1S_1^{-1}U_1^tZ_d - y_d^t)k_d$
(4.2.11)
The k and ε vectors that feature in equation (4.2.11) are unknown. However, as was done in the
previous chapter, a modeller can be expected to have some knowledge of the level of noise afflicting
his/her measurements, and can express this through the measurement noise covariance matrix C(ε).
Similarly, the covariance matrix C(k) can be used to express prior parameter uncertainty, this reflecting
expert knowledge as it pertains to these parameters. As stated above, however, the link between
expert knowledge and parameters employed by a simplified model may be more tenuous than that
between expert knowledge and parameters used by more complex, physically-based models.
Stochastic characterization of defect parameters kd is more difficult, as this approaches the realm of
“unknown unknowns”. Nevertheless, for the purpose of continuing the mathematical discussion, it is
assumed that the “extent of simple model wrongness” can be characterized by a covariance matrix
C(kd) that is described by equation 4.2.7. Then, if model defect parameters show no statistical
correlation with either the parameters k employed by the simple model, or with measurement noise
ε, propagation of variance as expressed by equation 3.1.5 can be applied to equation 4.2.11 to yield
the error variance of a prediction made by the simplified model.
$\sigma^2_{\bar{s}-s} = \sigma_k^2y^tV_2V_2^ty + \sigma_\varepsilon^2y^tV_1S_1^{-2}V_1^ty + \sigma_{k_d}^2(y^tV_1S_1^{-1}U_1^tZ_d - y_d^t)(y^tV_1S_1^{-1}U_1^tZ_d - y_d^t)^t$
(4.2.12)
The first two terms of equation 4.2.12 are identical to those of equation 3.4.20. These are the terms
that a modeller actually employs to calculate predictive error variance using his/her simplified model.
Alternatively, but similarly, he/she may use the linearized form of Bayes equation (equations 3.2.2) to
calculate the posterior uncertainty of a prediction using sensitivities embodied in the Z matrix. As has
already been discussed, both of these equations should yield similar results. Or the modeller may
undertake nonlinear predictive uncertainty analysis using, for example, Markov chain Monte Carlo. If
the simplified model is not too nonlinear the results will be similar. In using any of these
methodologies to characterize model predictive error/uncertainty, model defects are ignored. Hence
the real potential for model predictive error is probably miscalculated.
Predictive error variance cannot in fact be calculated unless both Zd and kd are known. This can only
be done for synthetic cases designed specifically to explore the effects of model defects on predictive
error; see Watson et al (2013), White et al (2014), White et al (2016), and the PREDVAR1C utility of
Doherty (2016). If Zd and C(kd) are, in fact, known, the graph depicted in figure 3.3 can be modified to
that depicted in figure 4.1. In this figure the black lines are the same as those shown in figure 3.3; they
depict the outcomes of predictive error/uncertainty analysis that the owner of the simple model
undertakes using his/her model. The red lines, however, are the normally unseen terms that express
the effects of model simplification on predictive error variance. The red dashed line depicts the third
term of equation 4.2.12, while the red full line depicts total predictive error variance.
[Figure 4.1 plots $\sigma^2_{\bar{s}-s}$ against the number of singular values, showing the null space term, the solution space term, the model defect term, total predictive error variance with no model defects, and total predictive error variance with model defects included.]
Figure 4.1. The three terms of equation 4.2.12, together with total predictive error variance. (From
Doherty, 2015.)
As was discussed in chapter 3 of this document, the first term of equation 4.2.12 falls with
increasing number of singular values used in the inversion process (and hence with goodness of
model-to-measurement fit achieved through that process). In contrast, the second term rises,
eventually very fast. The third term, however, cannot be expected to show monotonic behaviour. Nor
can it be expected to be zero when the number of singular values is zero.
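The sketch below (all quantities invented, since $Z_d$, $y_d$ and $\sigma_{k_d}^2$ are unknowable in practice) evaluates the three terms of equation 4.2.12 at every truncation level; in particular, the third term is non-zero at zero singular values, as noted above.

```python
# Sketch (synthetic quantities only): the three terms of equation 4.2.12, as in figure 4.1.
import numpy as np

def error_variance_with_defects(Z, y, Zd, yd, sigma2_k, sigma2_eps, sigma2_kd):
    U, s, Vt = np.linalg.svd(Z)
    V = Vt.T
    terms = []
    for n in range(len(s) + 1):
        U1, V1, V2, s1 = U[:, :n], V[:, :n], V[:, n:], s[:n]
        p1 = V1.T @ y
        null_t = sigma2_k * float(y @ V2 @ V2.T @ y)
        soln_t = sigma2_eps * float((p1 / s1) @ (p1 / s1))
        a = (p1 / s1) @ U1.T @ Zd - yd               # y^t V1 S1^-1 U1^t Zd - yd^t
        defect_t = sigma2_kd * float(a @ a)
        terms.append((null_t, soln_t, defect_t))
    return np.array(terms)

rng = np.random.default_rng(5)
Z = rng.standard_normal((12, 20))                    # invented simple-model matrix
Zd = rng.standard_normal((12, 3))                    # invented defect matrix
y = rng.standard_normal(20)
yd = rng.standard_normal(3)
terms = error_variance_with_defects(Z, y, Zd, yd, 1.0, 0.01, 0.5)
total = terms.sum(axis=1)
print("minimum total error variance at", int(np.argmin(total)), "singular values")
```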
An important feature of the third term of equation 4.2.12 is its prediction-specific nature. If a
prediction is such that
$y_d^t = y^tV_1S_1^{-1}U_1^tZ_d$
(4.2.13)
then this term will be zero. Doherty (2015) shows that this occurs where a prediction is entirely
sensitive to solution space parameters of the real-world “model” – the model of which the numerical
model actually used to make predictions is a simplification. In general, these are predictions that are
very similar in character to measurements comprising the calibration dataset. The model is thus being
asked to make similar predictions in the future to observations against which it was calibrated;
furthermore, the stresses to which the simulated system will be subjected in the future are expected
to be similar to those that it has experienced in the past. In these circumstances, the design of a model
can be very forgiving indeed, for its capacity to fit the past is all that is required for it to predict the
future. Furthermore, prior information pertaining to simple model parameters takes no part in
predictive uncertainty quantification, as the information that the model requires to make the
prediction is resident solely in the calibration dataset.
The situation is different, however, for predictions that are at least partially sensitive to null space
parameter components of the “real-world model” (of which the simplified model is a defective
emulator). As far as the owner of the simple model is concerned, these predictions may be sensitive
only to solution space components of his/her model. However, as will be shown below, adjustment of
these parameters during calibration of the simple model may entrain null space parameters of the
real world model. In doing so, predictions which are sensitive to thus-entrained, real world null space
parameters lose their minimum error variance status, and therefore accrue bias. The extent to which
this occurs is prediction-specific and, in general, unquantifiable.
Another important feature of simple model calibration that is demonstrated by equation 4.2.12, and
by figure 4.1 that schematizes application of this equation, is the shift to the left of the number of
singular values at which predictive error variance is minimized. Once again, the extent of this shift is
prediction-specific and cannot in general be known. In some cases the total predictive error variance
curve may not even have a minimum, but instead rise from zero singular values. The repercussions of
this for calibration of a simple model are profound. If the simple model is used to make certain types
of predictions - predictions that are similar to measurements comprising the calibration dataset so
that equation 4.2.13 applies - then the modeller is entitled to seek just as good a fit with the calibration
dataset as if the model were not defective at all. Alternatively, where the model is used to make other
types of predictions, a poorer fit with field measurements should be sought as the model is calibrated.
For some predictions, especially those which are largely uninformed by the calibration dataset as they
are sensitive mainly to null space parameter components of the background real world model, it may
be better to eschew history-matching altogether. The prediction is then made using the uncalibrated
model; in exploring the uncertainty of this prediction, prior uncertainty is used as a surrogate for
posterior uncertainty.
An extremely important point that this analysis demonstrates is that, for a simplified model, the link
between good parameters and good predictions is broken. Where equation 4.2.13 applies,
parameters may adopt surrogate roles during the calibration process to the point where they are
assigned values that are highly questionable from an expert knowledge point of view. However the
prediction is not compromised in any way. For other predictions, parameter surrogacy induced by
history-matching may engender a high degree of unquantifiable predictive bias. It is possible that this
can be mitigated to some extent through under-fitting. In other cases, reformulation of the inverse
problem through filtering out bias-inducing combinations of measurements from the calibration
dataset using a strategy that is schematically depicted by equation 4.2.4 may be effective in reducing
predictive bias. The important point, however, is that calibration of a simple model needs to be done
with the prediction that it is required to make in mind. If the model is required to make a number of
different predictions, then it may need to be calibrated, and then re-calibrated, accordingly.
At this point it is salient to remember that all models, not just so-called "simple models", are gross simplifications of reality. The above considerations apply to them as well. Sadly, the extent to which they apply is unquantifiable. This makes them no less real, and recognition of the phenomena discussed herein no less urgent.
4.3 A Pictorial Representation
The following explanation and pictures are slightly modified from Doherty (2015).
Figure 4.2 depicts a three-dimensional parameter space with orthogonal axes which coincide with the
three parameters that define this space. Two of the parameters specified by these axes are employed
by a simple model; these are adjustable through calibration of that model. The third parameter is a
defect parameter (i.e. a “kd” parameter of equation 4.2.1). Hence its (unknown) value is hardwired
into the defective model’s construction. Collectively, the three parameters span the entirety of
parameter space. Hence, conceptually at least, replication of past and future system behaviour is
possible using these three parameters even though parameter kd1, the sole element of the kd
parameter set of equation 4.2.1, cannot be varied.
Figure 4.2. Two parameters employed by a simple model and one defect parameter; collectively
they span the entirety of parameter space.
Let the matrix $Z_r$ denote the "reality" model of the system. (As has been stated above, the modeller does not have access to this model; he/she has access only to the Z model.) Thus
$h = Z_r\begin{bmatrix}k \\ k_d\end{bmatrix} + \varepsilon = \begin{bmatrix}Z & Z_d\end{bmatrix}\begin{bmatrix}k \\ k_d\end{bmatrix} + \varepsilon = Zk + Z_dk_d + \varepsilon$
(4.2.14)
Let the real world model matrix $Z_r$ be subjected to singular value decomposition, so that
$Z_r = U_rS_rV_r^t$
(4.2.15)
Suppose that the null space of Zr has one dimension and that its solution space therefore has 2
dimensions. The three vi vectors that comprise the columns of Vr are added to the three native model
parameter vectors in figure 4.3. v1 and v2 span the solution space of Zr while v3 spans its null space.
Figure 4.3. Model parameters together with the vi vectors (shown in black) which result from
singular value decomposition of the real world model matrix Zr.
Let the vector $k_r$ represent the three parameters of the real world model $Z_r$. Suppose that it were possible to actually build and then calibrate this model. Calibration of the $Z_r$ model would yield the vector $\bar{k}_r$ shown in figure 4.4. This is the projection of $k_r$ onto the solution space of the real world model matrix $Z_r$. This estimate of parameters is not, of course, correct. But it is of minimum error variance (and hence without bias) because the parameter set $\bar{k}_r$ allows model outputs to fit the calibration dataset to a level that is commensurate with measurement noise while possessing no null space components; the latter are, by definition, unsupported by the calibration dataset.
Figure 4.4. True model parameter set $k_r$ and the parameter set $\bar{k}_r$ (shown in red) that would be estimated through an "ideal" calibration process undertaken using the real world model $Z_r$.
Unfortunately, the real world model $Z_r$ cannot be calibrated because a modeller does not have access to it. He/she can only calibrate the simplified model Z. It is through estimation of the $k_1$ and $k_2$ parameters comprising the vector k of equation 4.2.14 that a good fit is sought with the calibration dataset. Obviously, there are only two of these. Meanwhile the third parameter $k_{d1}$ of the simple model, which expresses its defective nature with respect to the real world system, is fixed at a certain value, this value being implied in construction of the simplified, defective model.
Because the dimensionality of the solution space of $Z_r$ is two, the two parameters of the simplified Z model are enough to support a good fit between its outputs and the calibration dataset (provided that neither of its parameters lies entirely within the null space of $Z_r$). Let $\bar{k}$ denote the parameter set achieved through calibration of the Z model. The vector corresponding to this parameter set must lie in the $k_1$/$k_2$ plane. At the same time, its projection onto the solution space of the real world model (i.e. the space spanned by $v_1$ and $v_2$) must be $\bar{k}_r$; if this were not the case then $\bar{k}$ would not allow the model to fit the calibration dataset. This is depicted in figure 4.5.
Figure 4.5. Calibration of the Z model through adjustment of only $k_1$ and $k_2$, with $k_{d1}$ fixed, leads to the vector $\bar{k}$ (shown in blue) which projects onto the solution space of the real world model $Z_r$ as $\bar{k}_r$.
While the projection of $\bar{k}$ onto the solution space of the real world model is correct (as it must be for the simple model to fit the calibration dataset), its projection onto the null space of the real world model is not zero, as it would need to be for $\bar{k}$ to claim minimum error variance status. To the extent that any prediction required of the simple model is sensitive to null space components of the real world model, that prediction will therefore have gained an unquantifiable bias through calibration of the simple model.
For further discussion, see Doherty (2015).
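The pictorial argument of figures 4.2 to 4.5 can be reproduced with a small numerical experiment. In the sketch below (all numbers invented) the "real world" model has two observations and three parameters; the simple model can adjust only the first two, with the defect parameter frozen at a wrong value. The calibrated simple model fits the data perfectly and has the correct solution-space projection, but carries a spurious null-space component.

```python
# Toy numerical version (invented numbers) of the three-parameter example.
import numpy as np

rng = np.random.default_rng(6)
Zr = rng.standard_normal((2, 3))              # real-world model: 1-D null space
k_true = np.array([1.0, -0.5, 2.0])           # true parameters (third is the "defect")
h = Zr @ k_true                               # noise-free calibration dataset

# "Ideal" calibration of the real-world model: project k_true onto its solution space
U, s, Vt = np.linalg.svd(Zr)
V1, V2 = Vt.T[:, :2], Vt.T[:, 2:]
kr_bar = V1 @ V1.T @ k_true

# Calibration of the simple model: adjust k1 and k2 only, defect parameter fixed at zero
kd_fixed = 0.0
k12 = np.linalg.solve(Zr[:, :2], h - Zr[:, 2] * kd_fixed)
k_bar = np.array([k12[0], k12[1], kd_fixed])

print(np.allclose(Zr @ k_bar, h))              # the simple model fits perfectly
print(np.allclose(V1 @ V1.T @ k_bar, kr_bar))  # correct solution space projection
print(np.linalg.norm(V2 @ V2.T @ k_bar))       # but a non-zero (wrong) null space component
```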
4.4 Regularisation and Simplification
The above discussion suggests that model simplification can be viewed as a form of regularisation.
Recall from the discussion of chapter 3 that “regularisation” is the term used to describe the process
through which a unique solution is found to an inherently nonunique inverse problem. As was
discussed above, there is a metric for optimality of this unique solution. Obviously, it is desirable for the calibrated model to fit the calibration dataset well; in doing so, the information content of that dataset is extracted and transferred to the model's parameters. However, because the inverse problem is
nonunique, there are an infinite number of ways to fit the calibration dataset. The “best” way is that
which achieves a parameter set of minimized error variance. Any prediction which is made using the
calibrated model is therefore also of minimized error variance; that is, it is without bias.
It was shown in chapter 3 that singular value decomposition achieves this minimum error variance solution to the inverse problem. If necessary, it must be applied to an inverse problem which is modified through pre-calibration Karhunen-Loève transformation of its parameters. (Watson et al, 2013, demonstrate how parameter and predictive bias may be incurred if this is not done.)
Model simplification also achieves parameter decomposition. Ideally, a simplified model employs
enough (appropriately designed) parameters to allow a good fit to be achieved between model
outputs and members of the calibration dataset. However the decomposition implied by construction
of the simple model may not be ideal. That is, parameter decomposition implied by construction of
the simple model may depart from “optimal simplification” implied by singular value decomposition
of the (unattainable) “real world model”. Hence calibration of the simple model may lead to
entrainment of null space parameter components of the notional real world model. Some predictions
made by the simplified model may therefore incur bias through the act of calibrating that model. Other
predictions, particularly those that are similar in nature to measurements comprising the calibration
dataset, will not.
For the making of predictions that are similar in nature to those comprising the calibration dataset, a
simplified model is therefore “fit for purpose” as long as it can fit the calibration dataset well. It may
not be “fit for purpose” when asked to make other predictions, however. This does not mean that it
should not be asked to make these other predictions. What it does mean is that the simple model
may need to be re-calibrated before making them in order to render it fit for this new purpose. The
revised calibration procedure may adopt an inversion formulation that filters out components of the
calibration dataset that may induce bias in these new predictions; at the same time, a greater misfit
may be tolerated between model outputs and the calibration dataset.
4.5 Non-Linear Analysis
Despite their linear origins, the equations developed in the present chapter can be used as a basis for
nonlinear analysis through which the costs and benefits of model simplification can be explored. First,
we rewrite equation 4.2.11 as equation 4.5.1 after slightly re-arranging terms.
$s = \bar{s} + y^tV_2V_2^tk - y^tV_1S_1^{-1}U_1^t\varepsilon - (y^tV_1S_1^{-1}U_1^tZ_d - y_d^t)k_d$
(4.5.1)
Suppose that a modeller has built a complex model for a study site, as well as a simpler model for the
same site. Let us assume that defects associated with the complex model do not compromise its
predictions, and that they do not induce parameter surrogacy through its calibration. These are
standard (though not necessarily correct) assumptions that underpin most complex model usage. The
simple model employs a parameter set k. However implied in its construction is a defect parameter
set kd. Elements of kd are unknown and non-adjustable. For reasons discussed above, however, their
existence may compromise the use of the simple model.
Suppose now that the modeller populates the complex model with N different random realisations of the (appropriately complex) set of parameters employed by this model. (The complex model may not even employ explicit parameters; perhaps its parameterization is based on stochastic, geostatistically-based, hydraulic property fields.) For each such realisation, the complex model is run to produce a set of outcomes which correspond to the measurements h which comprise the calibration dataset. Realisations of measurement noise ε are added to these outputs. For each of these N parameter fields and measurement noise realisations, the simple model is then calibrated against the complex-model-generated calibration dataset to yield a parameter set $\bar{k}$ corresponding to the original complex model parameter field. Ideally the simple model runs fast enough to undergo rapid calibration, yet is complex enough to support a good fit with the calibration dataset. The outcome of this process is N random realisations of the complex model's parameter field, and N corresponding $\bar{k}$ parameter sets which allow the simple model to match a calibration dataset generated by the complex model.
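A linear, schematic version of this paired-model workflow is sketched below. The "complex" and "simple" models are stand-in random matrices, and least squares substitutes for whatever calibration procedure would actually be used; the intent is only to show the structure of the loop that generates the paired predictions plotted in figure 4.6.

```python
# Sketch (all models and numbers invented): paired simple/complex model Monte Carlo loop.
import numpy as np

rng = np.random.default_rng(7)
n_obs, n_complex, n_simple, N = 15, 40, 4, 200

Zc = rng.standard_normal((n_obs, n_complex))   # "complex" model under calibration conditions
yc = rng.standard_normal(n_complex)            # complex-model prediction sensitivities
X = rng.standard_normal((n_obs, n_simple))     # "simple" model under calibration conditions
w = rng.standard_normal(n_simple)              # simple-model prediction sensitivities
sigma_eps = 0.1

s_complex, s_simple = [], []
for _ in range(N):
    k = rng.standard_normal(n_complex)                         # random realisation of complex parameters
    h = Zc @ k + sigma_eps * rng.standard_normal(n_obs)        # synthetic calibration dataset
    p_bar, *_ = np.linalg.lstsq(X, h, rcond=None)              # "calibrate" the simple model
    s_complex.append(yc @ k)                                   # prediction made by the complex model
    s_simple.append(w @ p_bar)                                 # same prediction by the calibrated simple model

s_complex, s_simple = np.array(s_complex), np.array(s_simple)
# A scatterplot of s_complex against s_simple is the basis of figure 4.6.
```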
Suppose now that a prediction s of interest is made by the complex model using each of its N random parameter sets. The simple model is then used to make this same prediction using each of its respective $\bar{k}$ parameter sets. We designate the prediction made by the simple model as $\bar{s}$. If s (the prediction made by the complex model) is plotted against $\bar{s}$ (the prediction made by the simple model with the partnered parameter field $\bar{k}$), a graph such as that shown in figure 4.6 results.
Figure 4.6. A plot of predictions s made by a complex model against predictions $\bar{s}$ made by a simple model. On each occasion the simple model is calibrated against outputs generated by the complex model. (From Doherty, 2015.)
Doherty and Christensen (2011) analyse the properties of scatterplots such as those depicted
schematically in figure 4.6. As is illustrated in that figure, the prior uncertainty of the prediction is expressed by the vertical range of the plot. Where simplification is ideal, a line of best fit through the scatterplot
has a slope of unity and passes through the origin. However, to the extent that calibration of the
simple model induces bias in predictions made by that model because of the surrogate roles that
simple model parameters must play to compensate for its defects, the slope of the best-fit line through
the scatterplot falls below 1.0. This indicates that the range of predictions made by the calibrated
simple model can be greater than those made by the complex model (and, by inference, those which
are possible in reality). Alternatively, where the line of best-fit has a slope of greater than unity, this
illustrates an incapacity on the part of the simple model to simulate the range of conditions that may
prevail in the future. In terms of equation 4.2.1, this is an outcome of correlation of kd with k, an
expression of the fact that, as far as the prediction of interest is concerned, the simple model is not fit
for purpose.
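The slope diagnostic discussed above amounts to fitting a straight line through the paired predictions; a hypothetical example (invented paired predictions, in the spirit of the sketch above) follows.

```python
# Sketch: fitting the best-fit line of figure 4.6 and inspecting its slope.
import numpy as np

rng = np.random.default_rng(8)
s_simple = rng.standard_normal(200)                            # invented simple-model predictions
s_complex = 0.8 * s_simple + 0.1 * rng.standard_normal(200)    # invented complex-model predictions
slope, intercept = np.polyfit(s_simple, s_complex, 1)
# A slope near 1.0 (and intercept near zero) suggests near-ideal simplification; a slope
# below 1.0 suggests that parameter surrogacy is inflating the range of simple-model predictions.
print(slope, intercept)
```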
Suppose now that the simple model is calibrated one more time, this time against the real-world dataset instead of a synthetic dataset generated by the complex model of that study site. The simple model is then used to make the prediction $\bar{s}$. Doherty and Christensen (2011) show that the prediction
of minimum error variance that the complex model would have made if it had been calibrated can be
inferred from a graph such as that shown in figure 4.6. The post-calibration predictive error variance
associated with that prediction can also be ascertained; see figure 4.7. This error variance is the same
as that which would have been calculated using equation 3.4.20 (if the model were linear), but using
the real model.
[Figure 4.7 plots s against the prediction $\bar{s}$ made by the calibrated simple model, showing the minimum error variance prediction inferred for the complex model and its post-calibration predictive error variance.]
Figure 4.7. Estimation of the prediction, and its associated error variance, that would have been
made using a complex model when that prediction is actually made with a paired simple model.
5. Repercussions for Construction, Calibration and Deployment of
Simple Models
5.1 Introduction
The present chapter examines some of the repercussions of the theory presented in the previous
chapter for how a simple model should be built, calibrated and deployed in the decision-making
context. Recall from chapter 2 that the role of modelling in decision-support is to enable risk
assessment. Conceptually, this requires that the model be capable of testing hypotheses that certain
bad things will happen if certain courses of management action are taken. Rejection of the hypothesis
that a certain bad thing will happen may allow that course of action to proceed. Modelling fails as a
basis for decision-support where a hypothesis is rejected that should not be rejected; a type II
statistical error is therefore committed. Predictive uncertainty intervals calculated by a simple model
must therefore be conservative. In being so, they must account for any bias that is introduced to the
simple model through its construction and through its calibration. However predictive uncertainty
intervals must not be so conservative as to render the support that modelling provides to the decisionmaking process meaningless.
The role of modelling in decision-support, and the fact that this role must take into account the
defective nature of both complex and simple models, is addressed by Doherty and Simmons (2013)
and by Doherty and Vogwill (2015). The discussion provided in the present chapter draws on
conclusions presented in both of these documents. At the same time it expands the discussion to
include other facets of simple model usage in the decision-making context.
5.2 Benefits and Drawbacks of Complex Models
From the theoretical developments of the preceding chapter it can be concluded that simple models
should be used with caution. The same theory also indicates what to be cautious about. Before
discussing how caution should be exercised in simple model deployment, it is worthwhile considering
why there is an incentive to employ a simple model in place of a complex model in the first place. In
doing so, some of the points made in chapter 2 of this document are revisited.
Conceptually, complex models have the following advantages.
• The parameters and processes which they embody are represented in ways that can be informed by expert knowledge. Complex models are generally physically-based. Their parameters pertain to system properties that can be measured in the field or in a laboratory. The geometries of their constituent subdomains are directly inferable from real-world measurements, thus reducing the burden of calibration in providing values for many of the parameters employed by these models.
• Model complexity supports representation of system property (and therefore parameterization) detail. Not only the calibration solution space, but also the calibration null space, can therefore be adequately represented in the model. Representation of the former space allows the model to replicate historical system behaviour. Good model-to-measurement fits can therefore be attained through calibration of a complex model. Representation of the null space ensures that the uncertainties of decision-critical predictions can be properly explored. The greater the degree to which these predictions are sensitive to non-inferable parameterization detail, the more important this becomes.
• There are strong sociological pressures to build a complex model. Uninformed stakeholders demand that a numerical model be in accordance with concepts of "a model" drawn from other contexts. Where models compete for public or judicial approval, a model which looks more like "the real thing" is generally favoured over one that does not for, in the eyes of the mathematically illiterate, it has already demonstrated its superior abilities to emulate an environmental system. Looks are everything.
• Models that are built by one party are generally reviewed by another party. Those who are paid to build models, and who need the approval of reviewers in order to satisfy their clients, are disinclined to resort to abstraction and simplicity, as this approach to model-based decision-support is more likely to meet with peer disapproval than attempts to more "accurately" simulate the nuances of environmental processes. In short, reviewers are less likely to object to a complex model than to a simple model.
At the same time, complex models have many disadvantages. In many modelling contexts, these
heavily outweigh their advantages. They include the following.
• The run times of complex models can be inordinately long. While the physical basis of a complex model may provide a vehicle for expression of expert knowledge, it is rarely given proper voice in everyday modelling practice. As is discussed in chapter 3 of this document, expert knowledge is a stochastic quantity. The greater the level of detail that is represented in a complex model, the greater is the uncertainty associated with this detail. Stochasticity can be given expression by running a model many times using different random realisations of its parameters. This becomes impractical where a model takes a long time to run. It also requires the development of stochastic descriptors of parameter fields of high dimension. The skillsets required to do this are rare among environmental modelling practitioners.
• Complex models are often numerically unstable. This increases run times. It also makes computation of finite-difference derivatives of model outputs with respect to adjustable parameters almost impossible. This makes calibration of complex models difficult, if not impossible. It also obstructs use of packages such as PEST which can obtain minimum error variance solutions to highly parameterized inverse problems. Unfortunately, in many practical complex modelling exercises the burden of calibration is eased by draping a simplistic parameterization scheme over the complex model domain. Such a strategy often compromises the model-to-measurement fit achieved through the calibration process at the same time as it fails to achieve a minimum error variance parameter field.
• Notwithstanding the availability of numerical methodologies such as null space Monte Carlo (Tonkin and Doherty, 2009; Doherty, 2016), calibration-constrained uncertainty analysis in highly parameterized contexts can be a time-consuming undertaking. Where model run times are large, and where model numerical stability is questionable, it is simply unachievable. Where it cannot be used to assess posterior predictive uncertainty, or is compromised in its attempts to do so, a complex model can contribute little to the decision-making process, unless it is used to assess prior uncertainty as a surrogate for posterior uncertainty. The latter course of action may prove useful for predictions whose uncertainties are reduced only mildly through the history-matching process.
• "Complexity" is a relative thing. Complex models are more complex than simple models. However they are far less complex than reality. Their ability to quantify the uncertainty of a particular prediction may therefore be compromised by failure of the model to include in its parameterisation all null space parameter components to which the prediction may be sensitive. Some predictions made by a complex model may incur bias through calibration-induced parameter surrogacy, especially if the complex model is endowed with a simplistic parameterization scheme. The performance of a complex model in this regard may be little better than that of a cleverly-designed simple model.
• Modellers often lose touch with the decisions that modelling must support when meeting the daily challenges of building a complex model. Once they have embarked on this time-consuming and expensive exercise, their overwhelming concern is to produce something "that works". "Works" is often defined in terms of maintaining solution convergence while finding a parameter field that provides a fit with the calibration dataset that is good enough to be accepted by reviewers or clients. The need to ensure that decision-critical predictions are unbiased, and that their uncertainties are explored, is forgotten.
The case for providing decision-support using simple models should therefore be strong. However in
building a simple model, the theoretical insights provided in the preceding chapter should be heeded.
Some of the means through which they can be given practical voice in construction, calibration and
deployment of a simple model are now discussed.
5.3 Notation
While the present chapter draws on theory presented in previous chapters, a slight change in notation
is now introduced in order to make it easier to distinguish between a complex model and a simple
model. In the following discussion the former is presumed to be a numerical substitute for reality;
hence its defects are not considered.
For a complex model the y, Z and k notation is preserved. Hence under calibration conditions
$h = Zk + \varepsilon$
(5.3.1)
When a complex model makes a prediction s, this is made using the equation
$s = y^tk$
(5.3.2)
It is presumed that the simple model counterpart to a complex model employs a smaller number of
parameters than the complex model. Its parameters will be specified by the vector p. Because the
model is simple, it has defects; these are specified as pd. The action of a simple model on its
parameters is specified by the matrix X; defect parameters are subject to the action of the matrix Xd.
Thus
$h = Xp + X_dp_d + \varepsilon$
(5.3.3)
Equation 5.3.3 can be re-written as
$h = Zk + (Xp - Zk + X_dp_d) + \varepsilon = Zk + \eta + \varepsilon$
(5.3.4)
In equation 5.3.4 η is the “structural noise” induced by model simplification. Ideally a simple model
should be complex enough for this to be small. In other words, the simple model should be complex
enough to fit the calibration dataset well – or to fit the “necessary parts” of the calibration dataset
well; see below.
Let the sensitivities of the prediction s to p and pd parameters be contained in the vectors w and wd.
Thus
$s = w^tp + w_d^tp_d$
(5.3.5)
5.4 Prediction-Specific Modelling
Complex models are often asked to make a variety of predictions of the future behaviour of an
environmental system under a variety of future stresses, some of which may be different from any
stresses to which the system has been subjected in the past. Given the fact that a complex model is
simple compared to reality, the expectations that are thus placed on it are questionable. Nevertheless,
34
this often defines the context in which it is built and deployed, the justification being that, as a
physically-based simulator of an environmental system, it can be asked to predict any aspect of system
behaviour under any conditions.
The notion that a numerical model should be considered more as a provider of receptacles for
information than as a simulator of complex environmental behaviour was discussed in chapters 1 and
2 of this document. While a complex model is rarely viewed from this standpoint (wrongly in the
authors’ opinion), it is important that a simple model be viewed in this way. Furthermore, the design
of a simple model should focus on a specific prediction of management interest so that its
performance in making this prediction can be optimised. This occurs when
• it can make the prediction with as little bias as possible;
• it can quantify the uncertainty of the prediction;
• it can be guaranteed not to underestimate the uncertainty of the prediction (thereby avoiding a type II statistical error); while
• reducing the uncertainty of the prediction as much as may be required to test, and maybe reject, the likelihood of occurrence of an unwanted event.
The last specification is met if the simple model provides receptacles for the information contained
within either or both of expert knowledge and measurements of system behaviour through which
inconsistency of the occurrence of the unwanted event with this information can be demonstrated.
A simple model may not be capable of providing receptacles for all expert knowledge and
measurement information that is available at a particular study site. Nor should it. Simplicity (and with
it numerical tractability) may be gained if it only provides receptacles for the information which is
pertinent to the prediction that it is designed to support.
5.5 Playing to a Model’s Strengths
The ability of a simple model to make predictions of future environmental behaviour with integrity is
compromised by the presence of defect parameters pd. (The same applies, of course, to a complex
model, for it too is defective.) Some predictions will be more sensitive than others to pd parameters.
In general, the greater the extent to which a particular model prediction pertains to nuances of
environmental behaviour that occur at specific locations within a model domain, and the greater the
extent to which the prediction depends on subtleties and details of environmental processes that are
difficult to represent in a numerical simulator, the more likely it is that the prediction is sensitive to pd
parameters. Unfortunately, many predictions that are required of a model are of this type. These
include the response of an environmental system to extreme climatic events (for example to
prolonged droughts), and calculation of indicators of aquatic biotic health such as the number of days
below which stream flow will be less than a certain threshold, and/or the number of days over which
nutrient concentrations will be above a certain threshold.
These same types of prediction are often subject to a high degree of uncertainty. Conceptually, if a
model is complex enough, and if it includes all processes and parameters to which a prediction of this
type is sensitive (whereby pd parameters of equation 5.3.5 are included in k parameters of equation
5.3.2), the uncertainty of this prediction can be quantified. Perhaps it can also be reduced through
conditioning by the calibration dataset. In practice, increased complexity may fail to achieve either of
these goals because
• even complex models employ simplistic representations of processes which are salient to predictions of these types;
• the long run times of complex models may preclude stochastic analysis of uncertainty;
• the long run times and questionable numerical stability of complex models may preclude reduction of predictive uncertainty through history-matching, and/or quantification of post-history-matching predictive uncertainty.
A simple model will almost certainly run faster than a complex model. However its representation of prediction-salient processes may be even more compromised, thus exacerbating the problems that it encounters in making certain types of predictions.
A question that therefore arises is whether environmental policy should be based on quantities that
(especially simple models) can calculate with integrity rather than on those which are difficult to
calculate, and whose uncertainty cannot be quantified. The assistance that models provide to
environmental management may be illusory, or even negative, when that assistance is based on the
false premise that key model predictions are made without bias, and/or that the magnitude of possible
predictive bias can be included in quantified uncertainty intervals. In contrast, to the extent that
environmental decision-making is based on quantities that a model is demonstrably capable of
predicting with integrity, modelling can provide better support to the decision-making process. The
chances that it may diminish, rather than enhance, the decision-making process will thereby be
reduced.
Expectations by the non-modelling community of what models can deliver are generally too high.
These expectations are rarely disavowed by modellers themselves, as few understand the
ramifications of model defects on model performance, and even fewer wish to suggest to a potential
client that a large investment in modelling may be wasted. However, if decision-making was focussed
more on quantities which are somewhat immune from model defects, the role that modelling can play
in that process would be greatly enhanced. At the same time, it may be able to do so at lower cost. In
general, these quantities tend to be differences rather than absolutes. In any particular management
context they may include:
• the extent to which management option A improves stream quality over management option B;
• the extent to which durations of low flow will be increased or decreased above/below their present levels following specific changes in land management;
• the difference in drawdown at a particular observation well following an alteration to pumping at a particular production well;
• the extent to which an historical measurement of water quality would have been improved if a different land management protocol was in place at the time of the measurement.
Uncertainty analysis undertaken with a complex model would probably demonstrate that all of these
predictions are accompanied by less uncertainty than predictions of actual low-flow durations, water
levels and nutrient concentrations. This implies that predictive differences of these types are less
sensitive to null space parameter components and to model defects than are predictive absolutes.
The need to represent these null space components in a decision-support model, and/or to ensure
that a decision-support model has no defects which bias these predictions, is thus reduced. This grants
the modelling process a license for simplicity.
If support for environmental management can be demonstrated to be more robust when provided by
a simple model than by a complex model, more such models can be built at more locations than would
otherwise be the case. This would further enhance the ability of modelling to support environmental
management.
5.6 Dispensing with the Need for Model Calibration
Where the information content of a calibration dataset h is limited with respect to a prediction of
management interest (either because it is small, noisy, or simply uninformative of the prediction), this
implies that the prediction required of the model is sensitive to null space parameter components of
the real world model k. In these circumstances a modeller may judge that the benefit of model calibration in terms of reduced uncertainty is not worth the potential for predictive bias that calibration of a simple model may incur. He/she may thus decide to eschew calibration of the simple model, and make
the prediction using parameters p that are informed by expert knowledge alone. Predictive
uncertainty can then be explored by making the prediction many times with different stochastic
realizations of p. The prior predictive probability distribution is thus used as a surrogate for the
posterior predictive probability distribution.
Provided that:
• the prediction of interest is not sensitive to model defects (or can be formulated to be such), and
• realistic prior probabilities can be assigned to the elements of p, notwithstanding their abstract nature,
such a procedure, if properly undertaken, should provide conservative predictive intervals, and
therefore avoid a type II statistical error (and thus failure of the modelling process according to the
metrics set out in chapter 2). Conservatism arises from the fact that the posterior predictive
probability distribution should, in theory, be no wider than the prior predictive probability distribution
because of the constraining effect of the likelihood function on the former. However if data paucity,
poor data quality, or poor data relevance suggests that its constraining effect will be small, then the
degree of conservatism that is accepted through adoption of the prior predictive uncertainty interval
as a surrogate for the posterior predictive uncertainty interval may not be too great.
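As an illustration, the sketch below uses prior Monte Carlo in the manner just described. The function run_simple_model(), the prior mean and the prior covariance are hypothetical placeholders for the simple model and the expert-knowledge prior of p; it is a minimal sketch only, not a prescription.

    import numpy as np

    # Hypothetical stand-in for the simple model; it accepts a parameter
    # vector p and returns the prediction of management interest.
    def run_simple_model(p):
        return float(p[0] - 0.5 * p[1])  # placeholder relationship

    rng = np.random.default_rng(seed=0)

    # Prior mean and covariance of p, supplied by expert knowledge (illustrative values).
    p_mean = np.array([1.0, 0.3])
    p_cov = np.array([[0.25, 0.05],
                      [0.05, 0.10]])

    # Sample the prior and make the prediction for each realization.
    realizations = rng.multivariate_normal(p_mean, p_cov, size=5000)
    predictions = np.array([run_simple_model(p) for p in realizations])

    # Use the prior predictive distribution as a surrogate for the posterior:
    # report, for example, a conservative 95% predictive interval.
    lo, hi = np.percentile(predictions, [2.5, 97.5])
    print(f"prior predictive 95% interval: [{lo:.2f}, {hi:.2f}]")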
While adoption of such a strategy can be justified on the basis of theory provided in previous chapters,
it may meet with opposition from modelling stakeholders and reviewers who judge that a model can
have no predictive integrity unless it is “calibrated”. While this viewpoint is not supported by
mathematics presented in this document, there is nevertheless some merit in ensuring that model-calculated quantities are representative of the behaviour of the environmental system whose
management it is designed to support. Hence there is validity in comparing outputs of the simple
model with the historical behaviour of that system. However, there is no need to actually “calibrate”
the simple model. Instead, its outputs can be compared with the calibration dataset in a probabilistic
sense in order to ensure that these outputs, when generated with a variety of stochastic realizations
of p, encompass the calibration dataset even if no particular p provides a fit with this dataset that can
be considered to “calibrate” the model. (As is described shortly, it may be useful to process the
calibration dataset before fitting model outputs to it. If this is done, processed measurements should
be compared with corresponding processed model outputs.) The situation is schematized in figure 5.1.
Figure 5.1. A schematic example of “stochastic fitting” of model outputs to measurements of system state (measured quantity plotted against time).
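A probabilistic comparison of the kind schematized in figure 5.1 can be as simple as checking whether an envelope of ensemble outputs encompasses the (possibly processed) measurements. In the sketch below the ensemble and the observations are randomly generated placeholders.

    import numpy as np

    # Hypothetical ensemble of simulated time series, one row per realization
    # of p, with columns corresponding to observation times (n_real x n_obs).
    ensemble = np.random.default_rng(1).normal(10.0, 2.0, size=(500, 24))

    # Observed (or pre-processed) measurements at the same times.
    observed = np.random.default_rng(2).normal(10.0, 1.0, size=24)

    # Envelope spanned by the ensemble at each observation time.
    lower = np.percentile(ensemble, 2.5, axis=0)
    upper = np.percentile(ensemble, 97.5, axis=0)

    # Fraction of measurements that fall inside the stochastic envelope.
    coverage = np.mean((observed >= lower) & (observed <= upper))
    print(f"fraction of observations encompassed by the ensemble: {coverage:.2f}")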
The decision to use prior predictive uncertainty as a surrogate for posterior predictive uncertainty may
be a difficult one to make for, in general, a modeller cannot know the extent to which the calibration
process will induce bias in predictions in which he/she is most interested. After all, he/she has no
access to the pd parameters and the Xd and wd matrix/vector of equations 5.3.3 to 5.3.5; the modeller
has access to only the X matrix and p vector featured in these equations. Nevertheless, these items
may provide enough information to justify the decision to forego model calibration. If linear analysis,
undertaken using tools of the PEST or PyEMU suites, demonstrates that the uncertainty of a particular
prediction is unlikely to be reduced much through history-matching, even if such an analysis ignores
Xd, pd and wd, then the decision is probably being made on good grounds.
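For example, such a linear analysis might be scripted along the following lines using PyEMU. The control file, Jacobian file and forecast name are hypothetical, and the interface shown should be checked against current PyEMU documentation; this is a sketch of the workflow, not a definitive recipe.

    import pyemu

    # Placeholder file names: a PEST control file and a Jacobian matrix file
    # produced by a previous run of the simple model.
    sc = pyemu.Schur(jco="simple_model.jcb", pst="simple_model.pst",
                     forecasts=["low_flow_duration"])

    # Prior and posterior (i.e. post-history-matching) forecast variances.
    summary = sc.get_forecast_summary()
    print(summary)

    # If the reduction in variance is small, history-matching is unlikely to
    # narrow the prediction much, and the prior predictive distribution may
    # serve as a surrogate for the posterior.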
Where stochastic analysis is untenable because of time constraints, or because it is difficult to know
the prior probability distribution of p parameters, then worst case scenario analysis may provide a
sound basis for testing, and attempting rejection, of hypotheses pertaining to the occurrence of
unwanted events. This, of course, is simple to implement in a context where calibration constraints
are not applied. A modeller simply maximizes or minimizes wtp of equation 5.3.5 by choosing values
for the elements of p appropriately. Here it is assumed that the elements of wd are small because
model simplification is tuned to the prediction of interest; the pd vector does not therefore
compromise the model’s ability to make that prediction.
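For a linear prediction with independently bounded parameters, this maximization or minimization requires nothing more than setting each element of p to the bound favoured by the sign of the corresponding predictive sensitivity. The sketch below uses hypothetical sensitivities and bounds purely for illustration.

    import numpy as np

    # Hypothetical predictive sensitivities and expert-knowledge bounds on p.
    w = np.array([0.8, -1.2, 0.3])
    p_lower = np.array([0.1, 0.0, 1.0])
    p_upper = np.array([2.0, 0.5, 4.0])

    # For a linear prediction s = w.T @ p with independent bounds, the extremes
    # are attained by choosing, for each element of p, the bound favoured by
    # the sign of the corresponding element of w.
    p_max = np.where(w >= 0, p_upper, p_lower)
    p_min = np.where(w >= 0, p_lower, p_upper)

    print("worst-case (maximum) prediction:", w @ p_max)
    print("best-case (minimum) prediction:", w @ p_min)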
5.7 Tuning the Calibration Dataset to Match the Prediction
As stated above, prediction-specificity is the key to appropriate model simplification. It follows that in
a single study area, a number of different models may be constructed, each optimised to make a
specific prediction and to examine the uncertainty in that prediction. The question of how a simple
model may be prediction-optimised through the way it is calibrated is now addressed. See also the
discussion above pertaining to the nature of predictions that are best sought from a simplified model.
Let us start by considering the making of a prediction by a complex model. From equation 5.3.2
s = y^t k = y^t k̄ − y^t (k̄ − k)        (5.7.1)
where k̄ is the minimum error variance solution to the inverse problem of model calibration. From
(3.4.9) and (3.4.14), this becomes
s = y^t k = y^t V_1 S_1^{-1} U_1^t h + y^t V_2 V_2^t k − y^t V_1 S_1^{-1} U_1^t ε        (5.7.2)
Ignoring measurement noise (for simplicity), this can be written as:
s = Σ_{i=1}^{n} s_i^{-1} (y^t v_{1i}) (u_{1i}^t h) + Σ_{i=n+1}^{m} (y^t v_{2i}) (v_{2i}^t k)        (5.7.3)
where n is the number of pre-truncation singular values, m is the total number of parameters
comprising the vector k, v1i are unit vectors comprising the columns of V1, and v2i are unit vectors
comprising the columns of V2. The first summation provides the prediction of minimum error variance
while the second summation must be done probabilistically for exploration of uncertainty using, for
example, different realizations of k based on expert knowledge. This can be accomplished using a
methodology such as null space Monte Carlo; see Doherty (2016). Alternatively, it can be
accomplished using methods discussed in chapter 6 which better accommodate the design of simple
models.
For the moment, let us focus on the first summation, which yields the prediction of minimized post-calibration error variance, so that
s = Σ_{i=1}^{n} s_i^{-1} (y^t v_{1i}) (u_{1i}^t h)        (5.7.4)
Equation 5.7.4 can be re-written as
s = Σ_{i=1}^{n} s_i^{-1} α_i (u_{1i}^t h)        (5.7.5)
where
α_i = y^t v_{1i}        (5.7.6)
Each u_{1i}^t h in equation 5.7.5 is a single number. It is the value of the observation dataset projected onto a single unit vector spanning the range space of the model. It can be considered to be the value of a combination of observations that is informative of a combination of parameters v_{1i} used by the model; see Doherty (2015). The extent to which this combination of observations contains information that is relevant to the prediction is determined by the respective α_i. Where the v_{1i} which defines the combination of parameters that is uniquely and entirely informed by u_{1i}^t h is nearly orthogonal to the predictive sensitivity vector y, the respective combination of observations lacks information that is pertinent to the prediction. It is also apparent that, regardless of its size, any calibration dataset possesses a finite number of useable pieces of information (this being equal to the number of pre-truncation singular values). Many of these may be of little pertinence to the prediction required of the model.
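The quantities discussed above are easily computed once a (weighted) sensitivity matrix and a predictive sensitivity vector are available. The following sketch, using randomly generated stand-ins for X, y and h, computes α_i = y^t v_{1i}, the projected data u_{1i}^t h, and the contribution of each to the calibrated prediction of equation 5.7.5; it is illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical (weighted) sensitivity matrix X of calibration outputs to
    # parameters, predictive sensitivity vector y, and observation vector h.
    X = rng.normal(size=(30, 10))
    y = rng.normal(size=10)
    h = rng.normal(size=30)

    # Singular value decomposition of X; columns of V span parameter space.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    n_trunc = 6                       # number of pre-truncation singular values

    alpha = Vt[:n_trunc, :] @ y       # alpha_i = y^t v_1i  (equation 5.7.6)
    proj_data = U[:, :n_trunc].T @ h  # u_1i^t h: projected data "pieces"

    # Contribution of each data combination to the calibrated prediction
    # (the summation of equation 5.7.5, noise ignored).
    contributions = alpha * proj_data / s[:n_trunc]
    for i, c in enumerate(contributions):
        print(f"singular value {i + 1}: alpha = {alpha[i]: .3f}, "
              f"contribution to prediction = {c: .3f}")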
These concepts can free the designer of a simple model from the imperative of designing a model that
is able to replicate all aspects of the historical behaviour of a system whose management it is intended
to support. There are some aspects of this behaviour that the simple model must reproduce in order
to constrain the prediction that it is required to make. At the same time, there are other aspects of
this behaviour that it does not need to reproduce. In other words, the simple model does not need to
provide receptacles for the information content of those aspects of historical system behaviour that
have no bearing on the prediction that is the focus of its design. As a result, the model does not need
to be as complicated as it would need to be if it were being asked to make more than the single
prediction. (The analysis can easily be extended to multiple predictions, with similar conclusions.)
Despite the fact that most models are nonlinear, the above considerations are nevertheless salient to
the design of any prediction-specific, simple model. The simple model must provide receptacles for
information that is salient to the prediction that it supports; it does not need to provide receptacles
for other information. This can reduce its complexity and increase its speed of execution. However, in
calibrating the model, measurements comprising the calibration dataset may require processing (as
do corresponding numbers calculated by the model) so that aspects of this dataset which the model
cannot fit (because it provides no receptacles for the information which it contains) do not erode the
capacity of the model to fit those aspects of the calibration dataset which it can fit (or provide the
visual impression that the model is poorly calibrated).
This strategy of simple model deployment is actually quite common. For example, if predictions
required for decision-support pertain to permanent alterations to a groundwater system, these can
be made with a steady state model. Despite the fact that the historical behaviour of the system may
show seasonal variations, steady state conditions may be assumed for its calibration. While transient
data is richer in information than averaged, steady state data, much of this information is not salient
to predictions required of the model. Furthermore, a transient calibration process would require a
much more complicated model for which the information content of the calibration dataset would
have to inform more parameters (storage and recharge parameters). Hence the capacity of the model
to make the steady state prediction may be no better for having undergone transient calibration.
Instead, a steady state calibration in which model outputs are compared with appropriately processed
historical observations can provide the model with predictive abilities of the desired type which are in no way diminished from those of a vastly more complicated model that must undergo a vastly more complex calibration process.
Another means through which the above strategy can be implemented is through formulation of a
multi-component objective function (i.e. sum of weighted squared residuals) in which the same data,
processed in different ways, comprises the different components of the objective function. Each
component of the objective function is weighted to ensure roughly equal visibility at the
commencement of the inversion process. The benefits of data transformation prior to fitting,
particularly where transformation involves spatial and temporal differencing, were discussed in chapter 4. As is stated therein, such processing can “orthogonalize out” information that would
otherwise be directed to model receptacles that would bias some predictions as the model is
calibrated. In general, a modeller is not aware of just how non-optimal his/her model is, nor of the
amount of bias that the calibration process will induce in any particular prediction. Nevertheless, this
should not stop him/her from taking some precautions against the unwanted side-effects of
simplification. By admitting both the original and transformed (in multiple ways) data into the
calibration process, some defence against these side-effects is provided. Further defences can then
be put into place if initial calibration of the model provides continued evidence of non-optimality of
simplification through estimation of parameter values that are highly dubious from an expert
knowledge point of view. See Doherty and Welter (2010) and Doherty (2015) for further discussion of
this approach.
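A minimal sketch of forming such a multi-component objective function from the same data processed in two different ways (raw values and temporal differences) is given below, with component weights chosen for roughly equal visibility at the start of the inversion. The data and the weighting choice are illustrative only.

    import numpy as np

    rng = np.random.default_rng(3)

    # Hypothetical time series of measured and simulated quantities.
    measured = rng.normal(10.0, 2.0, size=50)
    simulated = measured + rng.normal(0.0, 0.5, size=50)

    def components(obs, sim):
        # Two renditions of the same data: raw values and temporal differences.
        return {
            "raw": obs - sim,
            "differenced": np.diff(obs) - np.diff(sim),
        }

    residuals = components(measured, simulated)

    # Weight each objective function component so that it starts the inversion
    # with roughly equal visibility (equal initial contribution).
    raw_phi = np.sum(residuals["raw"] ** 2)
    weights = {name: np.sqrt(raw_phi / np.sum(r ** 2))
               for name, r in residuals.items()}

    total_phi = sum(np.sum((weights[name] * r) ** 2)
                    for name, r in residuals.items())
    print("component weights:", weights)
    print("total objective function:", total_phi)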
5.8 Model Optimality and Goodness of Fit
Optimality of simplification was discussed in chapter 4. There it was pointed out that model
simplification can be considered as a form of parameter decomposition. Ideally, the model
simplification process should perform a role similar to that of singular value decomposition. It should
separate parameter space into two orthogonal subspaces such that calibration of the simple model
does not entrain real world null space parameter combinations that can bias certain predictions.
Though not often stated in these terms in the modelling literature, many instances of model
simplification tend to follow this precept. Hence, for example, an “upper soil moisture store” and a
“lower soil moisture store” may form important components of a rainfall-runoff model or a land use
model whose domain spans a watershed of large area. Implied in definition of these storages is the
notion that the information content of the calibration dataset is sufficient to parameterize each of
these separately. If a more complex model were used instead, this same information would flow to
similar receptacles that are not defined in these simplistic terms, but are defined using many more
spatial parameters that have greater hydrogeological meaning. However few of the greater number
of parameters associated with the latter conceptualization of storage mechanisms that prevail in the
watershed would be individually identifiable on the basis of the calibration dataset. Nevertheless the
post-calibration correlation that these parameters exhibit because of this lack of information would
be internal to each storage, or cross between these storages to only a limited degree.
It was also stated in chapter 4 that there is no need for model simplification to be optimal where
predictions required of a model are entirely dependent on solution space components of the “real
world model”. Of course, a modeller cannot determine whether this is actually the case because
he/she has no access to the real world model. However, as has been stated, where a prediction is
similar in nature to measurements comprising the calibration dataset this is likely to be the case. This
often applies to rainfall-runoff models, except where the latter are asked to make predictions
pertaining to extremes of rainfall or drought that are not represented in the calibration dataset; these
predictions are likely to exhibit sensitivity to real-world null space parameter combinations.
In some modelling contexts, therefore, optimality of model simplification is not an issue. In these
contexts a modeller should seek as good a fit with the calibration dataset as he/she can (with due
consideration taken of measurement noise associated with the calibration dataset). Information
contained within this calibration dataset is transferred directly to the prediction, with no possibility of
predictive bias. However in modelling contexts where there is a possibility of calibration-induced
predictive bias, this strategy may not be optimal. In that case a modeller may purposefully seek a misfit
with the calibration dataset that is greater than that which would be judged as optimal on the basis
of measurement noise in order to avoid bias when using the model to make certain predictions; see
figure 4.1. The result will be a higher calculated uncertainty (calculated using the simple model) for those predictions than would have been the case if a better fit with the calibration dataset had been attained. However, predictive error (which cannot be calculated using the simple model) may be less.
A problem with this approach, however, is that a modeller is often unaware of how great a misfit
he/she should seek with the calibration dataset. Sadly, this problem is unavoidable.
It is important to note that this approach to calibration raises the spectre that a model should be
“calibrated tightly” for the purpose of making some predictions and “calibrated loosely” for the
purpose of making other predictions. This concept is at odds with common modelling practice.
Nevertheless it is supported by theory outlined in this document.
5.9 Detecting Non-Optimality of Simplification
Despite the fact that optimality of simplification is subject to theoretical characterization, it is difficult
to verify in real world modelling practice. Furthermore, for reasons discussed above, it may not
matter. Theory presented in chapter 4 shows that when model defects are taken into account, the link
between good parameters and good predictions is broken. A non-optimal model may therefore
provide unbiased predictions of low posterior uncertainty simply because it can fit the calibration
dataset well. At the same time, predictions that are highly sensitive to entrained real-world null space
parameter components may be subject to considerable bias. Many predictions will fall between these
two extremes. It is thus apparent that non-optimality of simple model design can only be defined for
a specific prediction. Hence the same model may be optimal for the making of one prediction but non-optimal for the making of others.
The PREDVAR1C utility available through the PEST suite of software can be used to assess optimality
of simplification as it pertains to specific predictions. This utility employs the theory that is developed
in chapter 4 of this document. Utilities within the PyEMU suite provide similar functionality; the latter
suite was used in studying optimality of simplification by White et al (2015).
While PREDVAR1C and PyEMU are unique in providing these capabilities (the authors know of no other
software that provides the same functionality), there are disadvantages associated with their use. In
particular:
• model linearity is assumed, and
• use of these packages requires definition of defect parameters (i.e. the pd parameters of equation 5.3.3), and that sensitivities of calibration-related and prediction-related model outputs with respect to these parameters be calculated.
The latter of the above two points restricts use of these packages to a small range of modelling
instances where model defects can be expressed as continuously variable parameters. It does not
allow exploration of the effects of categorical simplifications such as coarse grid spacing, missing
model layers, or the choice between a physically-based model and a lumped parameter counterpart
to this model (although sometimes it is possible to define continuous surrogate parameters to explore
these effects).
A nonlinear alternative to the use of PREDVAR1C and PyEMU is implementation of the methodology
presented by Doherty and Christensen (2011) and discussed in section 4.5 of this document. However
this mode of model defect analysis is numerically intensive. It requires the construction of a complex
model and a complementary simple model. The latter is then calibrated against the former for many
parameter realizations of the former. Both are then employed to make one or a variety of predictions
for which the effects of model defects on predictive bias are explored. This methodology is more
flexible than that provided by linear analysis, and readily accommodates categorical features of model
construction like grid cell size, number of layers and modelling approach.
It is the authors’ opinion that further research into the costs and benefits of model simplification, and
of ways that simplification can be optimised in different modelling contexts, is urgently needed. With
greater industry knowledge of what simple models can achieve, the default modelling condition of
“complexity is better” may change. However to achieve such a culture change, modellers will require
guidance in many matters related to simple model construction and deployment, including:
• simple model specifications for different decision-support contexts; and
• formulation of appropriate calibration strategies for simple models.
As has been extensively discussed in the present and previous chapters of this document, it is in the
nature of simple models that their construction, calibration and deployment must be done in a
prediction-specific manner. It is of interest to note that this approach to modelling is somewhat at
odds with present day modelling culture in which a large, all-purpose “model for an area” is built to
support the making of many different predictions. The model is then reviewed by a third party who
assesses the model in terms of its presumed capacity to simulate environmental processes within the
study area. If the model is considered satisfactory in this regard, it is then used to make a variety of
predictions of future system behaviour under a variety of management scenarios. A more enlightened
approach to model construction and assessment would recognize that different models, and/or
different approaches to calibration of the same model, may be required for the making of different
decision-critical predictions. Modelling methodologies, rather than models themselves, should
therefore be the subject of assessment and review.
The following list summarizes a number of features of a modelling methodology that should be
assessed in reviewing that methodology for its usefulness in decision-support.
• If the prediction required of a model is very similar in nature to at least some members of the calibration dataset, then a good fit with the calibration dataset should be sought (taking into account the level of measurement noise associated with that dataset).
• If linear analysis using a simple model suggests that the uncertainty of a prediction will be reduced little through calibration, then prior predictive uncertainty analysis may be worthy of consideration as a surrogate for posterior predictive uncertainty analysis.
• When calibrating a model to make predictions which fall between these extremes, a modeller should look for early signs of over-fitting. If parameter values violate reasonableness, and/or if spatial parameters adopt unusual patterns, these may signal that the information content of the calibration dataset as it pertains to a prediction of management interest is being directed towards receptacles that are inappropriate for the making of that prediction.
• Overfitting can be avoided through purposefully seeking a “less-than-perfect” fit with the calibration dataset. (When using PEST in “regularisation” mode, this can be implemented through use of an appropriate target measurement objective function.) It can also be avoided through “orthogonalizing out” those components of the calibration dataset for which a simplified model has no, or defective, receptacles. Thus some modes of historical system behaviour are fit, while others are not.
6. Avoiding Underestimation of Predictive Uncertainty when Using
Simple Models
6.1 Introduction
Most of the focus of the previous chapter has been on how to avoid bias when using a simple model
to make predictions. A problem with predictive bias is that its existence is generally undetectable.
Nevertheless, because it contributes to the potential error of a prediction, it needs to be included in
assessment of that potential. As was discussed in chapter 2, estimation of predictive error variance is
central to the use of a model in the decision-making context. As a surrogate for predictive uncertainty
it is central to evaluation of risk, and hence to ascribing a level of confidence to the assertion that a
“bad thing” will not happen if a certain course of management action is taken. The centrality of
predictive uncertainty analysis to model use in the decision-making context was underlined in the
definition of failure presented in that chapter, namely the occurrence of a type II statistical error
wherein the hypothesis of occurrence of an unwanted event is falsely rejected. This type of error can,
of course, be avoided by calculating uncertainty intervals which are very wide. However this renders
a model unhelpful, for it removes from it the ability to reject any hypotheses at all.
The present chapter focusses on how uncertainty can be quantified when making a prediction using a
simple model. In many modelling contexts this will be the greatest challenge facing use of a simple
model. This is because a simple model generally employs far fewer parameters than a complex model
of the same system. A simple model often dispenses with parameters that cannot be inferred through
the history-matching process, and hence belong to the calibration null space. However in many
instances of environmental modelling undertaken for decision-support, these are the very parameters
whose presence is required for quantification of predictive uncertainty. This applies especially to
predictions of expected system behaviour under climatic, land use or other conditions which are
different from those which prevailed during calibration of the model. At least some of the parameters
(or parameter combinations) to which such a prediction will be sensitive are likely to be uninformed
by the calibration dataset. Because they therefore inhabit the calibration null space, they are informed
only by expert knowledge. This provides their expected values, the extent to which their values may
be different from expected, and spatial, temporal or other correlations which these differences may
exhibit. The representation of null space parameter components in a model is therefore crucial to
correct inference of predictive uncertainty – not because of their estimability (which is often put
forward as the sole criterion for inclusion of a parameter in a model) but because of their lack of
estimability.
Equation 3.2.1 (which describes posterior predictive uncertainty) and equation 3.4.20 (which
describes post-calibration predictive error) proclaim that the posterior uncertainty/error of a
prediction depends on:
• the parameters to which it is sensitive;
• the potential for variability of those parameters (this being a matter of expert knowledge expressed by the C(k) prior parameter covariance matrix);
• the extent to which this variability is reduced through history-matching.
Reduction of parameter variability occurs if the calibration dataset contains information which is
pertinent to those parameters. However the extent to which this information can effect a reduction
in parameter variability depends on the extent to which it is contaminated by measurement noise;
the magnitude of measurement noise is described by the C(ε) matrix that is discussed in previous
chapters of this document. However the level of model-to-measurement fit attained through
calibration of a model is rarely commensurate with that expected from measurement noise; this is
especially the case where a model has been simplified – often with consequential reduction in its
ability to simulate some nuances of environmental system behaviour. So-called “structural noise”
then contributes to model-to-measurement misfit, this being specified by the η vector of equation
5.3.4. In most attempts to estimate posterior parameter and predictive uncertainty, structural noise
is treated as if it were measurement noise. It is assigned an (often diagonal) covariance matrix and
used in equations such as 3.2.1 and 3.4.20. This is done not because it is a theoretically correct
approach to handling structural noise, but because it is easy.
It can be shown that structural noise can indeed be treated as if it were measurement noise provided
it is awarded a covariance matrix C(η) that is “correct” in the sense that its structural origins are
acknowledged. Methods for accomplishing this in certain instances of parameterization simplification
are provided by Cooley (2004), Cooley and Christensen (2006), and Christensen (2017). In most
practical modelling contexts, however, this theory is difficult to apply, requiring paired models and
many runs of the complex member of the pair. Christensen (2017) shows how correction factors to
calculated predictive uncertainty intervals can be applied to accommodate the structural noise
contribution to model-to-measurement misfit. However, more often than not, these result in
predictive uncertainty intervals that are too wide to be of use.
Attempts to obtain a covariance matrix for structural noise are further hampered by the fact that C(η)
is generally singular; see Doherty and Welter (2010). Its non-existent inverse makes development of
a suitable calibration weighting strategy difficult; equation 3.4.3 cannot be applied.
In light of the above considerations, it is apparent that problems facing estimation of the uncertainty
associated with predictions made by simplified models are considerable. Given the centrality of
uncertainty analysis to decision-support, this has the potential to hinder their use in this context.
However, for reasons discussed in the previous chapter, quantification of uncertainty of predictions
made by complex models faces even greater challenges. Hence there is no alternative but to seek
approximate means of uncertainty quantification where simple models are used as a basis for
decision-support. The present chapter suggests a few of these means.
6.2 Solution Space Dependent Predictions
As has already been discussed, where a prediction is sensitive only to solution space parameter
components of a notional “real world model”, then that prediction has no null space dependency.
Entrainment of null space components of the background real world model through calibration of the
simple model does not therefore incur predictive bias. A simple model which is designed to make this
kind of prediction can be endowed with a parameterization scheme that is parsimonious enough for
its calibration to constitute a well-posed inverse problem. Predictive uncertainty analysis can then
proceed using standard regression techniques; see for example Draper and Smith (1998). If the model
runs fast enough, methods such as Markov chain Monte Carlo can be employed to sample the
posterior parameter probability distribution with a relatively high level of numerical efficiency; see for
example Gamerman and Lopes (2006) and Laloy and Vrugt (2012).
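Where the simple model is fast and its parameters few, posterior sampling can be as simple as a random-walk Metropolis scheme. The sketch below is illustrative only; the toy linear “model”, the noise level and the prior are placeholders, not a recommended formulation.

    import numpy as np

    rng = np.random.default_rng(4)

    def log_posterior(p, obs):
        # Hypothetical Gaussian prior and likelihood around a toy linear model.
        sim = p[0] + p[1] * np.arange(len(obs))
        log_like = -0.5 * np.sum(((obs - sim) / 0.5) ** 2)
        log_prior = -0.5 * np.sum((p / 2.0) ** 2)
        return log_like + log_prior

    obs = 1.0 + 0.3 * np.arange(20) + rng.normal(0.0, 0.5, size=20)

    # Random-walk Metropolis sampling of the posterior parameter distribution.
    p_current = np.zeros(2)
    lp_current = log_posterior(p_current, obs)
    samples = []
    for _ in range(20000):
        proposal = p_current + rng.normal(0.0, 0.05, size=2)
        lp_prop = log_posterior(proposal, obs)
        if np.log(rng.uniform()) < lp_prop - lp_current:
            p_current, lp_current = proposal, lp_prop
        samples.append(p_current.copy())

    samples = np.array(samples[5000:])   # discard burn-in
    print("posterior means:", samples.mean(axis=0))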
Where an inverse problem is well posed then, theoretically, all parameter uncertainty is inherited
from noise in the calibration dataset. However, as was discussed above, much of this “noise” is likely to be structural in nature, so that its stochastic characterisation is elusive. The problem of stochastic characterization of measurement noise becomes particularly difficult where the
measurement dataset is comprised of time series of quantities such as stream flow and/or water
quality which have a very large dynamic range and exhibit considerable temporal correlation. Ways in
which this has been addressed include:
• use of a subjective likelihood function (Beven, 2005; Beven et al, 2008);
• use of autoregressive moving average (ARMA) techniques (Kuczera, 1983; Campbell et al, 1999; Campbell and Bates, 2001);
• formulation of a multi-component objective function in which different modes of system behaviour are given equal visibility so that their information content is exposed to the calibration process (Doherty and Johnson, 2003; Doherty and Welter, 2010; White et al, 2015);
• simulation of model errors using a Gaussian process whose specifications are inferred through the calibration process itself (Kennedy and O’Hagan, 2001; Higdon et al, 2005).
When using a model whose outputs are accompanied by structural noise, this noise must be accommodated as “predictive noise” when assessing the total uncertainty of a prediction. The so-called “predictive interval” of a decision-critical prediction must therefore include not just the uncertainty margin of the prediction that arises from uncertainties in estimated parameters. When calculating the range of possible predictions that are compatible with the historical behaviour of the system as recorded in the calibration dataset, this range of possibilities must also include an interval that quantifies limitations in a model’s ability to replicate all nuances of that behaviour. This interval can be calculated using regression theory as outlined by Graybill (1976). However, as Christensen (2017) points out, calculations become much more difficult when the structural nature of model-to-measurement misfit is recognized. Alternatively, it may be possible to construct data-driven, model-error-correcting statistical submodels using modern “big data” processing methods; see, for example, Demissie et al (2009).
6.3 Null Space Dependent Predictions
History-matching does little to reduce the uncertainty of a prediction that is sensitive almost entirely
to real world null space parameter components. In fact, as has been extensively discussed, for a simple
model, history-matching may do more harm than good as it may ascribe erroneous values to real
world null space parameter combinations that are artificially linked to solution space parameter
combinations through the simple model’s parameterization scheme. In Section 5.6 it was suggested
that when a simple model (or even a more complex model for that matter) is asked to make a
prediction which is largely uninformed by the calibration dataset, history-matching may be dispensed
with; the prior probability distribution of the prediction may then be considered as a surrogate for its
posterior probability distribution.
While this strategy overcomes inflation of post-calibration predictive error incurred through
calibration-induced null space entrainment, other problems remain. In particular:
• The simple model must represent all parameters to which the prediction of interest is sensitive, despite the fact that many of these will be inestimable; and
• A prior uncertainty must be ascribed to simple model parameters; this may be difficult if they are abstract in nature.
These two problems are not independent of each other. A simple model is indeed likely to employ far
fewer parameters than a more complex model of the same system. Hence each of its parameters is
likely to represent a combination of complex model parameters, and hence temporal/spatial averages
of real world properties.
Suppose that, either actually or conceptually, simple model parameters p can be formulated from
complex model parameters (or real world system properties) k using a linear equation of the type
p = Nk
(6.3.1)
For a simple groundwater model, p may represent a handful of zones of piecewise constancy whereas
k may represent cell-by-cell hydraulic properties of a complementary complex model, or point-by-point
hydraulic properties of the real world. In this case, N may be considered as an averaging matrix; each
of its rows is thus comprised of zeroes except for a range of elements whose values are all 1/n where
n is the number of complex model cells, or real-world points, comprising a particular simple model
zone that is represented by a single element of p. It would be tempting to use equation 3.1.5 to
calculate a covariance matrix for p as
C(p) = NC(k)Nt
(6.3.2)
Presumably C(k) is known, as it expresses expert knowledge. Let the sensitivities of a prediction s to
parameters p be encapsulated in the vector w. The simple model could then be used to make this
prediction, the linear representation of this process being
s = wtp
(6.3.3)
The uncertainty variance of the prediction would then be calculated as
σ2s = wtC(p)w = wtNC(k)Ntw
(6.3.4)
That this course of action is incorrect can be illustrated using a simple groundwater modelling
example. Suppose that the purpose of the simple model is to compute travel time of a contaminant
to a receptor, and that the subsurface contains narrow, coarse-grained, alluvial channels set in fine-grained flood plain sediments, and that preferential flow takes place through the former.
Representation of these channels is lost where averaging takes place to form broad zones of piecewise
constancy. Furthermore, the variability of upscaled permeability in a large zone will be quite small if
calculated using equation 6.3.2. Predictive uncertainty will therefore be seriously underestimated as
no account is taken of the fact that the contaminant may be transported through a channel.
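This underestimation can be illustrated numerically. The sketch below, with entirely hypothetical numbers, compares the predictive variance computed from a fine-scale prior covariance with that computed from the naively upscaled covariance of equation 6.3.2 when predictive sensitivity is concentrated in a few high-variance channel cells.

    import numpy as np

    n_cells = 100                       # fine-scale cells in one simple-model zone
    channel = np.zeros(n_cells, bool)
    channel[45:50] = True               # a narrow, high-variance alluvial channel

    # Prior covariance of fine-scale (log) permeability: channel cells are far
    # more uncertain than flood-plain cells (illustrative values).
    var_k = np.where(channel, 4.0, 0.25)
    C_k = np.diag(var_k)

    # Simple-model zone formed by uniform spatial averaging (equation 6.3.1).
    N = np.full((1, n_cells), 1.0 / n_cells)
    C_p = N @ C_k @ N.T                 # equation 6.3.2

    # Predictive sensitivities: travel time depends almost entirely on the
    # channel cells in the fine-scale (real-world) description, but only on
    # the single averaged parameter in the simple model.
    y_fine = np.where(channel, 1.0, 0.01)
    w_simple = np.array([[y_fine.sum()]])     # lumped sensitivity of the zone

    var_fine = y_fine @ C_k @ y_fine                       # fine-scale variance
    var_simple = (w_simple @ C_p @ w_simple.T).item()      # equation 6.3.4
    print(f"fine-scale predictive variance:    {var_fine:.2f}")
    print(f"naively upscaled variance (6.3.4): {var_simple:.2f}")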
The mistake in the above approach arises from failure to account for the model defect term; see
equation 5.3.5. If hydraulic property averaging takes place to form a large zone of piecewise constancy
when the zone in fact contains a high permeability channel (or MAY contain a high permeability
channel), then this comprises a defect in the simple model to which the prediction of interest is
particularly sensitive but which is unrepresented in calculation of the value of the prediction or in
assessment of its uncertainty.
The simple groundwater model would no longer be simple if, in order to rectify this problem, it was
modified to represent discrete alluvial channels within a broader host rock. This would probably have
to be done stochastically as a modeller may not know where the channels are, or even if they really
exist. To retain simplicity of the model while not compromising its purpose, the modeller must
introduce defects in a more appropriate way. Hence he/she may assign to zonal permeabilities
downstream of a contaminant source values that are more in accordance with those of buried channel
alluvium than with those of flood plain sediments. The uncertainties associated with these zonal
permeabilities would also be increased to accommodate the fact that the zone may, or may not,
contain an alluvial channel. The defect terms that are now implied in construction of the simple model
are thus shifted from failing to represent alluvium to failing to represent host material.
Underestimation of contaminant travel times will consequentially be avoided.
Alternatively, the weighting scheme used in definition of the N matrix of equation 6.3.1 could be
altered from that of simple spatial averaging, to that of weighting in terms of predictive sensitivity.
Because the prediction of interest is that of contaminant transport, alluvial channel permeabilities
would receive much higher weights in determining upscaled zonal permeabilities than would host rock
permeabilities. Upscaled zonal permeabilities, and their upscaled prior uncertainties, would thus be
much more appropriate for the prediction required of the model.
Similar considerations should apply to the design of simple models for other purposes. The prediction-specific nature of their design is again apparent.
6.4 Predictions which are Dependent on Both Spaces
This is the most difficult case of all to deal with. Unfortunately it is also the most common case. It
pertains to predictions that are only partly informed by the calibration dataset. Hence their
uncertainties have the capacity to be reduced through history-matching. However a large amount of
uncertainty may remain, this being an outcome of their sensitivity to parameters whose variability is
constrained by expert knowledge alone; see equation 3.4.20. Quantification of the uncertainties of
these predictions is therefore afflicted by all of the problems that have been addressed previously in this
chapter. In particular:
• The solution space component of predictive error variance (second term of equation 3.4.20) may be difficult to quantify because of difficulties in ascribing a stochastic characterization to structural noise (which is likely to make a significant contribution to model-to-measurement misfit);
• The null space component of predictive error variance (first term of equation 3.4.20) may be difficult to quantify because of the abstract nature of model parameters.
A third problem with this modelling context is that this is the context in which calibration-induced
predictive bias is most likely to occur.
Suggestions provided in the two previous subsections of the present chapter pertaining to formulation
of an objective function and strategic definition of model parameters (and associated defects) can be
applied to this case. Strategies which may apply in addition to these include the following.
• Because of the need to avoid calibration-induced predictive bias, a modeller may purposefully under-fit the calibration dataset. The contribution to predictive error variance from measurement/structural noise (second term of equation 3.4.20) is thereby increased. Stochastic characterization of this noise should be such that its variance is comparable with the fit that is ultimately attained through the calibration process. The C(ε) matrix should be informed of the level of measurement/structural noise implied by model-to-measurement misfit before being used for quantification of parameter/predictive uncertainty (a minimal sketch of this noise-inflation step follows this list).
• With increased misfit, the minimum of the error variance curve of figure 3.3 is shifted to the left. The dimensionality of the calibration null space is consequentially increased. The role of C(p) in contributing to predictive uncertainty is therefore increased. A modeller may then decide to allow greater variability in C(p) than would be allowed on the basis of considerations that are encapsulated in an equation such as 6.3.2. This greater variability allows the uncertainty analysis process to award greater variability to the prediction, thereby avoiding a type II statistical error.
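The following is a minimal sketch of the noise-inflation step described in the first of the above points; the observation groups, residuals and assigned variances are hypothetical placeholders.

    import numpy as np

    rng = np.random.default_rng(5)

    # Hypothetical post-calibration residuals for two observation groups.
    residuals = {
        "heads": rng.normal(0.0, 0.8, size=40),
        "flows": rng.normal(0.0, 3.0, size=25),
    }

    # Variances originally assigned on the basis of measurement noise alone.
    assigned_var = {"heads": 0.1 ** 2, "flows": 1.0 ** 2}

    # Inflate each group's noise variance so that it is commensurate with the
    # misfit actually attained (structural noise then being treated, for want
    # of a better option, as if it were measurement noise).
    inflated_var = {}
    for group, r in residuals.items():
        achieved = np.mean(r ** 2)
        inflated_var[group] = max(assigned_var[group], achieved)
        print(f"{group}: assigned {assigned_var[group]:.3f}, "
              f"achieved {achieved:.3f}, adopted {inflated_var[group]:.3f}")

    # The inflated variances would then populate C(epsilon) before it is used
    # in calibration-constrained uncertainty analysis.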
With values for elements of C(p) and C(ε) increased appropriately in accordance with the above
considerations, analysis of parameter/predictive uncertainty could take place using any of a number
of methodologies that are designed to undertake calibration-constrained uncertainty analysis. These
include:
• linear analysis using PEST or PyEMU utilities;
• Markov chain Monte Carlo (if the simple model runs fast enough, and its parameters are few enough in number);
• PEST’s predictive analyser (if the simple model runs fast enough and its parameters are few enough in number);
• null space Monte Carlo;
• ensemble Kalman Filter/Smoother;
• direct hypothesis testing (see below).
The presence of “predictive noise” may also require accommodation for reasons already discussed.
Alternatively, or as well, an “engineering safety margin” may be added to estimates of predictive
uncertainty computed in any of the ways listed above.
6.5 Direct Hypothesis-Testing
It was stated in chapter 1 of this document that the unique feature that modelling can bring to environmental decision-support is its ability to test, and maybe reject, hypotheses that bad things will result from certain courses of management action. It can do this by demonstrating the incompatibility
of these unwanted events with information contained in expert knowledge (including direct
measurements of system properties), and/or with information contained in measurements of the
behaviour of the system.
One way in which a hypothesis of management interest can be tested using a model is to draw samples
from the posterior parameter probability distribution, run the model using each sample, and then
count the number of times (if any) that the unwanted event occurs. This comprises the more-or-less
“standard” way to do uncertainty analysis.
Another option is to use highly-parameterized inversion software such as PEST to “observe” the
occurrence of the unwanted event in an observation dataset that is expanded from the original calibration dataset by one observation, namely this unwanted occurrence. If it can be established that the
unwanted event will occur only if either
• the parameter field required for its occurrence cannot support a good fit with the calibration dataset, or
• the parameter field required for its occurrence is “unrealistic”
then the hypothesis of occurrence of the bad thing can be rejected on the basis of incompatibility with
the two types of information for which the model is a repository. PEST can be asked to undertake this
process if run in “Pareto” mode. When run in this mode, the model is initially provided with the
minimum error variance parameter field estimated through a previous inversion exercise. The “bad
thing” observation is initially given a weight of zero. Over a series of inversion iterations the weight
ascribed to the bad thing observation is slowly increased. From data which PEST records on a number
of output files which are specific to this mode of its operation, the modeller can decide for him/herself
the value of the prediction of management interest at which likelihood of that value diminishes to
something approaching zero because of demonstrable incompatibility between that predictive value
and either or both of fit with the calibration dataset or reasonableness of the model’s parameter field.
In providing this information, PEST traverses the so-called “Pareto front” in which occurrence of the
unwanted event is traded off against fit with the calibration dataset and reasonableness of model
parameters.
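The sketch below is not PEST’s Pareto mode itself, but a conceptual illustration, using a toy linear model, of the trade-off that it traverses: as the weight on the “bad thing” observation is increased, the fitted prediction is pushed towards the unwanted value at the cost of increasing calibration misfit. All numbers are hypothetical.

    import numpy as np

    rng = np.random.default_rng(6)

    # Toy linear model: observations h = X p + noise; prediction s = y^t p.
    X = rng.normal(size=(20, 4))
    p_true = np.array([1.0, -0.5, 0.8, 0.2])
    h = X @ p_true + rng.normal(0.0, 0.1, size=20)
    y = np.array([2.0, 1.0, -1.0, 0.5])
    s_bad = 6.0                         # the unwanted ("bad thing") predictive value

    # Gradually increase the weight on the "bad thing" observation and record
    # the trade-off between calibration misfit and the fitted prediction.
    for weight in [0.0, 0.1, 0.5, 1.0, 5.0, 20.0]:
        A = np.vstack([X, weight * y])
        b = np.concatenate([h, [weight * s_bad]])
        p_fit, *_ = np.linalg.lstsq(A, b, rcond=None)
        misfit = np.sum((h - X @ p_fit) ** 2)
        print(f"weight {weight:5.1f}: prediction {y @ p_fit:6.2f}, "
              f"calibration misfit {misfit:8.3f}")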
It is important to note that this mode of direct hypothesis testing should not artificially close off
predictive possibilities through use of parameter schema that lack the flexibility to introduce nuances
of hydraulic property heterogeneity that, on the one hand are not unrealistic, and on the other hand
are required for realization of an unwanted event. For example, pilot points may be preferred to zones
of piecewise constancy as the spatial parameterization device of choice in critical parts of the model
domain when undertaking direct predictive hypothesis-testing in this manner. This is because zones
of piecewise constancy may disable the emergence of realistic expressions of heterogeneity that are
essential for occurrence of the unwanted event. A model’s capacity to explore predictive possibilities
is therefore considerably reduced through use of zonal parameters. See Fienen et al (2010) for a
further discussion of this issue.
Unfortunately, the use of pilot points requires that the model be run many times in order to calculate
finite-difference derivatives with respect to parameters associated with these points; these
calculations must be repeated during each iteration of the inversion process through which the Pareto
front is traversed. A relatively simple model may thus be required for direct hypothesis testing.
However, for reasons outlined above, while the model may be simple, its parameterization may need
to be locally complex.
In principle, use of PEST in “Pareto” mode allows a rigorous confidence limit to be associated with
rejection of the hypothesis of occurrence of an unwanted event; see Moore et al (2010). In practice,
for reasons already discussed, specification of the stochastic properties of measurement noise and
prior parameter variability (these are encapsulated in the C(ε) and C(p) matrices discussed above) will
probably have a subjective component. By gradually placing greater and greater weight on the
observation of occurrence of the prediction of interest, a modeller will be able to witness the degree
to which calibration misfit must be incurred, and/or unrealistic heterogeneity must be introduced to
a model’s parameter field, in order to support values of the prediction that approach undesirable
levels. Rejection of the hypothesis that the event will actually occur is subjectively enabled through
PEST’s provision of model outputs and parameter fields at each stage of the Pareto front traversal
process. The necessarily subjective nature of risk assessment when enabled through use of a simple
model is thereby embraced through providing the modeller with as much information as is needed for
him/her to exercise his/her judgement.
The potential that it offers for a modeller to exercise informed subjectivity is a significant strength of
the direct hypothesis-testing methodology offered by PEST. As has already been discussed, traditional
uncertainty analysis often explores parameter and predictive uncertainty by drawing random samples
from the posterior parameter probability distribution; the model is then run using each such sample.
It is important to note that characterization of the posterior parameter probability distribution
depends heavily on how the prior parameter probability distribution is defined. In some circumstances
the prior parameter probability distribution can be described analytically. In other circumstances
(particularly in groundwater modelling), more complex geostatistical descriptions of prior parameter
stochasticity are employed. However even the most picturesque realizations of parameters, based on
the most complex geological concepts, may fail to include key geological nuances that may enable the
occurrence of an unwanted event. In other cases the key geological nuance that promulgates the
occurrence of an unwanted event may indeed be compatible with prior stochastic characterization of
geological heterogeneity; however it may not be realized in the limited number of samples that are
drawn from the prior parameter probability distribution. A strength of direct hypothesis-testing in
which an inversion package such as PEST is directed to “make it happen”, is that the parameterization
nuance required for the occurrence of an unwanted prediction is brought into existence, and made
explicitly visible, as part of a calibration-constrained worst-case-scenario calibration exercise. It is then
up to the modeller to judge the reasonableness or otherwise of this nuance.
7. Joint Usage of a Simple and Complex Model
7.1 Introduction
Much of the discussion in previous chapters has focussed on use of a simple model in place of a
complex model, this allowing use of the complex model to be dispensed with so that a modeller can
gain access to the benefits of fast run time and numerical stability that accompany use of a simple
model. The dangers of replacing a complex model with a simple model have been outlined. Means by
which these dangers can be averted have also been addressed.
We conclude this document with some suggestions of how a simple and complex model can be used
together, this perhaps allowing a modeller access to the benefits of both while ameliorating the
disadvantages associated with use of either on its own. The discussion is brief. References are
provided through which the interested reader may acquire further information on this issue. Some of
the suggestions provided in this document have not yet been implemented. The authors consider the
use of complex/simple models in partnership an area of fruitful research whose outcomes may have
profoundly useful consequences for model-based environmental decision-making.
7.2 Linear Analysis
As has already been discussed, equations developed for linear analysis in chapter 4 of this document
are implemented in the PREDVAR1C program of the PEST suite, as well as in functions available through
the PyEMU suite. At the time of writing, literature-documented use of these equations is provided
only by Watson et al (2013) and White et al (2015). However the authors are aware of other contexts
in which they are being used to explore issues related to parameterization and model simplification.
It is anticipated that the outcomes of these studies will eventually be published. They include the
following.
• PREDVAR1C has been used in geothermal reservoir modelling to inquire into the repercussions of treating a dual porosity system as if it were single porosity. While a dual porosity geothermal reservoir model can be calibrated under both assumptions, it has been found that considerable predictive bias can be incurred if a model neglects dual porosity. Predictive uncertainty may also be seriously underestimated.
• In another geothermal modelling application which has implications for other types of spatial models such as groundwater models, the use of various spatial parameterization devices was explored. The outcomes of this research suggest that if an area is faulted, but a modeller does not know the exact locations of these faults, then use of pilot points as a parameterization device can support good calibration, valid uncertainty analysis, and an avoidance of predictive bias. However their use comes at a high computational cost. Zones of piecewise constancy constitute a cheaper parameterization device. However unless fault-specific zones are emplaced at the correct locations, predictive bias may be incurred and predictive uncertainty may be underestimated.
• Neglecting along-river horizontal anisotropy when calibrating a groundwater/surface water model in which part of the model domain includes an alluvial aquifer does not normally compromise goodness of fit attained through calibration of that model, as horizontal anisotropy is normally “invisible” to the calibration process. However certain predictions may accrue considerable bias. Furthermore, the uncertainties of these predictions may be seriously underestimated if the model is calibrated under the false assumption of along-river isotropy.
Implementation of the linear theory presented in chapter 4 requires that a complex model be
constructed, and that families of parameters be identified as “defect parameters”. These parameters
may represent hydraulic properties of the modelled system (as is often the case for parameters).
Alternatively, they may represent features of an environmental model that are not usually adjusted
or estimated, but can nevertheless be considered as somewhat simplified or defective representations
of the real world. Particularly important in this regard may be the specifications of model boundary
conditions, which often comprise simplistic representations of far more complex environmental
stresses and processes.
The “paired models” used for linear analysis, based on the equations of chapter 4, are actually the
same (complex) model. This model must be designed in such a way that the assignment of defect
status to a subset of its parameters results in a simple model that is representative of models used in
current modelling practice. However other simplifications, such as a reduction in the number of model
layers, or an increase in model cell size, cannot be readily explored using this methodology, as
implementation of linear theory requires that model outputs be differentiable with respect to defect
parameters. Hence, as has already been stated, though linear methods are powerful, the range of model defects that they can examine is somewhat restricted. Nevertheless, studies which employ linear analysis can provide (and have provided) fruitful insights that are readily extended to practical modelling applications.
7.3 Predictive Scatterplots
Section 4.5 of this document describes a methodology of paired model usage whereby a simple
counterpart to a complex model is repeatedly calibrated to match outputs produced by the latter as
it is provided with different realizations of parameters. These realisations are sampled from the prior
probability distribution of those parameters. The complex model can employ categorical (and hence
non-continuous) parameter fields generated using multiple point or other modern geostatistical
methods. Though not implemented so far, realizations may also include categorical features of
complex model construction, such as the presence or otherwise of geological features whose
dispositions are unknown.
In contrast to linear analysis, the complex and simple models used in this analysis do not need to be
related to each other by a set of “defect parameters”. In fact defect parameters do not need to be
explicitly defined. Nor do the parameters used by the two models need to be the same. The simple
model can employ an entirely different parameterization scheme from that employed by the complex
model. In fact, use of a simple, lumped-parameter model as a partner to a far more complex, physically-based model would allow the former to be calibrated rapidly to outputs generated by the latter. This would considerably improve the efficiency of the methodology, as repeated calibration of the simple model is its most time-consuming aspect.
A strength of the methodology is that it allows calibration-induced simple model predictive bias to be quantified and, at the same time, corrected. It supports the making of a prediction of minimum error variance, and quantification of that error variance, while also providing information that could be used as a basis for simple model design.
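The workflow can be summarized in a short script. The following Python sketch is illustrative only: the functions that sample the prior, run the complex model and calibrate the simple model are replaced by synthetic linear stand-ins so that the example runs as written, and the final regression step shows one simple way in which a bias correction and a residual error variance might be extracted from the resulting scatterplot.

    import numpy as np

    rng = np.random.default_rng(0)
    n_real, n_k, n_obs = 200, 50, 30

    # Synthetic stand-ins for the complex model, the simple model and its calibration;
    # in practice these are site-specific modelling and inversion steps.
    Z = 0.1 * rng.standard_normal((n_obs, n_k))    # complex-model output sensitivities
    z_pred = 0.1 * rng.standard_normal(n_k)        # complex-model prediction sensitivities
    X = rng.standard_normal((n_obs, 5))            # simple-model output sensitivities
    x_pred = rng.standard_normal(5)                # simple-model prediction sensitivities

    def run_complex_model(k):
        # returns outputs matching the calibration dataset, and the prediction of interest
        return Z @ k, z_pred @ k

    def calibrate_simple_model(h):
        # least-squares "calibration" of the simple model against complex-model outputs
        p, *_ = np.linalg.lstsq(X, h, rcond=None)
        return p

    simple_pred, complex_pred = [], []
    for _ in range(n_real):
        k = rng.standard_normal(n_k)               # realization of k sampled from the prior
        h, s_complex = run_complex_model(k)
        p = calibrate_simple_model(h)
        simple_pred.append(x_pred @ p)             # prediction made by the calibrated simple model
        complex_pred.append(s_complex)
    simple_pred = np.array(simple_pred)
    complex_pred = np.array(complex_pred)

    # Regressing complex-model predictions on simple-model predictions yields a bias
    # correction (the fitted line) and a residual predictive error variance (the scatter).
    A = np.column_stack([simple_pred, np.ones(n_real)])
    (slope, intercept), ssr, *_ = np.linalg.lstsq(A, complex_pred, rcond=None)
    resid_var = ssr[0] / (n_real - 2)
    print(f"corrected prediction = {slope:.3f} * s_simple + {intercept:.3f}; "
          f"residual error variance = {resid_var:.4f}")

In a real application the stand-in functions would be replaced by calls to the complex simulator and to inversion software such as PEST; the structure of the loop, and of the scatterplot post-processing, would remain the same.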
Apart from repeated calibration of the simple model, the need to run the complex model using
different random parameter fields adds to the computational demands of implementing this
methodology. This is especially the case if the complex model takes a long time to run, which may limit the number of random parameter fields that can be tested; this, in turn, may complicate interpretation of the scatterplots yielded by the methodology.
Another problem with this methodology (that is common to all methodologies that employ random
parameter field generation) is whether the random realizations of hydraulic property fields that it employs are in fact representative of reality. Geological and environmental process “surprises” are
encountered in most modelling exercises. If key aspects of model parameterization and/or processes
are not represented in the complex model, then this model has defects. The methodology has no
ability to explore the ramifications of these defects on predictive bias and uncertainty.
To the authors’ knowledge, implementation of this methodology has been restricted to two
publications, namely Doherty and Christensen (2011) and Watson et al (2013). There have been no real-world applications of which the authors are aware. This is not surprising, given the need to
construct two separate models; most modelling budgets would not support this. On the other hand,
most modelling budgets support construction of a complex model that is of questionable integrity and
of limited use. It is possible that co-production of a simple model for addressing some predictions
required in a study area may be a fruitful activity as far as decision-support is concerned. To the extent
that construction of the simple model can be informed and tested through conjunctive use of a
complex model, the role of both of these models in decision-support may be strengthened. It is also
possible that the simple model could be used as a surrogate for the complex model in calibration of
the latter. This is further discussed below.
7.4 Surrogate and Proxy Models
Use of a simplified version of a complex model to expedite calibration and uncertainty analysis of the
latter is receiving increasing attention in the literature. Nevertheless it is still far from commonplace
in everyday environmental modelling practice. Razavi et al (2012) present a review of applications,
with particular attention given to surface and land use modelling. Asher et al (2015) do the same for
groundwater modelling.
Surrogate models (as distinct from simplified models) gain maximum utility in parameter estimation
and uncertainty analysis contexts where parameters are few in number. They have been used in
conjunction with so-called global optimisation methods, or Markov chain Monte Carlo uncertainty
analysis methods, where speed of execution is essential because of the need for these methods to
undertake many model runs.
To the authors’ knowledge, the only documented uses of simplified and surrogate models in direct
partnership with complex models in a gradient-based parameter estimation and uncertainty analysis
context are those described by Burrows and Doherty (2014) and Burrows and Doherty (2016). Both of
these make use of PEST’s “observation re-referencing” functionality, wherein the simple or proxy
model is used for filling of the Jacobian matrix while model runs required for testing parameter
upgrades are carried out using the complex model. This strategy vastly reduces the number of complex
model runs required for solution of the inverse problem. In the first of these publications parameter
estimation (using Tikhonov regularisation) and post-calibration uncertainty analysis (using null space
Monte Carlo) were effected in a highly parameterized context involving 600 pilot point parameters; the simpler model used a much coarser numerical grid than the more complex model, thus reducing its
simulation time to a fraction of that of the latter. In the second of these cases the simpler model was
not a model at all; instead a suite of PEST-calibrated polynomial proxies linked each model output
used in the calibration process to each of the eight parameters that required adjustment during
calibration and subsequent calibration-constrained uncertainty analysis.
It is of interest to note that the use of so-called “super parameters” by PEST when implementing “SVD-assisted” parameter estimation can be considered as a form of model simplification. In this
application, simplification is restricted to definition of parameters. However such simplification can
be considered to be optimal according to concepts presented in this document, as it is based on
singular value decomposition of a complex parameterization scheme.
7.5 Suggested Improvements to Paired Simple/Complex Model Usage
7.5.1 General
It is considered that there is ample room for more innovative use of simple models to expedite
calibration and uncertainty analysis of complex models. For example, a Jacobian matrix, calculated
using a simpler version of a complex model, could be used for definition of super parameters
employed for calibration of the latter. To expedite run times, the simple model could, for example,
implement particle-tracking instead of solving the advection-dispersion equation to simulate
movement of contaminants, or use the SWI2 package of Bakker et al (2013) to simulate salt water
intrusion instead of a three-dimensional groundwater model which implements density-dependent
flow. The role of super parameters is to span the calibration solution space; approximations used in
calculation of sensitivities should not affect definition of this space. Once a limited number of super
parameters are synthesised from base parameters, calibration and uncertainty analysis could then be
undertaken using the complex model. In doing this, the numerical burden of parameter estimation
and calibration-constrained uncertainty analysis would be considerably reduced through use of the
smaller number of parameters.
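As an illustration of the idea just described, the following Python sketch defines super parameters from the singular value decomposition of a weighted Jacobian matrix computed with the simpler model. The Jacobian, weights and dimensions used here are synthetic placeholders; in practice the Jacobian would be filled by finite-difference runs of the simple model.

    import numpy as np

    rng = np.random.default_rng(1)
    n_obs, n_par, n_super = 120, 800, 20

    # Placeholder weighted Jacobian of the simpler model.
    J_simple = rng.standard_normal((n_obs, n_par)) * rng.uniform(0.01, 1.0, n_par)
    weights = np.ones(n_obs)                       # observation weights

    Q_half = np.sqrt(weights)[:, None]             # square root of the (diagonal) weight matrix
    U, s, Vt = np.linalg.svd(Q_half * J_simple, full_matrices=False)

    V1 = Vt[:n_super].T                            # leading right singular vectors; their columns
                                                   # approximately span the calibration solution space

    def base_increment(super_values):
        # increments to base parameters implied by a set of super-parameter values;
        # the complex model would then be run with k0 + dk during calibration
        return V1 @ super_values

    dk = base_increment(rng.standard_normal(n_super))

Because the role of the super parameters is only to span the solution space, the approximations introduced by using the simpler model to compute J_simple should not compromise their definition.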
There is a potential for very large efficiencies in calibration and calibration-constrained uncertainty
analysis to be realized for complex models if simple models with simple parameterization schemes
can be used as partial or total substitutes for them in calculation of the Jacobian matrix and/or in
evaluation of super parameter sensitivities where super parameters are actually simple model
parameters. A few possible options are now discussed.
7.5.2 Option 1: Using the Simple Model for Derivatives Calculation
If h from equation 5.3.1 is equated to h from equation 5.3.3, and measurement noise is ignored, we
have
Xp + X_d p_d = Zk                                                              (7.5.1)
If we are only concerned with derivatives with respect to adjustable parameters, the second term on the left disappears, so that
Xp = Zk                                                                        (7.5.2)
For simplicity, suppose that the simple model possesses enough parameters to allow a good fit to be
achieved with the calibration dataset h, and that calibration of this simple model can be formulated
as a well-posed inverse problem. With the inverse problem for estimation of p being well posed, an
equivalent p to a complex parameter set k can be derived using the equation
p = (X^t X)^-1 X^t Z k = Nk                                                    (7.5.3)
where, obviously,
N = (X^t X)^-1 X^t Z                                                           (7.5.4)
(The above equation can easily be altered to accommodate measurement weights; however this is not done, for the sake of notational efficiency.) If the matrix (X^t X)^-1 X^t Z were available, then the simple model could be used for calculation of sensitivities for the complex model. On each occasion that an element of k was varied for the purpose of finite-difference derivatives calculation, an equivalent p vector could be calculated using equation 7.5.3; the simple model would then be run to compute changes in those model outputs which correspond to the calibration dataset h.
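A minimal numerical sketch of equations 7.5.3 and 7.5.4 follows. The X and Z matrices are synthetic placeholders for the simple-model and complex-model Jacobians, and the simple_model() function stands in for an actual simple model run.

    import numpy as np

    rng = np.random.default_rng(2)
    n_obs, n_p, n_k = 60, 8, 300

    # Placeholder Jacobians: X relates simple-model outputs to p, Z relates
    # complex-model outputs to k; both would normally be filled by finite differences.
    X = rng.standard_normal((n_obs, n_p))
    Z = 0.1 * rng.standard_normal((n_obs, n_k))

    N = np.linalg.solve(X.T @ X, X.T @ Z)          # N = (XtX)^-1 XtZ, equation 7.5.4

    def simple_equivalent(k):
        # equivalent simple-model parameter set for a complex parameter set k (equation 7.5.3)
        return N @ k

    def simple_model(p):
        # stand-in for a simple model run; returns outputs matching the calibration dataset
        return X @ p

    def sensitivity_column(k0, i, delta=1e-3):
        # finite-difference sensitivity of outputs to the i-th element of k, computed by
        # running the (fast) simple model on the equivalent p rather than the complex model
        k1 = k0.copy()
        k1[i] += delta
        return (simple_model(simple_equivalent(k1)) - simple_model(simple_equivalent(k0))) / delta

    k0 = rng.standard_normal(n_k)
    col0 = sensitivity_column(k0, 0)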
Unfortunately, calculation of Z for use in equation 7.5.3 may be numerically intensive as it requires as
many model runs for filling of this matrix as there are adjustable parameters employed by the complex
model. In some modelling contexts this may be considered a small price to pay in order to gain access
to simple-model-based super parameters for implementing the actual inversion process. However the
large model would need to be numerically stable so that finite-difference derivatives have integrity.
Also, the methodology would require that the complex model not display too much nonlinearity with respect to parameters, so that the (X^t X)^-1 X^t Z matrix is usable for at least a few iterations of the inversion process. If it were usable for only a single iteration, then the simple model would not be required, as the Z matrix could be used directly as a basis for computation of parameter upgrades.
Once the model was calibrated, standard null space Monte Carlo techniques could be employed for
generating realisations of k that fit the calibration dataset while exploring the null space of Z.
Generation of these parameter sets could be based on singular value decomposition of Z. (Definition
of the null space would, once again, rest on an assumption of the integrity of Z when calculated using
finite difference derivatives.) Adjustment of random parameter sets to respect calibration constraints could be done using the simple model and the p parameter field in the manner described above. If this methodology were successful in adjusting nearly-calibration-constrained realisations of k such that
the complex model respects the calibration dataset h, then large efficiency gains in making this
adjustment would be realized.
7.5.3 Option 2: Modifications to Accommodate Complex Model Numerical Problems
In practice, calculation of Z for a large model with many parameters may incur a large computational
burden. An alternative option is to approximate N through random field generation. Many realizations
of k could be generated, and the complex model run each time. The simple model could then be
calibrated against the h vector produced by this model to compute an equivalent p parameter set.
After enough model runs had been undertaken, an empirical correlation matrix C(p,k) could be
constructed. (It is possible that where the solution space of the large model is small, the number of
runs required for reliable construction of this matrix may be smaller than that required for filling of
the Z matrix as required by option 1 above.) Then, using the relationship
C(p,k) = N C(k)                                                                (7.5.5)
an approximation to N could be calculated as
N = C(p,k) C^-1(k)                                                             (7.5.6)
This N could be used in the complex model calibration process in the manner discussed above. It may
also be able to support calibration-constrained Monte Carlo analysis, as the null space of N would be
equivalent to the null space of Z. Singular value decomposition could thus be undertaken on N to
generate realisations of null space parameter combinations of k which could then be added to the
parameter field k of the calibrated model. Adjustment to respect calibration constraints could then
proceed as above.
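The following sketch illustrates equations 7.5.5 and 7.5.6. The loop in which the complex model is run and the simple model is calibrated for each realization of k is replaced here by a synthetic linear relationship between p and k, so that the empirical approximation to N can be checked against a known answer; everything else in the sketch mirrors the calculation described above.

    import numpy as np

    rng = np.random.default_rng(3)
    n_real, n_k, n_p = 500, 40, 6

    # Synthetic "true" relationship between p and k; in practice each row of P is obtained
    # by running the complex model with a realization of k and calibrating the simple model.
    N_true = 0.2 * rng.standard_normal((n_p, n_k))
    K = rng.standard_normal((n_real, n_k))                        # realizations of k (one per row)
    P = K @ N_true.T + 0.01 * rng.standard_normal((n_real, n_p))  # corresponding calibrated p sets

    K_c = K - K.mean(axis=0)
    P_c = P - P.mean(axis=0)
    C_kk = K_c.T @ K_c / (n_real - 1)              # empirical C(k)
    C_pk = P_c.T @ K_c / (n_real - 1)              # empirical C(p,k)

    N_hat = C_pk @ np.linalg.pinv(C_kk)            # N = C(p,k) C^-1(k), equation 7.5.6
    print("max error in recovered N:", np.abs(N_hat - N_true).max())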
7.5.4 Option 3: Direct Adjustment of Random Parameter Fields
This option has some resemblance to the Ensemble Kalman Smoother. However, in the spirit of
methodologies such as that described by Chen and Oliver (2013), it attempts to lend efficiency to the
method by using regression techniques. In the present case, efficiency gains would also be realised
through use of a complementary simple model.
In a similar fashion to option 2, a suite of complex model parameter fields is generated using the prior
probability distribution of complex model parameters. The matrix N of equation (7.5.6) is also
obtained as previously. However instead of using this matrix to obtain a minimum error variance
parameter field k, the random k parameter fields would be themselves adjusted to conform to
calibration constraints. This would be effected using the simple model for calculation of derivatives.
(Herein lies the distinction from the Kalman smoother, where random parameter field generation is used
to calculate C(h,k), the matrix which expresses correlation between model parameters and model
outputs which correspond to measurements of system state comprising the calibration dataset.)
Testing of updated model parameter fields would require that complex model runs be carried out. A simple model p counterpart to each revised random k could then be obtained through (supposedly rapid) calibration of the simple model against its h counterpart. The N matrix could then be updated.
At the same time, random generation of more k fields could take place with an evolving posterior
parameter covariance matrix, calculated using a modified form of equation 3.4.15.
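The following sketch conveys the flavour of this option under strongly simplifying assumptions: sensitivities of complex-model outputs to k are approximated as the product of the simple-model Jacobian X and the N matrix of equation 7.5.6, and each random realization of k is nudged toward the calibration dataset using a few damped Gauss-Newton steps. The run_complex_model() function is a linear placeholder for the real simulator, and no updating of N between iterations is shown.

    import numpy as np

    rng = np.random.default_rng(4)
    n_obs, n_p, n_k, n_real = 30, 6, 40, 20

    X = rng.standard_normal((n_obs, n_p))          # simple-model Jacobian (placeholder)
    N = 0.2 * rng.standard_normal((n_p, n_k))      # N matrix obtained as for option 2
    h_obs = rng.standard_normal(n_obs)             # calibration dataset

    def run_complex_model(k):
        # linear placeholder for the (expensive) complex simulator
        return X @ (N @ k)

    J = X @ N                                      # approximate sensitivity of outputs to k
    J_pinv = np.linalg.pinv(J, rcond=1e-8)

    K = rng.standard_normal((n_real, n_k))         # prior realizations of k
    for i in range(n_real):
        for _ in range(3):                         # a few damped Gauss-Newton steps per realization
            r = h_obs - run_complex_model(K[i])    # residuals from a complex-model run
            K[i] = K[i] + 0.5 * (J_pinv @ r)       # update lies in the row space of J, so null-space
                                                   # components of each realization are preserved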
8. Conclusions
The use of models is now ubiquitous in environmental decision-support. However the support that
they provide to the decision-making process is often far from optimal – and sometimes even counterproductive. There is a tendency for those who commission the building of models to request that
models be complicated. This is done in recognition of the complex nature of environmental systems.
Logic dictates, so the argument goes, that if models are to simulate these systems with integrity, then
they must also be complicated.
No model can be as complex as the environmental system that it purports to simulate. Every model is
simple. Problems are inevitably encountered as a modeller attempts to add complexity to his/her
simulator in order that another party who views the model will consider it to be an acceptable
simulator of environmental processes, unsullied by approximations that make the distinction between
the simulator and the real world too plain to the naked eye.
Those who have attempted to construct complex models are sadly familiar with the unsatisfying
nature of this task. Their run times are long. They are numerically delicate. Their fit with an observation
dataset is often poor notwithstanding their complexity. Attempts to improve that fit are often met
with frustration, whether these attempts are made manually, or employ high-end inversion software.
Basic mathematics shows that, even if a good fit with a field dataset can in fact be attained, there are
an uncountable number of other ways to obtain the same level of fit using other parameters. In most
cases of complex model construction, a modeller cannot even be sure that the manner in which
his/her fit with the measurement dataset was obtained is of minimum error variance. Yet important
predictions are made using that model and the single parameter field which is deemed to “calibrate”
the model. The model is too big, and the budget is too small, to try to find other parameter sets that
also fit the calibration dataset, and that can be used to explore the potential for wrongness in decision-critical predictions made by the complex model.
Conceptually, complex models provide a mechanism for a modeller to understand environmental
processes. Conceptually, they can be used to explore the range of possibilities that are compatible
with expert knowledge of the set of processes that are operative at a specific study site and the
hydraulic properties which govern these processes. In doing so, they have the potential to contribute
much to environmental management. However the role which they are often forced to play is very
different from this. Expectations are that they can be used as surrogates for a real world system. As
such, different scenarios for management of that system can be tested on them; that for which a
model calculates favourable management outcomes can then be adopted for the real world.
Years of collective modelling experience, supported by basic mathematics, demonstrates that this is
not the correct way to view models that are built as a basis for decision-support. While they do indeed
have the capacity to provide such support, the notion that they can be used as surrogates for a real-world environmental system must be abandoned if this capacity is to be realized. The world is too
complex, its properties are too uncertain, and its details are too heterogeneous for this to be the case.
Instead, models should be viewed as scientific instruments, constructed (like any other scientific
instrument) to conduct carefully designed experiments at study sites where a great deal is unknown,
but where attempts are nevertheless being made to learn more about the system in order to support
proper management of it. Their construction, deployment and calibration must be such that they can
extract information from the historical behaviour of that system that is most pertinent to its future
management, and that they can store this information in ways that are easily accessible when
weighing up the merits of competing management scenarios. All of this must be done in full
recognition of what current computing technology can provide, and of the computing resources
available to those who manage a particular site.
In short, when used in the decision-making context, models should be considered as receptacles for
information – information which can be used to test hypotheses of interest to those who manage
environmental systems. Decisions pertaining to management of those systems will never be made
with certainty. Hence the decision-making process is best served when a model can provide an
environmental manager with an assessment of the risk that he/she may be making a wrong decision.
When used according to this precept, it is immediately apparent that the modelling process is far more
important than any model that will serve that process. It is also apparent that this process must be
capable of recognizing and, if possible, quantifying the uncertainties that determine the context of
real-world environmental management.
The assumption underpinning much modern-day modelling practice that a single, complex simulator
will answer all questions that a manager will ask, and provide all information that he/she needs to
know, is completely unsupported – either by modelling history or by logic. Instead, it is apparent that
environmental management is best served by a suite of models, each optimized in its ability to provide
receptacles for certain types of information, and each able to deliver that information to the decision-maker in a manner that best supports risk assessment – the vital ingredient of decision-making in all
fields of human endeavour. Some of these models may be complex. Many will be simple, thus enhancing their utility in uncertainty assessment and risk analysis as it pertains to some aspect of a
system for which scientifically-based management is required.
Simplicity in modelling brings lightness of step and flexibility, both in terms of what a simple model
can achieve, and in terms of what a modeller can achieve when using that model. A simple model can
serve a modeller well, thereby allowing him/her to serve his/her clients/stakeholders well. The same
can rarely be said of a complex model. Far too often a complex model becomes a modeller’s master,
commanding the modeller to do whatever is necessary for its capricious numerical fancies to be
served, with numerical nonconvergence being the punishment for failure to provide satisfaction. The
close attention to numerical detail that the complex model relentlessly demands moves the gaze of
the modeller from the decisions which he/she must support to the unrelenting numerical details
required for maintenance of the health of the bloated model. These details have little to do with the
real world, and less to do with the problems that must be solved so that the real world is properly
managed. An artificial reality is created wherein a modeller must solve a suite of problems that are of
little importance while ignoring those that are of over-riding importance.
While the use of simple models is not beset by the same problems, they must nevertheless be used
with caution, for simplicity comes at a cost. This document has attempted to outline the costs of
simplicity, while providing some suggestions on how they can be assessed and/or minimized. Some of
these costs are obvious. In particular, if a model is too simple, then it cannot provide receptacles for
information resident in the historical behaviour of a system (i.e. it cannot fit a calibration dataset).
Some of these costs are less obvious, but are more insidious. Thus a simple model may be capable of
fitting a calibration dataset; however the information that flows from that dataset into the model is
directed to receptacles which may corrupt, rather than enhance, the model’s capacity to assess future
risks.
In supporting the making of environmental decisions, those who undertake model-based data
processing are themselves faced with many decisions. Some of these will pertain to the level of
complexity that is required of a model if its role in decision-support is to be realized. The path taken
by a particular modeller will almost certainly be subjective; different choices will probably be made by
different individuals. However, whatever the path that a modeller chooses to take, he/she should
follow that path with a full understanding of where that path may lead, and of where other paths that
have not been taken may also have led.
The idea that a single model can be used to answer all questions is challenged by the ideas and
mathematics presented in this document. A simple model may assist in the assessment of some
decision-critical risks. A more complex model may be warranted for the assessment of others. In still
other cases, it may be necessary to build a number of complementary models with different levels of
complexity, that can work with each other so that the contribution that the totality of these models makes to the decision-making process is greater than the sum of their individual contributions.
Such sophistication of model usage is comparatively rare, as the intellectual and software tools to
support such usage are generally unavailable. It is hoped that the present document can provide some
justification for more flexible and adventurous model usage than is generally undertaken at present,
and that the making of environmental decisions will benefit from this. If this is the case, software
support for facilitated implementation of principles and suggestions espoused herein will naturally
follow.
9. References
Asher, M.J., Croke, B.F.W., Jakeman, A.J. and Peeters, L.J.M., 2015. A review of surrogate models and
their application to groundwater modelling. Water Resour. Res. 51 (8), 5957-5973
Aster, R.C., Borchers B. and Thurber, C.H., 2013. Parameter Estimation and Inverse Problems. Second
edition. New York: Academic Press.
Bakker, M., Schaars, F., Hughes, J.D., Langevin, C.D. and Dausman, A.M., 2013. Documentation of the
Seawater Intrusion (SWI2) Package for MODFLOW. U.S. Geological Survey Techniques and Methods,
Book 6, Chap. A46, 60 p.
Beven, K., 2005. On the concept of model structural error. Water Sci.Technol., 52(6), 167–175.
Beven, K.J., Smith, P.J. and Freer, J.E., 2008. So why would a modeller choose to be incoherent? J. Hydrol., 354, 15–32, doi:10.1016/j.jhydrol.2008.02.007.
Burrows, W. and Doherty, J., 2014. Efficient calibration/uncertainty analysis using paired
complex/surrogate models. Groundwater, 53(4), 531-541.
Burrows, W. and Doherty, J., 2016. Gradient-based model calibration with proxy-model assistance.
Journal of Hydrology, 533, 114-127.
Campbell, E.P. and Bates, B.C., 2001. Regionalization of rainfall-runoff model parameters using Markov
chain Monte Carlo samples. Water Resour. Res., 37(3), 731-739, doi:10.1029/2000WR900349.
Campbell, E.P., Fox, D.R., and Bates, B.C., 1999. A Bayesian approach to parameter estimation and pooling in nonlinear flood event models. Water Resour. Res., 35(1), 211–220, doi:10.1029/1998WR900043.
Chen, Y. and Oliver, D.S., 2013. Levenberg-Marquardt forms of the iterative ensemble smoother for
efficient history matching and uncertainty quantification. Comput. Geosci. (17) 689-703.
Cooley, R.L., 2004. A theory for modelling ground-water flow in heterogeneous media. U.S. Geological
Survey Professional paper 1679, 220p.
Cooley, R.L. and Christensen, S., 2006. Bias and uncertainty in regression-calibrated models of
groundwater flow in heterogeneous media. Adv. Water. Resour., 29 (5),639-656.
Christensen, S., 2017. Methods to correct and compute confidence and prediction intervals of models
neglecting sub-parameterization heterogeneity – from the ideal to practice. Adv. Water. Resour., 100,
109-125.
Dausman, A.M., Doherty, J., Langevin, C.D., and Sukop, M.C., 2010. Quantifying data worth toward
reducing predictive uncertainty. Ground Water, 48 (5), 729-740.
Demissie, Y.K., Valocchi, A.J., Minsker, B.S., and Bailey, B.A., 2009. Integrating a calibrated
groundwater flow model with error-correcting data-driven models to improve predictions. J. Hydrol.,
364, 257-271.
Doherty, J., 2015. Calibration and uncertainty analysis for complex environmental models. Published
by Watermark Numerical Computing, Brisbane, Australia. 227pp. ISBN: 978-0-9943786-0-6.
Downloadable from www.pesthomepage.org.
Doherty, J., 2016. PEST: Model-Independent Parameter Estimation. Watermark Numerical Computing,
Brisbane, Australia.
Doherty, J. and Christensen, S., 2011. Use of paired simple and complex models in reducing predictive
bias and quantifying uncertainty. Water Resour. Res., doi:10.1029/2011WR010763.
Doherty, J. and Johnston, J.M., 2003. Methodologies for calibration and predictive analysis of a
watershed model, J. American Water Resources Association, 39(2):251-265.
Doherty, J. and Simmons, C.T., 2013. Groundwater modelling in decision support: reflections on a
unified conceptual framework. Hydrogeology Journal 21: 1531–1537
Doherty, J. and Vogwill, R., 2015. Models, Decision-Making and Science. In Solving the Groundwater
Challenges of the 21st Century. Vogwill, R. editor. CRC Press.
Doherty, J. and Welter, D., 2010, A short exploration of structural noise, Water Resour. Res., 46,
W05525, doi:10.1029/2009WR008377.
Draper, N.R. and Smith, H., 1998. Applied Regression Analysis. John Wiley & Sons, Inc. ISBN
9780471170822.
Fienen, M.N., Doherty, J., Hunt, R.J. and Reeves, H.W., 2010. Using Predictive Uncertainty Analysis to
Design Hydrologic Monitoring Networks: Example Applications from the Great Lakes Water Availability
Pilot Project. USGS Scientific Investigations Report 2010-5159.
Freeze, R.A., Massmann, J., Smith, L., Sperling, T. and James, B., 1990. Hydrogeological decision analysis: 1. A framework. Ground Water, 28(5), 738–766.
Gamerman, D. and Lopes, H.F., 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian
Inference. Chapman and Hall/CRC, 342pp.
Graybill, F.A., 1976. Theory and Applications of the Linear Model. Duxbury Press, North Scituate, Mass., 704p.
Higdon, D., Kennedy, M. , Cavendish, J. C., Cafeo, J. A. and Ryne, R. D., 2005. Combining field data and
computer simulations for calibration and prediction. SIAM J. Sci. Comput., 26(2), 448–466,
doi:10.1137/S1064827503426693.
Kennedy, M. C., and O’Hagan, A., 2001. Bayesian calibration of computer models. J. R. Stat. Soc., Ser.
B, 63(3), 425–450.
Kitanidis, P.K., 2015. Persistent questions of heterogeneity, uncertainty, and scale in subsurface flow
and transport. Water Resour. Res., 51, 5888-5904, doi:10.1002/2015WR017639.
Koch, K-R., 1999. Parameter Estimation and Hypothesis Testing in Linear Models. Springer. ISBN
9783540652571.
Kuczera, G., 1983. Improved parameter inference in catchment models: 1. Evaluating parameter
uncertainty. Water Resour. Res., 19(5), 1151–1172, doi:10.1029/WR019i005p01151.
Laloy, E. and Vrugt, J.A., 2012. High-dimensional posterior exploration of hydrologic models using
multiple-try DREAM(zs) and high-performance computing. Water Resources Research, 48 (1),
W01526, doi:10.1029/2011WR010608.
Menke, W., 2012. Geophysical Data Analysis: Discrete Inverse Theory. Academic Press.
Moore, C. and Doherty, J., 2005. The role of the calibration process in reducing model predictive error.
Water Resour. Res., 41(5), W05050.
Moore, C., Wöhling, T., and Doherty, J., 2010. Efficient regularization and uncertainty analysis using a
global optimization methodology. Water Resour. Res., 46, W08527,
doi:10.1029/2009WR008627.
Razavi, S., Tolson, B. and Burn, D., 2012. Review of surrogate modeling in water resources. Water
Resour. Res., 48, W07401, doi:10.1029/2011WR011527.
Tonkin, M.J. and Doherty, J., 2009. Calibration-constrained Monte Carlo analysis of highly
parameterized models using subspace techniques, Water Resour. Res., 45, W00B10,
doi:10.1029/2007WR006678.
Wallis, I., Moore, C., Post, V., Wolf, L., Martens, E. and Prommer, H., 2014. Using predictive uncertainty
analysis to optimise tracer test design and data acquisition. J. Hydrol., 515, 191-204.
Watson, T.A., Doherty, J.E. and Christensen, S., 2013. Parameter and predictive outcomes of model
simplification. Water Resour. Res., 49(7), 3952-3977, doi:10.1002/wrcr.20145.
White, J.T., Doherty, J.E. and Hughes, J.D., 2014. Quantifying the predictive consequences of model
error with linear subspace analysis. Water Resour. Res, 50 (2): 1152-1173. DOI:
10.1002/2013WR014767
White, J.T., Fienen, M.N., and Doherty, J.E., 2016. pyEMU: A Python framework for environmental
model uncertainty analysis. Environ Modell Softw, 85, 217-228.