Lectures 8/2 Uncertainty and sensitivity

Basic concepts in uncertainty and probability
• Measurable features of nature are nearly always random
variables – observations of the variable at any place, time and
scale will be random numbers drawn from a certain distribution
of more or less probable values
• The probability density function (pdf) describes the
relative probability that observations of a random variable will fall
within a certain range:
[Figure: pdf of a random variable x (relative probability against x). The area under the curve between x1 and x2 is the probability (P) of values in the range x1 to x2; the total area under the pdf = 1.]
• Many variables follow a distribution similar to the Gaussian or
normal distribution. For such variables:
• Values above or below the mean are equally likely (i.e. the
mean is equal to the median)
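• About 68% (roughly 70%) of observations will fall within one standard
deviation (σ) of the mean (µ), and about 95% within two standard
deviations:
[Figure: Gaussian pdf (relative probability against x) centred on µ, with σ, 2σ and 3σ marked; P ≈ 0.68 within µ ± σ and P ≈ 0.95 within µ ± 2σ.]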
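The probabilities for one and two standard deviations follow from the Gaussian error function. A minimal check using only Python's standard library (the function name `prob_within` is chosen here for illustration):

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| <= k * sigma) for a Gaussian random variable X."""
    return math.erf(k / math.sqrt(2.0))

p_one_sigma = prob_within(1.0)
p_two_sigma = prob_within(2.0)
print(f"within 1 sigma: {p_one_sigma:.3f}")
print(f"within 2 sigma: {p_two_sigma:.3f}")
```

The results are about 0.683 and 0.954.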
Sources of uncertainty
• Uncertainty in the output data from a model can propagate (feed
through) from several sources:
• Incomplete knowledge of the system being modelled
→ Formulations of processes and relationships (the
equations and algorithms in the model)
→ Uncertainty as to the “correct” parameter values
• Simplified representation of the system being modelled
→ “missing” processes and parameters
→ Single parameter value to represent a random value from
the real world
→ Scale mismatch between model, input data and output
data
• Variability or error in the input data
→ variability in time
→ variability in space
→ errors in measurement, interpolation, output from another
model providing input to this one etc.
Stochastic versus deterministic models
• Most real-world processes are stochastic, i.e. the outcome is a
random variable
• Some processes are deterministic, i.e. have one specific outcome
for each specific set of values of the input variables
• Mainly a question of scale – small-scale stochastic processes often
seem deterministic when observed at larger scales
[Figure: a stochastic process turns its input variables into a distribution of values of the output variable; a deterministic process turns its input variables into a unique value of the output variable.]
• For many stochastic processes, the “most likely” or average outcome
dominates the role of the process in the system
→ it is then often convenient to represent the average behaviour as
a deterministic process
• In some systems, rare or extreme outcomes of stochastic processes
may have a large impact
→ stochastic representation necessary to correctly capture system
dynamics
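The scale argument above can be illustrated with a small simulation: the aggregate of many small-scale random events varies much less than the aggregate of a few. This is only a sketch; the event probability, the sample sizes and all function names are assumptions made for illustration.

```python
import random
import statistics

random.seed(1)

def aggregate_outcome(n_events: int, p_success: float = 0.3) -> float:
    """Fraction of successes among n_events independent random events."""
    hits = sum(1 for _ in range(n_events) if random.random() < p_success)
    return hits / n_events

def spread(n_events: int, n_repeats: int = 100) -> float:
    """Standard deviation of the aggregate outcome over repeated observations."""
    outcomes = [aggregate_outcome(n_events) for _ in range(n_repeats)]
    return statistics.stdev(outcomes)

spread_small = spread(10)     # observed at a small scale: few events
spread_large = spread(5000)   # observed at a large scale: many events
print(f"spread with 10 events:   {spread_small:.3f}")
print(f"spread with 5000 events: {spread_large:.3f}")
```

With many events the aggregate outcome is close to its average on every repeat, which is why an average-based deterministic representation can be adequate at that scale.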
Sensitivity versus uncertainty
• An output variable from a model is sensitive to an input variable (or
parameter, or process representation) if variability in the input variable
leads to a relatively large amount of variability in the output variable
[Figure: two plots of Y against X. Low sensitivity of Y to X: a given change in X gives a small change in Y. High sensitivity of Y to X: the same change in X gives a large change in Y.]
• The greater the sensitivity of an output variable to an input variable (or
parameter), the more uncertainty in the input value propagates to the
output variable
[Figure: output variable against parameter or input variable for two models. For the same input uncertainty, Model 1 (high sensitivity) gives a larger output uncertainty than Model 2 (low sensitivity).]
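A minimal sketch of how the same input uncertainty propagates through a high-sensitivity and a low-sensitivity model. Both models are hypothetical linear functions, and the input distribution (mean 10, standard deviation 1) is an assumption chosen for illustration.

```python
import random
import statistics

random.seed(7)

# Hypothetical uncertain input: Gaussian, mean 10, standard deviation 1.
inputs = [random.gauss(10.0, 1.0) for _ in range(5000)]

def model_1(x: float) -> float:
    """Hypothetical high-sensitivity model (slope 5)."""
    return 5.0 * x

def model_2(x: float) -> float:
    """Hypothetical low-sensitivity model (slope 0.5)."""
    return 0.5 * x + 20.0

input_sd = statistics.stdev(inputs)
output_sd_1 = statistics.stdev([model_1(x) for x in inputs])
output_sd_2 = statistics.stdev([model_2(x) for x in inputs])

print(f"input sd:   {input_sd:.2f}")
print(f"model 1 sd: {output_sd_1:.2f}")
print(f"model 2 sd: {output_sd_2:.2f}")
```

For a linear model the output standard deviation is the input standard deviation multiplied by the absolute slope, so the slope acts as the sensitivity.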
Assessing uncertainty:
Sensitivity analysis by Monte Carlo method
• Stochastic technique for:
• assessing uncertainty in the output variables of a model
propagating from the input variables or parameters
• assessing sensitivity, taking account of joint variation in several
parameters
• Requires assumptions as to the probability distribution (pdf) of the input
variables or parameters.
[Figure: two example pdfs, uniform and Gaussian, each showing relative probability against the parameter value between min and max.]
• A large number of parameter sets are constructed by drawing values at
random from the pdf of each parameter
• Model is run once for each random parameter set
[Figure: values drawn from the pdf of each parameter are passed to the model, which yields a frequency distribution of the output variable.]
• Distribution of output values describes the uncertainty
• Sensitivity to the parameters when they vary jointly is assessed by
correlation analysis
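The sampling and model-run steps listed above can be sketched as follows. The model, the parameter names and their ranges are placeholders that do not come from the lecture, and uniform pdfs are assumed for both parameters. The correlation step is left out of this sketch.

```python
import math
import random

random.seed(42)

# Hypothetical parameter ranges (min, max); a uniform pdf is assumed for each.
PARAM_RANGES = {"a": (1.0, 3.0), "b": (0.1, 0.5)}

def toy_model(a: float, b: float, x: float = 2.0) -> float:
    """Placeholder model, not from the lecture: y = a * exp(-b * x)."""
    return a * math.exp(-b * x)

def draw_parameter_set() -> dict:
    """Draw one value per parameter at random from its assumed pdf."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

# Run the model once for each random parameter set.
n_runs = 5000
outputs = sorted(toy_model(**draw_parameter_set()) for _ in range(n_runs))

# The distribution of outputs describes the uncertainty; summarise it
# by the median and an empirical 95% interval.
lower = outputs[int(0.025 * n_runs)]
median = outputs[n_runs // 2]
upper = outputs[int(0.975 * n_runs) - 1]
print(f"median {median:.2f}, 95% interval {lower:.2f} to {upper:.2f}")
```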
Assessing uncertainty example:
Non-rectangular hyperbola model of leaf photosynthesis
(Cannell & Thornley 1998)
$$A = \frac{\phi \cdot APAR + A_{max} - \sqrt{(\phi \cdot APAR + A_{max})^2 - 4\theta \cdot \phi \cdot APAR \cdot A_{max}}}{2\theta}$$
Input variable:
APAR = absorbed PAR
Output variable:
A = gross leaf photosynthesis
Parameters:
ϕ = quantum efficiency (mol CO2 fixed / mol quanta absorbed)
Amax = photosynthetic capacity (photosynthetic rate at saturating light intensity)
θ = shape parameter
Parameter | Units | Min | Max | “Best guess”
ϕ | mol CO2 mol quanta−1 | 0.04 | 0.10 | 0.07
Amax | µmol CO2 m−2 s−1 | 5 | 25 | 15
θ | - | 0.50 | 0.95 | 0.75
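A minimal sketch of the non-rectangular hyperbola as a Python function, evaluated with the "best guess" parameter values at the two light levels used later in the lecture. The function and variable names are chosen here for illustration.

```python
import math

def leaf_photosynthesis(apar: float, phi: float, a_max: float, theta: float) -> float:
    """Gross leaf photosynthesis A (µmol CO2 m-2 s-1), non-rectangular hyperbola.

    apar  : absorbed PAR (µmol PAR m-2 s-1)
    phi   : quantum efficiency (mol CO2 per mol quanta)
    a_max : photosynthetic capacity (µmol CO2 m-2 s-1)
    theta : shape parameter (dimensionless, greater than 0)
    """
    total = phi * apar + a_max
    root = math.sqrt(total * total - 4.0 * theta * phi * apar * a_max)
    return (total - root) / (2.0 * theta)

# "Best guess" parameter values from the table.
a_low_light = leaf_photosynthesis(200.0, phi=0.07, a_max=15.0, theta=0.75)
a_high_light = leaf_photosynthesis(1500.0, phi=0.07, a_max=15.0, theta=0.75)
print(f"A at APAR = 200:  {a_low_light:.2f}")
print(f"A at APAR = 1500: {a_high_light:.2f}")
```

At APAR = 1500 the result is close to Amax, so the leaf is nearly light-saturated; at APAR = 200 it is well below Amax.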
Assumed pdf for each parameter
[Figure: Gaussian pdf (relative probability), shown for ϕ, with P ≈ 0.95 of values between the min (0.04) and the max (0.10); 2σ is marked.]
Parameter sensitivity in various light environments
[Figure: A (µmol CO2 m−2 s−1) against APAR (µmol PAR m−2 s−1) in three panels. Each panel varies one parameter and holds the other two fixed at ϕ = 0.07, θ = 0.75, Amax = 15.]
Parameter sensitivity
[Figure: parameter sensitivity at Amax = 15 and APAR = 1500; axes θ and ϕ (mol CO2 mol quanta−1).]
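One simple way to reproduce this kind of comparison is a one-at-a-time analysis: move each parameter from its min to its max while the other two stay at their best guess, and record the change in A. This is a sketch under that assumption, not necessarily the procedure behind the figures; all names are illustrative.

```python
import math

def leaf_photosynthesis(apar, phi, a_max, theta):
    """Non-rectangular hyperbola for gross leaf photosynthesis A."""
    total = phi * apar + a_max
    root = math.sqrt(total * total - 4.0 * theta * phi * apar * a_max)
    return (total - root) / (2.0 * theta)

# Best guesses and ranges taken from the parameter table.
BEST = {"phi": 0.07, "a_max": 15.0, "theta": 0.75}
RANGES = {"phi": (0.04, 0.10), "a_max": (5.0, 25.0), "theta": (0.50, 0.95)}

def one_at_a_time(apar: float) -> dict:
    """Change in A when one parameter goes from min to max, others at best guess."""
    changes = {}
    for name, (lo, hi) in RANGES.items():
        a_at_min = leaf_photosynthesis(apar, **{**BEST, name: lo})
        a_at_max = leaf_photosynthesis(apar, **{**BEST, name: hi})
        changes[name] = a_at_max - a_at_min
    return changes

low_light = one_at_a_time(200.0)
high_light = one_at_a_time(1500.0)
for label, changes in (("APAR = 200", low_light), ("APAR = 1500", high_light)):
    print(label, {name: round(delta, 2) for name, delta in changes.items()})
```

In this sketch ϕ matters far more in low light than in high light, where Amax accounts for most of the change in A.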
Monte Carlo analysis with APAR = 200 µmol PAR m−2 s−1
[Figure: Monte Carlo results with partial correlations (Part r) of 0.754, 0.480 and 0.306; 95% confidence interval marked.]
Monte Carlo analysis with APAR = 1500 µmol PAR m−2 s−1
[Figure: Monte Carlo results with partial correlations (Part r) of 0.997, 0.038 and 0.056; 95% confidence interval marked.]
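A hedged sketch of such a Monte Carlo analysis for the photosynthesis model. Several choices here are assumptions rather than the lecture's method: each Gaussian pdf is centred on the midpoint of the min-max range with σ = (max − min) / 4, draws outside the range are resampled, and plain Pearson correlation stands in for the partial correlation (Part r), so the numbers are not expected to match the slides.

```python
import math
import random

random.seed(0)

RANGES = {"phi": (0.04, 0.10), "a_max": (5.0, 25.0), "theta": (0.50, 0.95)}

def leaf_photosynthesis(apar, phi, a_max, theta):
    """Non-rectangular hyperbola for gross leaf photosynthesis A."""
    total = phi * apar + a_max
    root = math.sqrt(total * total - 4.0 * theta * phi * apar * a_max)
    return (total - root) / (2.0 * theta)

def draw(lo: float, hi: float) -> float:
    """Gaussian draw with min..max spanning +/- 2 sigma (assumed), kept within range."""
    mu, sigma = (lo + hi) / 2.0, (hi - lo) / 4.0
    while True:
        value = random.gauss(mu, sigma)
        if lo <= value <= hi:
            return value

def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def monte_carlo(apar: float, n_runs: int = 2000) -> dict:
    """Correlation between each parameter and A over random parameter sets."""
    sets = [{k: draw(lo, hi) for k, (lo, hi) in RANGES.items()} for _ in range(n_runs)]
    outputs = [leaf_photosynthesis(apar, **params) for params in sets]
    return {k: pearson([params[k] for params in sets], outputs) for k in RANGES}

r_low_light = monte_carlo(200.0)
r_high_light = monte_carlo(1500.0)
print("APAR = 200: ", {k: round(r, 2) for k, r in r_low_light.items()})
print("APAR = 1500:", {k: round(r, 2) for k, r in r_high_light.items()})
```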
Complexity v. uncertainty
• In general, the aim of building complex models is to generate more
realistic predictions (by incorporating more features from the real
system being modelled)
• i.e. to minimise error in the predictions relative to observations
BUT ...
• ... complex models tend to be more sensitive to uncertainty in the
input variables and parameters (there are more of them, and each adds
sensitivity)
[Figure: sensitivity and error (if parameters and input data are well known) plotted against increasing complexity.]
• Uncertainty is minimised (model utility maximised) when both error
and sensitivity are as low as possible*
*Snowling & Kramer 2001, Ecological Modelling 138: 17-30
[Figure: sensitivity against error for models 1 to 4. Model 3 is the most useful model.]
$$U = \sqrt{2} - \sqrt{\left(\frac{S}{S_{max}}\right)^2 + \left(\frac{E}{E_{max}}\right)^2}$$
where
U = utility = 1 / uncertainty
S / Smax = sensitivity relative to maximum possible
E / Emax = error relative to maximum possible
Quantifying complexity
• The complexity of a model is a function of the
• number of state variables (or output variables)
• number of processes flowing to or from the state/output variables
• number of parameters in the description of each process
• number of mathematical operations in the description of each
process
[Figure: Simple model (1 variable, 1 process): PAR, photosynthesis, NPP. Complex model (3 variables, 4 processes): PAR, light extinction, photosynthesis, respiration, NPP, allocation to roots, leaves and stems.]
Complexity index:
$$C = \sum_{j=1}^{N} \sum_{i=1}^{n_j} p_i \cdot r_i$$
where
N = number of variables
nj = number of processes flowing to or from variable j
pi = number of parameters for process i
ri = number of operations for process i
Example: Choosing a model of
gross ecosystem photosynthesis
[Figure: gross photosynthesis (µmol CO2 m−2 s−1, 0 to 14) against incoming PAR (µmol m−2 s−1, 0 to 800) for model 1 (bottom-up), model 2 (top-down) and model 3 (inversion).]
Model 1 (bottom-up)
gross ecosystem photosynthesis: A
gross leaf photosynthesis:
$$A = \frac{\phi \cdot APAR + A_{max} - \sqrt{(\phi \cdot APAR + A_{max})^2 - 4\theta \cdot \phi \cdot APAR \cdot A_{max}}}{2\theta}$$
light extinction within canopy:
$$APAR = PAR \cdot (1 - \alpha) \cdot (1 - e^{-k \cdot LAI})$$
incoming PAR: PAR
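The two steps of Model 1 can be chained in a short sketch: light extinction converts incoming PAR to APAR, and the non-rectangular hyperbola converts APAR to A. The values of α, k and LAI below are hypothetical and chosen only for illustration; the photosynthesis parameters are the best guesses given earlier.

```python
import math

def absorbed_par(par: float, alpha: float, k: float, lai: float) -> float:
    """Light extinction within canopy: APAR = PAR * (1 - alpha) * (1 - exp(-k * LAI))."""
    return par * (1.0 - alpha) * (1.0 - math.exp(-k * lai))

def leaf_photosynthesis(apar, phi=0.07, a_max=15.0, theta=0.75):
    """Non-rectangular hyperbola, with the best-guess parameter values as defaults."""
    total = phi * apar + a_max
    root = math.sqrt(total * total - 4.0 * theta * phi * apar * a_max)
    return (total - root) / (2.0 * theta)

# Hypothetical canopy values, not taken from the lecture.
ALPHA, K, LAI = 0.1, 0.5, 3.0

apar = absorbed_par(800.0, ALPHA, K, LAI)
gross_photosynthesis = leaf_photosynthesis(apar)
print(f"APAR = {apar:.1f}, A = {gross_photosynthesis:.2f}")
```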
Complexity
Variables | Processes | Parameters | Operations
A | gross leaf photosynthesis | ϕ, Amax, θ | 12
  | light extinction | PAR, (1−α), −k, LAI | 4
C = 12 × 3 + 4 × 4 = 52
Model 2 (top-down)
[Diagram: net ecosystem CO2 exchange; ecosystem photosynthesis; ecosystem respiration; incoming PAR (PAR).]
ecosystem photosynthesis:
$$A = b \ln(PAR) - a$$
Complexity
Variables | Processes | Parameters | Operations
A | ecosystem photosynthesis | b, PAR, a | 3
C = 3 × 3 = 9
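The complexity index for both models can be computed from the parameter and operation counts in the tables above. A minimal sketch, in which the data layout (a dict of variables, each with a list of per-process counts) is an assumption made for illustration:

```python
def complexity_index(variables: dict) -> int:
    """C = sum over variables j and their processes i of p_i * r_i.

    variables maps each variable name to a list of
    (number of parameters, number of operations) pairs, one per process.
    """
    return sum(p * r for processes in variables.values() for (p, r) in processes)

# Counts taken from the complexity tables for Model 1 and Model 2.
model_1 = {"A": [(3, 12), (4, 4)]}   # gross leaf photosynthesis, light extinction
model_2 = {"A": [(3, 3)]}            # ecosystem photosynthesis

c_model_1 = complexity_index(model_1)
c_model_2 = complexity_index(model_2)
print(c_model_1, c_model_2)
```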
Quantifying sensitivity
Model 1: S = 19 µmol CO2 m−2 s−1
Model 2: S = 15 µmol CO2 m−2 s−1
Quantifying error
• Compare model predictions to independent data set (not used for
calibration) and quantify error
• e.g. root mean square error:
$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (m_i - d_i)^2}{N}}$$
where
mi = model prediction of case i
di = observed case i
N = number of cases
[Figure: model predictions against observed values (µmol CO2 m−2 s−1), both axes from −5 to 20. Model 1: RMSE = 1.61. Model 2: RMSE = 2.81.]
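A minimal sketch of the root mean square error calculation. The predicted and observed values are hypothetical and only show how the function is called; they are not the lecture's data.

```python
import math

def rmse(predicted, observed) -> float:
    """Root mean square error between model predictions and observations."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and equally long")
    squared = sum((m - d) ** 2 for m, d in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))

# Hypothetical values (µmol CO2 m-2 s-1) from an independent data set.
predicted = [2.0, 5.5, 9.0, 12.5]
observed = [1.0, 6.0, 10.0, 12.0]
error = rmse(predicted, observed)
print(f"RMSE = {error:.3f}")
```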
Model | Complexity | Sensitivity S | Error E | Utility √2 − √((S / 20)² + (E / 3)²)
1 | 52 | 19 | 1.61 | 0.32
2 | 9 | 15 | 2.81 | 0.21
[Figure: sensitivity (0 to 20) against error (0 to 3), showing model 1 and model 2.]
Choose Model 1
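The utility values in the table can be reproduced with a short sketch, assuming the utility is U = √2 − √((S / Smax)² + (E / Emax)²) with Smax = 20 and Emax = 3. That form reproduces the tabulated 0.32 and 0.21. Function and variable names are illustrative.

```python
import math

def utility(sensitivity: float, error: float, s_max: float, e_max: float) -> float:
    """U = sqrt(2) - sqrt((S / Smax)^2 + (E / Emax)^2)."""
    return math.sqrt(2.0) - math.hypot(sensitivity / s_max, error / e_max)

# (sensitivity S, error E) for each model, from the summary table.
models = {"model 1": (19.0, 1.61), "model 2": (15.0, 2.81)}

utilities = {
    name: utility(s, e, s_max=20.0, e_max=3.0) for name, (s, e) in models.items()
}
best_model = max(utilities, key=utilities.get)

for name, u in utilities.items():
    print(f"{name}: U = {u:.2f}")
print("choose", best_model)
```

Model 1 has the higher sensitivity but a much lower error, which gives it the higher utility.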
Model-based uncertainty:
Future climate change according to different GCMs
[Figure from IPCC 2001. Axis labels: temperature change (°C); frequency (% of model runs); temperature (°C).]
Parameter-based uncertainty:
Future climate change according to different parameterisations
of the same GCM
[Figure: frequency (% of model runs) against temperature change (°C).]
Stainforth et al. 2005, Nature 433: 403-406.
Parameter-based uncertainty:
Correlations (RPCC) between output variables and parameters
from a Monte Carlo sensitivity analysis of an ecosystem model
Zaehle et al. 2005, Global Biogeochemical Cycles 19
Parameter-based uncertainty:
Geographical variation in parameter importance
[Figure panels: photosynthesis parameter; water balance parameter.]
Zaehle et al. 2005, Global Biogeochemical Cycles 19
Parameter-based uncertainty:
Consequences for future change in biosphere carbon storage
[Figure: cumulative change in soil carbon (PgC) and cumulative change in vegetation carbon (PgC).]
Zaehle et al. 2005, Global Biogeochemical Cycles 19
Input-based uncertainty:
Future climate change under different climate forcing scenarios
IPCC 2001
Input- and model-based uncertainty
Change in ecosystem C stocks for Europe
Morales et al., in press