Flood Risk Analysis considering 2 types of uncertainty

US Army Corps
of Engineers
Institute for Water Resources
Hydrologic Engineering Center
Beth Faber, PhD, PE
Hydrologic Engineering Center (HEC)
US Army Corps of Engineers
Flood Risk Management
• The US Army Corps of Engineers has a mission in
“flood control,” now flood risk management or reduction
– change in terminology either follows or attempts to instigate a
change in thinking
– traditionally, attempted to reduce flooding, but now focus on
reducing flood risk (risk = likelihood & consequence)
• For considering new projects, analysis is economic.
– invest Federal dollars to decrease flood damages by greater amount
• For existing projects, analysis is about safety, and
includes potential life loss
Risk Analysis
• In deterministic analysis, we look in detail at the
damages caused by some large flood events.
• To do a stochastic RISK ANALYSIS, we must also
consider the likelihood of those events happening.
• Corps guidelines for risk analysis require both
likelihood of occurrence of damaging flood events,
and uncertainty in our estimates and modeling.
– RISK & UNCERTAINTY ANALYSIS
– We describe both likelihood and uncertainty with
probability distributions.
Concepts
• What is the difference between natural variability
(aleatory) and knowledge uncertainty (epistemic)?
• Do the differences matter in decision making?
If yes…
• How can we separately consider them in risk
analysis computations and decision metrics?
– What happens if we do not consider them separately…
• How can we estimate and describe them?
• How can decisions best consider them?
Two Types of Uncertainty
• Natural Variability (Aleatory) = some variables are
random and unpredictable by nature, and their
values differ event to event or place to place
• Knowledge Uncertainty (Epistemic) = some variables
are more or less constant, but we do not know their
values accurately
• Both variability and uncertainty are described by
probability distributions
[Figure: probability distribution of a weir coefficient.]
Outline
• Expected Annual Damage (EAD)
– what it is, how it’s been computed
– other decision-making metrics
• Monte Carlo Simulation
– event sampling / modeling
• Natural Variability and Knowledge Uncertainty
– definitions, how they affect EAD
– how we sample them and when
– performance indices
– reducing compute time
Decision Making, Metrics
• For new investment, Cost / Benefit analysis is primary
– Cost is the expense of building and maintaining a
structure, or of changes to the damage potential of the
flood plain
– Benefit is the reduction in flood damages over time
• Spending Federal dollars, so an investment needs a
favorable expected cost/benefit ratio, and the portfolio
needs to be favorable on average, so we use mean values
• Local decision-making is different…
Expected Annual flood Damage (EAD)
• The metric we evaluate is an average annual damage
– “expected value” is the mean or average of a probability distribution
• Expected Annual Damage can be interpreted as the average
damage over a very long period of time. This annualized
value can be compared to an equivalent annual cost in
cost/benefit analysis.
• benefits of project = reduction in EAD
• The “old way” of computing EAD was to condense the flood
frequency information, the hydraulics, and the economics into
summary relationships, and combine them.
[Figure: Expected Annual Flood Damage. Flood damage plotted against years, with the expected annual damage marked.]
Summary Curves for Frequency, Hydraulics and Economics
• Hydrology: Flow-Frequency curve (peak flow, cfs, vs. exceedance probability)
• Hydraulics: Stage-Flow function
• Economics: Stage-Damage function
[Figure: probability distribution of annual peak flow, shown as a PDF (probability per unit vs. variable value, area = 1) and as a CDF (cumulative probability from 0 to 1 vs. variable value).]
Computing EAD with summary curves
• The mean (expected value) of annual flood damage is
computed by combining summary curves:
flow-frequency curve
stage-flow function
stage-damage function
to obtain a:
damage-frequency curve
• The mean of the damage-frequency function is the
expected value of annual damage, or EAD.
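The combination of summary curves described above can be sketched numerically. This is a minimal illustration, not the Corps' procedure: every number in the three curves is made up, the function names are hypothetical, and the integration only covers the probability range tabulated in the flow-frequency curve.

```python
import numpy as np

# Hypothetical summary curves (illustrative values only, not from the slides).
# Flow-frequency curve: annual exceedance probability -> peak flow (cfs)
exceed_prob = np.array([0.5, 0.2, 0.1, 0.04, 0.02, 0.01, 0.002])
peak_flow = np.array([5e3, 9e3, 1.2e4, 1.7e4, 2.1e4, 2.5e4, 3.6e4])

# Stage-flow function: peak flow (cfs) -> stage (ft)
rating_flow = np.array([0.0, 5e3, 1e4, 2e4, 4e4])
rating_stage = np.array([0.0, 8.0, 12.0, 17.0, 23.0])

# Stage-damage function: stage (ft) -> damage ($)
damage_stage = np.array([0.0, 10.0, 14.0, 18.0, 24.0])
damage_value = np.array([0.0, 0.0, 2e5, 1.5e6, 6e6])


def damage_frequency(flows):
    """Map flows to stages, then stages to damages, by linear interpolation."""
    stage = np.interp(flows, rating_flow, rating_stage)
    return np.interp(stage, damage_stage, damage_value)


def expected_annual_damage(prob, damage):
    """Area under the damage-frequency curve (trapezoid rule over probability)."""
    order = np.argsort(prob)
    p, d = prob[order], damage[order]
    return float(np.sum(0.5 * (d[1:] + d[:-1]) * np.diff(p)))


damage = damage_frequency(peak_flow)
ead = expected_annual_damage(exceed_prob, damage)
print(f"EAD over the tabulated probability range: ${ead:,.0f}")
```

The damage-frequency curve is simply the damages paired with the exceedance probabilities of the flows that caused them, so its mean is the area under that curve.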
Computing EAD with Summary Curves
[Figure: the flow-frequency curve (peak flow, cfs, vs. exceedance probability) captures year-to-year variability in flow. Passing a probability p through the stage-flow function (stage, ft) and the stage-damage function gives the damage-frequency curve, which captures year-to-year variability in damage.]
AREA under the damage-frequency curve = mean = expected annual damage, EAD:
$$E[D] = \sum_{i=1}^{N} D_i \, p_i$$
Other Decision Metrics
• Annual Exceedance Probability (of interest to local sponsor…)
– the likelihood of flood impact in any year
– we’re familiar with the National Flood Insurance Program’s
100-year (1% chance) “base flood”
– based primarily on natural variability
• ‘Assurance’ of 1% protection (used for levee certification)
– chance that AEP ≤ 1%, given uncertainty
– based primarily on knowledge uncertainty
• Dollars per statistical life saved, etc…
– currently, willingness to pay is $9.1 million per DOT, once
below tolerable risk guidelines
Outline
• Expected Annual Damage (EAD)
– how it’s been computed
– other decision-making metrics
• Monte Carlo Simulation
– Event sampling / modeling
• Natural Variability and Knowledge Uncertainty
– definitions, how they affect EAD
– how we sample them and when
– performance indices
– reducing compute time
Monte Carlo Simulation
• We’re interested in variable Y=damage, which is a
complex function of X=flow, i.e., Y = g(X)
• X is a random variable, described by a probability
distribution (PDF: probability per unit X vs. variable X)
• So, Y is also a random variable
with a probability distribution
• How do we determine the
distribution of Y=damage?
• If the distribution of X is known, we can develop the distribution of
Y analytically, or we can use Monte Carlo Simulation
Monte Carlo Analysis
• Replace the probability distribution of variable
X=flow with a very large sample of values
• Then, for each member of the sample, compute Y=g(X)
• This process creates a large sample of the variable Y (damage)
[Figure: histogram of the sample (relative frequency vs. value of X) overlaid on the PDF of X.]
Monte Carlo Analysis
• From the generated sample of Y (damage), infer
its probability distribution with statistical analysis
• In the case of Y=damage, we have been mostly interested in
the mean of the distribution, or EAD
[Figure: distribution of Y (relative frequency vs. value of Y).]
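A minimal sketch of the steps above, assuming a lognormal distribution for X=flow and a made-up piecewise-linear damage function g. Both are illustrative assumptions, not the distributions or models used by HEC.

```python
import numpy as np

rng = np.random.default_rng(seed=1)


def g(flow_cfs):
    """Hypothetical damage function: no damage below 20,000 cfs, then linear ($)."""
    return np.maximum(flow_cfs - 20_000.0, 0.0) * 50.0


# Step 1: replace the distribution of X with a very large sample.
# The lognormal shape and its parameters are assumptions for illustration.
flows = rng.lognormal(mean=np.log(8_000.0), sigma=0.6, size=100_000)

# Step 2: compute Y = g(X) for each member of the sample.
damages = g(flows)

# Step 3: infer statistics of Y from its sample; the mean is the EAD.
ead = damages.mean()
print(f"EAD estimate: ${ead:,.0f}")
```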
Why Monte Carlo?
• One value of Monte Carlo simulation is the ability
to use complex deterministic models
• It is easier to do math on a member of a sample
than on the probability distribution itself
• Can operate on (or evaluate functions of)
members of the sample, then recombine the
resulting sample into a new distribution
Slightly More Complex…
• When variable Y is a function of 2 random variables,
X and Z …
• Create a sample of variable X
• Create a sample of variable Z
(if X and Z are correlated, need a correlated
sample)
• Compute Y = h(X, Z) for every pair of X and Z
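One common way to build a correlated sample is to draw correlated standard normals and transform each column to its own variable. The sketch below assumes that approach; the correlation of 0.7, the meaning of Z (a starting reservoir elevation), and the function h are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 50_000
rho = 0.7  # assumed correlation between the underlying normal variables

# Correlated pairs of standard normal values.
normals = rng.multivariate_normal(mean=[0.0, 0.0],
                                  cov=[[1.0, rho], [rho, 1.0]],
                                  size=n)

# Transform each column to its own (assumed) distribution.
x = np.exp(np.log(8_000.0) + 0.6 * normals[:, 0])  # X: peak flow (cfs), lognormal
z = 10.0 + 2.0 * normals[:, 1]                     # Z: starting elevation (ft), normal


def h(flow_cfs, elevation_ft):
    """Hypothetical damage ($): a higher starting elevation lowers the safe flow."""
    safe_flow = 25_000.0 - 500.0 * elevation_ft
    return np.maximum(flow_cfs - safe_flow, 0.0) * 50.0


# Y is computed pair by pair, so the correlation carries into the result.
y = h(x, z)
print(f"mean of Y: ${y.mean():,.0f}")
```

Sampling X and Z independently here would pair large flows with low starting elevations as often as with high ones, which would change the distribution of Y.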
Generating the Sample
• How do we generate a sample of values from a
particular probability distribution?
• First, switch from a PDF to a CDF, i.e., cumulative
probability…
[Figure: PDF f(x) (probability per unit X vs. variable X) and CDF F(x) = P[X<x] (probability of being less than x, from 0 to 1, vs. variable X).]
Generating the Sample
• Generate pseudo-random values, uniform U_i ~ U[0,1]
– “random number generators” usually produce U[0,1]
• Use U_i as the cumulative probability, and compute
x_i as the inverse of the CDF of X
$$U_i = F(x_i), \qquad x_i = F^{-1}(U_i)$$
• A frequency analysis on the sample x_i provides
the original probability distribution, i.e., P[X ≤ x]
[Figure: CDF F(X), from 0 to 1, vs. X, with a value U_i on the vertical axis mapped through the curve to x_i on the horizontal axis.]
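The inverse-CDF step can be sketched with any distribution whose CDF inverts in closed form. The example below assumes a Gumbel distribution with hypothetical parameters, chosen only because its inverse is simple to write down.

```python
import numpy as np

LOC, SCALE = 8_000.0, 3_000.0  # hypothetical Gumbel parameters (cfs)


def cdf(x):
    """Gumbel CDF: F(x) = exp(-exp(-(x - LOC) / SCALE))."""
    return np.exp(-np.exp(-(x - LOC) / SCALE))


def inverse_cdf(u):
    """Inverse of the Gumbel CDF: x = LOC - SCALE * ln(-ln(u))."""
    return LOC - SCALE * np.log(-np.log(u))


rng = np.random.default_rng(seed=3)
u = rng.uniform(0.0, 1.0, size=10_000)  # U_i ~ U[0,1]
x = inverse_cdf(u)                      # x_i = F^{-1}(U_i)

# Frequency analysis on the sample should recover the original CDF.
empirical = np.mean(x <= 10_000.0)
print(f"empirical P[X <= 10000] = {empirical:.3f}, F(10000) = {cdf(10_000.0):.3f}")
```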
How large a sample?
[Figure: histograms of samples of size 100, 1000, and 10000.]
• the input sample is large enough when its statistics
reproduce the parameters of the distribution
• the output sample is large enough when the statistics of
interest stabilize
Computing EAD with Summary Curves
[Figure: the flow-frequency curve (peak flow, cfs, vs. exceedance probability) captures year-to-year variability in flow. Passing it through stage (ft) to damage gives the damage-frequency curve, which captures year-to-year variability in damage. AREA under the damage-frequency curve = expected annual damage, EAD.]
Monte Carlo Simulation for Flood Risk
• The peak flow frequency curve is the primary source
of natural variability in annual flood damages
• In Monte Carlo Simulation (Analysis), we replace a
probability distribution with a very large sample of
values from that distribution
– we can then deterministically model each member of the
sample to compute damage
– creates a large sample of damages
• Can consider many distributions at the same time, but
we’ll start by looking at just peak flow variability
Computing EAD by “Event Sampling”
Simple Monte Carlo Simulation
[Figure: replace the flow-frequency curve (peak discharge, cfs, vs. exceedance probability) with a sample. Each event (sample member) is converted to stage (ft) and then to damage, so we end with a sample of damages.]
$$\mathrm{EAD} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Damage}(i)$$
Replacing Flow-Frequency Curve
with a Large Sample of Peak Flows
1000 events (annual peak flows, or annual max X-duration flows)
[Figure: the sample in PDF form (histogram of peak annual flow, cfs, vs. count) and in CDF form (frequency curve of peak annual flow, cfs, vs. exceedance probability). Each point is an “event”; events are ranked and plotted against relative frequency of exceedance (plotting positions).]
From each flow, compute a damage,
create a sample of damages
1000 event damages: the average or mean of these is the EAD
[Figure: the damage sample in PDF form (histogram of event damage, 1000$, vs. count) and in CDF form (frequency curve of annual flood event damage, 1000$, vs. exceedance probability). Each point is an “event”.]
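Ranking a sample and assigning plotting positions, as in the CDF-form plots above, can be sketched as follows. The slides do not say which plotting-position formula is used; the Weibull formula rank / (n + 1) is assumed here, and the function name is hypothetical.

```python
import numpy as np


def empirical_frequency_curve(values):
    """Rank a sample from largest to smallest and attach exceedance probabilities.

    Uses the Weibull plotting position rank / (n + 1), an assumed choice.
    """
    ranked = np.sort(np.asarray(values, dtype=float))[::-1]  # largest first
    rank = np.arange(1, ranked.size + 1)
    exceed_prob = rank / (ranked.size + 1)
    return exceed_prob, ranked


# Works the same for a sample of flows or a sample of damages.
prob, ranked = empirical_frequency_curve([3.0, 1.0, 2.0])
print(prob, ranked)
```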
How many is enough? Convergence
• We continue creating and evaluating new events until
the statistic of interest stabilizes
– average damage
– 1% exceedance damage, …
• This is convergence
[Figure: PDF (probability per unit X vs. variable X) with the average (avg) and the 1% exceedance value marked.]
[Figure: average damage and 1% exceedance damage plotted against the number of events evaluated, first for 100 events and then for 1000 events.]
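A convergence check like the one plotted above can be sketched by tracking the statistics as events accumulate. The stopping rule below (the last 200 values staying within 1% of the final value) is an illustrative assumption, as are the flow distribution and damage function.

```python
import numpy as np


def running_statistics(damages, start=10):
    """Running mean and running 1% exceedance damage (99th percentile)."""
    damages = np.asarray(damages, dtype=float)
    counts = np.arange(start, damages.size + 1)
    running_mean = np.cumsum(damages)[start - 1:] / counts
    running_q99 = np.array([np.quantile(damages[:n], 0.99) for n in counts])
    return counts, running_mean, running_q99


def has_converged(series, window=200, rel_tol=0.01):
    """True if the last `window` values stay within rel_tol of the final value."""
    tail = np.asarray(series[-window:], dtype=float)
    final = tail[-1]
    if final == 0.0:
        return bool(np.all(tail == 0.0))
    return bool(np.max(np.abs(tail - final)) <= rel_tol * abs(final))


# Hypothetical event damages: lognormal flows and a linear damage function.
rng = np.random.default_rng(seed=6)
flows = rng.lognormal(mean=np.log(8_000.0), sigma=0.6, size=2_000)
damages = np.maximum(flows - 20_000.0, 0.0) * 50.0

counts, running_mean, running_q99 = running_statistics(damages)
print("average damage converged:", has_converged(running_mean))
print("1% exceedance damage converged:", has_converged(running_q99))
```

Tail statistics such as the 1% exceedance damage depend on only a few events in the sample, so they usually need more events to stabilize than the average does.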
Using models rather than summary curves
Simple Monte Carlo Simulation
[Figure: replace the flow-frequency curve (peak discharge, cfs, vs. exceedance probability) with a sample. Each event (sample member) is run through the models, so we end with a sample of damages.]
$$\mathrm{EAD} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Damage}(i)$$
Outline
• Expected Annual Damage (EAD)
– how it’s been computed
– other decision-making metrics
• Monte Carlo Simulation
– Event sampling / modeling
• Natural Variability and Knowledge Uncertainty
– definitions, how they affect EAD
– how we sample them and when
– performance indices
– reducing compute time
Variability and Uncertainty
• Natural Variability (Aleatory) = some variables are
random and unpredictable by nature, and their values
differ event to event or place to place
• Knowledge Uncertainty (Epistemic) = some variables
are more or less constant, but we do not know those
values accurately
• Both variability and uncertainty
are described by
probability distributions
[Figure: PDF of a weir coefficient.]
How do these affect EAD?
• We estimate average damage (EAD) because the
natural variability in flooding prevents us from knowing
what future damages will be
• Natural Variability: All random variables that vary event-to-event
or vary spatially are captured within the distribution of damage,
and so in the mean damage
– flood magnitude, forecasts, channel roughness
[Figure: PDF of annual damage, with mean = EAD.]
How do these affect EAD?
• Knowledge Uncertainty: Watershed parameters that we
do not know exactly introduce uncertainty into the
damage distribution and so into the mean damage
– flood likelihood, hydraulic coefficients, channel capacities
• This uncertainty creates a
probability distribution of EAD
[Figure: distribution of annual damage, with the resulting EAD distribution marked.]
Including Uncertainty in the
EAD computation
• So far, the Monte Carlo simulation we looked at
sampled only natural variability from the flood
frequency relationship
• We need to include uncertainty in the sampling
and modeling to include it in the evaluation of EAD
• In the flood frequency relationship, the uncertainty
stems from sampling error, which is the error from
estimating probabilities from a small sample
Computing EAD with Summary Curves
[Figure: summary curves (peak flow, cfs, and stage, ft, vs. exceedance probability) with no uncertainty considered; they only capture natural variability. AREA under the damage-frequency curve = expected annual damage, EAD.]
How do we capture knowledge
uncertainty in MC event modeling?
Nested Monte Carlo:
A. Sample instances of natural variabilities as flood events,
with enough events to capture the distribution of damage.
B. Sample instances of knowledge uncertainties in model
parameters for each realization of the damage distribution.
[Figure: nested loops. Inner loop A varies natural variability and computes EAD. Outer loop B varies knowledge uncertainty and computes the EAD distribution. One pass of outer loop B = a realization.]
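The nested loops can be sketched as below. The uncertain parameters (the location of a lognormal frequency curve and a channel capacity), their distributions, and the damage function are all hypothetical stand-ins for the real models.

```python
import numpy as np


def damage(flow_cfs, capacity_cfs):
    """Hypothetical damage ($): linear in the flow above channel capacity."""
    return np.maximum(flow_cfs - capacity_cfs, 0.0) * 50.0


def nested_monte_carlo(rng, n_realizations=100, n_events=5_000):
    """Return one EAD estimate per realization (outer loop B)."""
    ead = np.empty(n_realizations)
    for b in range(n_realizations):
        # Outer loop B: sample knowledge uncertainty once per realization.
        log_median_flow = rng.normal(np.log(8_000.0), 0.10)  # frequency-curve parameter
        capacity = rng.normal(20_000.0, 1_500.0)             # channel capacity (cfs)

        # Inner loop A: sample natural variability as flood events
        # (vectorized here instead of an explicit loop over events).
        flows = rng.lognormal(mean=log_median_flow, sigma=0.6, size=n_events)
        ead[b] = damage(flows, capacity).mean()
    return ead


rng = np.random.default_rng(seed=4)
ead_sample = nested_monte_carlo(rng)
print(f"mean EAD: ${ead_sample.mean():,.0f}")
print(f"EAD 10th to 90th percentile: ${np.percentile(ead_sample, 10):,.0f}"
      f" to ${np.percentile(ead_sample, 90):,.0f}")
```

Holding the uncertain parameters fixed within a realization is what keeps the two types of uncertainty separate: the spread inside one realization is variability, and the spread across realizations is knowledge uncertainty.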
Sampling Variability and Uncertainty
Nested Monte Carlo Simulation
• sample uncertain model parameters: sample a new frequency curve (uncertainty)
• sample variabilities: then sample events (variability); each event is one sample member
• one realization still ends with a sample of damages
• For each realization, get an EAD estimate:
$$\mathrm{EAD} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Damage}(i)$$
[Figure: peak flow (cfs) vs. exceedance probability, with a newly sampled frequency curve and the events sampled from it.]
After repeating for many realizations:
• the sample of annual damage from one realization (spans natural variability)
provides 1 estimate of EAD (mean = average = EAD)
• the sample of mean damage (EAD) from all realizations (spans knowledge
uncertainty) provides the distribution of EAD
[Figure: histogram of annual damages from one realization (relative frequency vs. annual damage, $), and histogram of the EAD estimates from 100 realizations (relative frequency vs. average damage, EAD, $).]
Model Parameters
Variability and Uncertainty
Hydrologic Frequency
Natural Variability
Annual Maximum Flow (flood frequency curve)
Snowmelt forecasting
Knowledge Uncertainty
Flood frequency curve parameters
Model Parameters
Variability and Uncertainty
Reservoir Modeling
Natural Variability
Starting Storage/Elevation
Demands (water, power)
Current Power Capacity
(outages)
Sedimentation changes
Knowledge Uncertainty
Stream routing coefficients
Reservoir physical data:
storage/elevation,
release capacity, etc
Model Parameters
Variability and Uncertainty
Channel Backward Routing
Natural Variability
Manning’s n
Bridge Debris
Ice thickness
Dam/levee breaching parameters
Knowledge Uncertainty
Weir Coefficients
Gate Coefficients
Bridge/culvert coefficients
Manning’s n
Contraction/Expansion coefficients
Boundary Conditions
Terrain Data
Model Parameters
Variability and Uncertainty
Floodplain Damage
Natural Variability
Structure value
Content Value
Car Value
Other Value
Depth/Damage functions
Fatality Rates
Mobilization Curve
Knowledge Uncertainty
Foundation Height
Ground Elevation
Note, many of these are
captured as spatial variability
rather than uncertainty
[Figure: EAD distribution from 100 realizations (relative frequency vs. average damage, EAD, $), with the average EAD marked.]
How many realizations?
• Optimally, until convergence…
• The number of realizations needed to define the
resulting distribution depends on its use
[Figure: EAD distribution from 10,000 realizations (relative frequency vs. average damage, EAD, $), with the average EAD marked.]
Using the Sample of EAD
[Figure: histogram of the EAD sample and the estimated probability distribution (probability per $ vs. expected annual damage, $), marking the mean EAD, EAD10 (P=10% that EAD < EAD10), and EAD90 (P=10% that EAD > EAD90).]
What can I do with this?
If also have a probability distribution of cost…
(because cost is also uncertain)
...can consider cost and benefit to compute:
– probability B/C ratio is less than 1
– probability Net Benefit is less than 0
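Given paired samples of benefit and cost, both probabilities are simple counts. In this sketch the benefit is taken as the reduction in EAD, and the EAD and cost distributions are invented for illustration; in practice the EAD samples would come from the realizations of a nested Monte Carlo run.

```python
import numpy as np

rng = np.random.default_rng(seed=5)
n = 10_000  # number of realizations

# Hypothetical samples, one value per realization ($ per year).
ead_without_project = rng.lognormal(np.log(1.2e6), 0.4, n)
ead_with_project = rng.lognormal(np.log(0.4e6), 0.4, n)
annual_cost = rng.lognormal(np.log(0.6e6), 0.15, n)  # cost is also uncertain

# Benefit of the project = reduction in EAD.
benefit = ead_without_project - ead_with_project
net_benefit = benefit - annual_cost
bc_ratio = benefit / annual_cost

p_net_benefit_negative = float(np.mean(net_benefit < 0.0))
p_bc_below_one = float(np.mean(bc_ratio < 1.0))

print(f"mean net benefit: ${net_benefit.mean():,.0f}")
print(f"P(net benefit < 0) = {p_net_benefit_negative:.2%}")
print(f"P(B/C < 1)         = {p_bc_below_one:.2%}")
```

With a strictly positive cost the two probabilities coincide, since a B/C ratio below 1 and a net benefit below 0 describe the same realizations.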
Net Benefit Distributions for 2 Projects
[Figure: probability density vs. net benefit (million $, axis from -10 to 15) for two projects.]
• Project 1: Mean NB = $3 million, P (NB<0) = 27%
• Project 2: Mean NB = $1 million, P (NB<0) = 9%
Other metrics – AEP, Assurance, LTEP
AEP = Annual Exceedance Probability (variability)
= percent of events that exceed a certain stage
= percent of events that get a given structure wet
Like EAD, get 1 estimate of AEP in every realization
After all realizations, have an AEP distribution (the distribution of uncertainty in AEP)
Assurance = likelihood that AEP is less than a specified value (uncertainty)
[Figure: AEP of the stage of interest (61') from 100 realizations (relative frequency vs. AEP). Assurance: 78% chance that AEP ≤ 1%.]
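Both metrics reduce to counting, as the sketch below shows: AEP counts events within one realization, and assurance counts realizations. The function names and the stage distributions in the demonstration are hypothetical; only the 61 ft stage of interest comes from the slide.

```python
import numpy as np


def annual_exceedance_probability(event_stages, stage_of_interest):
    """Fraction of events in one realization that exceed the stage (variability)."""
    return float(np.mean(np.asarray(event_stages, dtype=float) > stage_of_interest))


def assurance(aep_estimates, target_aep=0.01):
    """Fraction of realizations whose AEP is at or below the target (uncertainty)."""
    return float(np.mean(np.asarray(aep_estimates, dtype=float) <= target_aep))


# Demonstration with made-up stage distributions.
rng = np.random.default_rng(seed=7)
aep_estimates = []
for _ in range(100):                                   # realizations
    mean_stage = rng.normal(55.0, 0.5)                 # knowledge uncertainty (ft)
    stages = rng.normal(mean_stage, 2.5, size=5_000)   # natural variability (ft)
    aep_estimates.append(annual_exceedance_probability(stages, 61.0))

print(f"assurance that AEP <= 1%: {assurance(aep_estimates, 0.01):.0%}")
```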
Other metrics – AEP, Assurance, LTEP
Long-term Exceedance Probability (LTEP)
(formerly called “long-term RISK”)
= the likelihood of exceeding a stage or getting wet
at least once in N years (estimate with binomial distribution)
$$\mathrm{LTEP} = 1 - (1 - \mathrm{AEP})^{N}$$
The chance of exceeding the 1% event (100-yr) at least
once in 30 years is:
$$\mathrm{LTEP} = 1 - (1 - 0.01)^{30} = 26\%$$
The chance of exceeding the 5% event (20-yr) at least
once in 30 years is:
$$\mathrm{LTEP} = 1 - (1 - 0.05)^{30} = 79\%$$
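The two worked values can be checked with a few lines, using AEP = 0.01 for the 1% event and AEP = 0.05 for the 5% event. The function name is hypothetical, and the formula assumes independent years with a constant AEP.

```python
def long_term_exceedance_probability(aep, years):
    """Chance of at least one exceedance in `years` independent years."""
    return 1.0 - (1.0 - aep) ** years


print(round(long_term_exceedance_probability(0.01, 30), 2))  # 0.26
print(round(long_term_exceedance_probability(0.05, 30), 2))  # 0.79
```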
Computational Effort
• There are at least two methods planned for managing the
computation effort of running the system models for 100s
or 1,000s of events.
1. Distributed Computing
– Different instances of the life cycle or realization can be run on
different computers, and results returned
– 100 computers reduce run time to about 1%
2. Intelligent / Importance Sampling (selective compute)
– Not all events can cause flooding. Events with no chance of
causing damage are not computed
– Might run only 2% to 3% of events.
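The selective-compute idea can be sketched as a screening step in front of the expensive model. This is simple screening under an assumed no-damage flow threshold, not a full importance-sampling scheme with reweighting, and the function and parameter names are hypothetical.

```python
import numpy as np


def selective_ead(flows, damage_model, no_damage_flow):
    """Run the expensive model only for events that could cause damage.

    Skipped events are assigned zero damage but still count in the average,
    so the EAD is taken over all events, not only the modeled ones.
    """
    flows = np.asarray(flows, dtype=float)
    needs_model = flows > no_damage_flow
    damages = np.zeros_like(flows)
    damages[needs_model] = [damage_model(q) for q in flows[needs_model]]
    return float(damages.mean()), float(needs_model.mean())


# Tiny example: only the 10-unit flow exceeds the threshold of 5.
ead, fraction_run = selective_ead([1.0, 2.0, 3.0, 10.0],
                                  damage_model=lambda q: 100.0,
                                  no_damage_flow=5.0)
print(ead, fraction_run)
```

The threshold must be conservative: any event that is skipped but could have caused damage biases the EAD low.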
Summary
• To evaluate flood damage for a project life, we must consider
Natural Variabilities in flooding of the watershed and
Knowledge Uncertainties in our modeling of flooding of the
watershed.
• Both variabilities and uncertainties are described with
probability distributions
• Monte Carlo analysis lets us replace probability distributions
with large samples from those distributions – event sampling
• Variability is captured in EAD and AEP, Uncertainty is
captured in the distribution of EAD and in Assurance
Questions
• Do the differences between natural variability and
knowledge uncertainty matter in decision making?
If yes…
• How can we separately consider them in risk analysis
computations?
– What happens if we do not consider them separately?
• How can we estimate and describe them?
• How can decisions best consider them?