International Conference on Statistical and

18. Total probability methods for
problems in flood frequency estimation
S. Rocky Durrans
Department of Civil and Environmental Engineering
The University of Alabama, Tuscaloosa, Alabama
U.S.A.
Abstract
The theorem of total probability, when applied in concert with deterministic methods of flood routing,
yields an integrated deterministic-stochastic tool which may be employed to solve some difficult
problems in flood frequency estimation. Most notably, the integrated modeling approach can be
employed for flood frequency estimation at regulated sites, and it can also be employed to study the
suitabilities of implied structures in schemes that have been proposed for regional flood frequency
analysis. Because the method involves a deterministic component, it can also be used in a predictive,
and even prescriptive, fashion. It is the purpose of this paper to present the integrated modeling
approach, and to illustrate its application. Needs and opportunities for additional research are also
identified.
18.1 Introduction
18.1.1 General
It is a great pleasure and honor for me to have had the opportunity to visit Paris for the
purpose of attending and presenting my work at the International Conference on Statistical and
Bayesian Methods in Hydrology, which was held in honor of Professor Jacques Bernier. I first
met Professor Bernier at the conference which was held at the University of Waterloo in
Ontario, Canada, in the summer of 1993, and I have come to know several of his colleagues and
friends (most notably the group at INRS-Eau in Québec City) rather well since that time.
Unfortunately, however, I have not had the opportunity to collaborate directly with
Prof. Bernier himself. I am certainly aware of his significant contributions to statistical
hydrology, and especially to Bayesian methods, and I am greatly impressed by both their
quality and by their depth and breadth. It is my hope that the work presented in the following
pages will be interpreted as being logically connected to Prof. Bernier's efforts (through the
well-known and fundamental connection between the theorem of total probability and Bayes'
theorem). It is the intent of this presentation to establish and lay down the framework for an
integrated deterministic-stochastic approach to flood frequency analysis, with the hope that the
inclusion of a deterministic component in a problem which has historically been treated usually
in only a statistical way will increase the credibility of flood quantile estimates deriving
therefrom.
Figure 18.1 Historical trend in annual U.S. losses (1983 dollars) due to flood damage
(horizontal axis: year, 1900 to 1980). Source: National Weather Service
The problem of estimation of the magnitudes and corresponding probabilities of floods
is one of considerable importance. Despite efforts to control the effects of floods by means
of both structural and nonstructural measures, statistics on their effects in the U.S. demonstrate
that they are exacting a continually increasing economic and financial drain on society. Figure
18.1 shows the historical trend in annual U.S. losses due to flood damage, and indicates that
over the time period from 1900 to 1980 annual damages have increased by over an order of
magnitude. Hoyt and Langbein (1955) have suggested that the lion's share of the increase is
due to increased property values, as well as the continued development of flood-prone lands.
Improved flood loss reporting, as well as possible climatic changes, may have some effect as
well. The cost of flooding in terms of loss of life is also of significant concern. A comparison
of the population-adjusted death rate due to flooding with those caused by three other natural
hazards (lightning, tornadoes, and tropical cyclones) shows that little real progress has been
made. Figure 18.2 demonstrates that death rates in the U.S. due to the three compared hazards
have either dropped dramatically or remained nearly constant over time, whereas that due to
flooding appears to be slightly increasing. Other effects of flooding relate to riparian
ecosystems and geological processes. The nutrients in sediments which are naturally deposited
by floods are essential for biological production and habitat regeneration in the riparian zone.
The selective degradation and aggradation of river reaches has far-reaching effects in terms of
changes to landforms and river meander patterns.
Figure 18.2 Population-adjusted death rates in the U.S. from four storm hazards
(horizontal axis: five-year periods from 1941-1945 to 1976-1980). Source: National Climatic
Data Center
As already noted, the objective of this paper is to present an integrated deterministic-stochastic
approach which has been devised as a consistent framework for approaching various
problems in flood frequency analysis. A key component of the approach consists of an
application of the theorem of total probability, which is a cornerstone of Bayesian theory. The
motivation behind the development of this framework is that of providing a consistent and
physically meaningful basis for flood frequency estimation.
18.1.2 Total probability applications
The theorem of total probability is an elementary result of an application of the classical
axioms of probability (Stuart and Ord, 1987) to a set of mutually exclusive and collectively
exhaustive events. Despite the intrinsic merit of the theorem, however, some would argue that
little has come of it. This is evidently due to the difficulty, in some applications, of evaluation
of both the mixture coefficients (the probabilities of the collectively exhaustive and mutually
exclusive events) and the conditional probability distributions in an objective and meaningful
way.
Within the field of flood frequency analysis, there have been several types of applications
of the total probability idea. By far the most common of these is that in which flood events
are viewed as arising from differing causal mechanisms. That is, flood events are viewed as
being caused by rainfall events, or by snowmelt, or by other similar effects. Mixture models
are built as a weighted combination of probability distributions, each of which is descriptive
of flood events arising from a single causative mechanism. Applications of this kind are
widespread; examples are given by Hazen (1930), Singh and Sinclair (1972), Waylen
and Woo (1982), Jarrett and Costa (1982), Hirschboeck (1985, 1986), and Diehl and Potter
(1986).
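The weighted combination described above can be sketched in a few lines. This is only an illustration, not a model taken from any of the cited studies: the choice of Gumbel component distributions, the two mechanisms, and every number below are assumptions made for the example.

```python
import math


def gumbel_cdf(q, loc, scale):
    """Gumbel non-exceedance probability F(q)."""
    return math.exp(-math.exp(-(q - loc) / scale))


def mixture_cdf(q, components):
    """Total probability: F(q) = sum_i p_i * F_i(q).

    components is a list of (p_i, loc_i, scale_i) tuples; the mixture
    coefficients p_i must sum to one.
    """
    total_weight = sum(p for p, _, _ in components)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("mixture coefficients must sum to 1")
    return sum(p * gumbel_cdf(q, loc, scale) for p, loc, scale in components)


# Hypothetical mechanisms: 70% of annual floods caused by rainfall,
# 30% by snowmelt (all parameters are assumed, in arbitrary discharge units).
components = [(0.7, 100.0, 30.0), (0.3, 180.0, 60.0)]
q = 300.0
exceedance = 1.0 - mixture_cdf(q, components)
print(f"P(Q > {q}) = {exceedance:.4f}")
```

The practical difficulty noted earlier shows up directly in the inputs: both the mixture coefficients and the component parameters have to be supplied from somewhere.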
Another widespread application of mixture models to flood frequency analysis arises when
one must consider ephemeral streams, where there is a finite probability that an annual
streamflow maximum will be equal to zero. In such cases the mixture model will consist of
a combination of both a discrete probability distribution (to represent the single spike of
probability mass at zero), and at least one continuous distribution to represent the probability
density for peak discharges greater than zero. Examples of this application are given by
Jennings and Benson (1969) and Haan (1977).
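A minimal sketch of such a mixed discrete-continuous model follows. The exponential distribution for the nonzero peaks, the probability of a zero-flow year, and the mean peak are all assumptions chosen for illustration; they are not taken from the cited references.

```python
import math


def ephemeral_cdf(q, p_zero, mean_peak):
    """F(q) = p_zero + (1 - p_zero) * G(q) for q >= 0.

    p_zero is the probability mass at zero; G is the distribution of the
    nonzero annual peaks (assumed exponential here).
    """
    if q < 0.0:
        return 0.0
    g = 1.0 - math.exp(-q / mean_peak)
    return p_zero + (1.0 - p_zero) * g


def ephemeral_quantile(non_exceedance, p_zero, mean_peak):
    """Invert the mixed distribution; quantiles at or below p_zero are zero."""
    if not 0.0 <= non_exceedance < 1.0:
        raise ValueError("non_exceedance must lie in [0, 1)")
    if non_exceedance <= p_zero:
        return 0.0
    g = (non_exceedance - p_zero) / (1.0 - p_zero)
    return -mean_peak * math.log(1.0 - g)


# Hypothetical stream: 20% of years have no flow; nonzero peaks average 50.
p_zero, mean_peak = 0.2, 50.0
q100 = ephemeral_quantile(0.99, p_zero, mean_peak)
print(f"100-year peak: {q100:.1f}")
```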
A third area in which total probability ideas have been applied in flood frequency analysis
is that in which they are embedded in applications of Bayesian theory. This type of application
is not nearly as widespread as the others that have been mentioned above, primarily because
of the general inability to objectively specify a prior distribution. An area in which there has
been some work done, however, is that of developing bias correctors for estimators of the
coefficient of skewness. Lall and Beard (1982) and Durrans (1994) are two examples of this.
In this paper it is intended to demonstrate how the theorem of total probability can be
coupled with deterministic simulation tools to develop consistent and physically meaningful
solutions to two classes of problems in flood analysis. The first problem type considered is
that of development of flood frequency curves for regulated sites, such as downstream of
dams. The second application concerns the regionalization of flood frequency information.
18.1.3 Outline of paper
Methods of estimating flood frequencies have a long history which dates to at least the early
part of the 20th Century. Section 18.2 of this paper provides a very brief summary of the
various types of methods that have been developed, and also contains a detailed discussion of
some of the fundamental statistical properties of regulated flood peaks. The formal
development of an integrated deterministic-stochastic approach to flood frequency analysis is
presented in Section 18.3, as is an application of the method for the development of a
regulated flood frequency curve. Section 18.4 presents remarks on the way in which the
integrated approach may be applied to regionalize flood frequency information. In particular,
it is shown how it may be employed to validate (or invalidate) the very rigid and rather ad hoc
assumptions that are intrinsic to current regionalization schemes, most notably the index flood
method. Conclusions and additional research needs are presented in the closing Section 18.5
of this paper.
18.2 Flood Frequency Analysis
18.2.1 Overview
Flood frequency analysis involves the estimation of exceedance probabilities corresponding to
flood peaks of various magnitudes, or vice versa. Data used to support the estimation process
usually consist of the maximum instantaneous discharge rates from each water year of record
(an annual series), which is sometimes approximated by the annual maximum average daily
discharge. Other data types of interest may consist of flood volumes, maximum stages, or of
all flood discharge peaks which are greater than some threshold. The peaks-over-threshold
(POT), or partial duration series, approach is based on the recognition that the second- or even
third-largest peaks in some years may be greater than the largest peaks in other years. The
brief reviews presented in the following Sections 18.2.2 and 18.2.3, as well as the techniques
that are presented later in Sections 18.3 and 18.4, relate to the annual series approach, though
the modeling approach could be applied to distributions developed from partial duration series.
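The difference between the two data types can be seen in a small sketch. The peak values and the threshold are invented for illustration, and the peaks are assumed to have already been separated into independent events.

```python
def annual_series(peaks_by_year):
    """Largest peak in each water year (annual series)."""
    return {year: max(peaks) for year, peaks in peaks_by_year.items()}


def partial_duration_series(peaks_by_year, threshold):
    """All peaks above the threshold, regardless of year (POT series)."""
    return sorted(
        (q for peaks in peaks_by_year.values() for q in peaks if q > threshold),
        reverse=True,
    )


# Hypothetical independent flood peaks (arbitrary units) in three water years.
peaks_by_year = {
    1990: [120.0, 310.0, 280.0],
    1991: [90.0, 150.0],
    1992: [200.0, 60.0],
}
ams = annual_series(peaks_by_year)
pot = partial_duration_series(peaks_by_year, threshold=180.0)
print(ams)  # {1990: 310.0, 1991: 150.0, 1992: 200.0}
print(pot)  # [310.0, 280.0, 200.0]
```

The second-largest 1990 peak (280) exceeds the annual maxima of both 1991 and 1992, so it appears in the partial duration series but not in the annual series.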
18.2.2 Statistical methods
Early approaches to flood frequency analysis were all statistical in nature. That is to say, they
involved the fitting of a probability distribution to an observed series of flood peak data.
Statistical methods of flood frequency analysis can be classified into at-site estimation
techniques and into regionalization techniques. They can also be classified as to whether they
are parametric or nonparametric. The discussions in the bulk of this paper are focused on the
problem of at-site estimation; a discussion of issues associated with regionalization is delayed
until Section 18.4.
The parametric approach to statistical flood frequency estimation is the classical one and
is undoubtedly the most widely applied. In this approach, one must select a probability
distribution for modeling of the data, and one must also choose a procedure for estimation of
the parameters of the distribution. Integration of the fitted density may then be accomplished
to estimate the various quantiles of interest.
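As a hedged illustration of the two choices involved, the sketch below assumes a Gumbel distribution and method-of-moments estimation; neither choice is prescribed by the text, and the sample of annual peaks is invented. For the Gumbel distribution the integrated density (the distribution function) inverts in closed form, so no numerical integration is needed.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(sample):
    """Method-of-moments estimates of the Gumbel location and scale."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)
    scale = math.sqrt(6.0) * std / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_quantile(return_period, loc, scale):
    """Peak with annual exceedance probability 1 / return_period."""
    non_exceedance = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual peak discharges (arbitrary units).
annual_peaks = [212.0, 340.0, 158.0, 275.0, 410.0,
                198.0, 305.0, 260.0, 187.0, 365.0]
loc, scale = fit_gumbel_moments(annual_peaks)
q100 = gumbel_quantile(100.0, loc, scale)
print(f"loc = {loc:.1f}, scale = {scale:.1f}, 100-year peak = {q100:.1f}")
```

Swapping in a different distribution or estimator changes `q100` even though the sample is unchanged.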
The need to select both a distribution and a parameter estimation method in parametric
methods of flood frequency analysis leads to a certain amount of subjectivity in the resulting
quantile estimates. In the tails in particular, where little if any data are available, the choice
of one probability model over another can have a significant impact on the resulting quantile
estimates. Most models perform quite comparably to one another in their mid-ranges, and this
tends to make it very difficult to discriminate one from another. The concept of robustness
(Kuczera, 1982) is a way in which some of these selection difficulties may be overcome, but
a demonstration of robustness can often involve a time-consuming and costly simulation study.
An application of a parametric method of flood frequency analysis involves making some
assumptions pertaining to the statistical properties of the data being described. In particular,
the data should be random, independent, homogeneous, and stationary. A number of tests
have been presented in the literature for judging the quality of data in terms of these attributes.
Descriptions of a number of these tests are provided by Kite (1977); Loucks, Stedinger and
Haith (1981); and Bobée and Ashkar (1991).
Even before the widespread use of parametric methods of flood frequency analysis, there
was a good deal of use of nonparametric methods. The early nonparametric approaches
involved primarily the use of plotting position formulas and probability paper, followed by the
sketching of a frequency curve to smooth the trend of the data. The subjectivity of the
sketching, as well as the difficulty of extrapolating a sketched curve, are what ultimately led
to the demise of this method, and it was replaced by the more objective methods involving
estimation of the parameters of a parametric distribution.
As noted above, however, the parametric approach to flood frequency
estimation is not entirely objective either. A measure of subjectivity is introduced by the need
to choose the distribution and estimation method. Partly because of the difficulties and
uncertainties that are inherent to these choices, but also because of the growing belief that no
one parametric distribution is adequate to represent all cases (or even the full range of flood
values at a single site), there has been a surge of interest in the past decade in nonparametric
methods of density estimation (Adamowski, 1985). These methods of density estimation are
typically based on a superposition, or convolution, of kernel functions, and can provide very
good fits to observed data samples, though they do experience problems when one must
extrapolate beyond the range of the data sample.
The nonparametric approach to estimation, like the parametric one, also requires that
some choices be made. First one must choose a kernel type that is desired to be used, and one
must then decide on how best to estimate the kernel bandwidth. These problems are directly
analogous to the choices that must be made in the parametric approach, but Silverman (1986)
has indicated that there is really very little to choose between the various kernels, at least on
the basis of the integrated mean square error. Adamowski and Feluch (1990) have considered
the use of a skewed kernel (the Gumbel kernel) in an attempt to reduce the bias of quantile
estimates in extrapolation, but found that little was to be gained by this. Moon and Lall (1994)
have adopted a different approach, and have employed so-called kernel quantile estimators.
Estimation of the kernel bandwidth in nonparametric density estimation has usually been
accomplished by minimizing the integrated mean square error (IMSE) of the density estimator
over the full range of the distribution. It is argued here that one should instead focus on
minimization of either the mean square error (MSE) or the bias of estimators for particular
quantiles. This, of course, is motivated by the observation that the interest in flood analysis
is the prediction of quantiles, not density functions, and the fact that minimization of the IMSE
does not imply that MSEs and/or biases of quantile estimators are also minimized.
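The kernel approach can be sketched as follows. This is a minimal illustration that assumes a Gaussian kernel and the normal-reference bandwidth rule (1.06 s n^(-1/5)), which targets the IMSE of the density and is therefore not the quantile-focused criterion argued for above. The sample is invented, and quantiles are found by simple bisection on the kernel distribution function.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reference_bandwidth(sample):
    """Normal-reference rule: h = 1.06 * s * n**(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)


def kernel_cdf(q, sample, bandwidth):
    """Distribution function of a Gaussian kernel density estimate."""
    return sum(normal_cdf((q - x) / bandwidth) for x in sample) / len(sample)


def kernel_quantile(non_exceedance, sample, bandwidth, tol=1e-6):
    """Solve kernel_cdf(q) = non_exceedance by bisection."""
    lo = min(sample) - 10.0 * bandwidth
    hi = max(sample) + 10.0 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, bandwidth) < non_exceedance:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical annual peak discharges (arbitrary units).
sample = [212.0, 340.0, 158.0, 275.0, 410.0,
          198.0, 305.0, 260.0, 187.0, 365.0]
h = reference_bandwidth(sample)
q90 = kernel_quantile(0.90, sample, h)
q99 = kernel_quantile(0.99, sample, h)
print(f"h = {h:.1f}, q90 = {q90:.1f}, q99 = {q99:.1f}")
```

Beyond the largest observation the estimate is governed entirely by the tail of the kernel centered on that observation, which is one way to see why extrapolation is problematic for these estimators.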
A significant aspect of nonparametric methods of density estimation when compared to
parametric methods is that the observations do not necessarily need to be homogeneous.
Because of the flexibility that is inherent to kernel-based estimators, they can exhibit the
unusual, and sometimes multimodal, density shapes that arise when mixtures of populations
are present. With respect to the qualities of randomness, independence, and stationarity,
however, nonparametric methods are subject to the same limitations as are parametric
methods.
An additional advantage of nonparametric estimators arises when one must consider
multivariate modeling. In flood frequency analysis, this would occur if one were interested
in both flood peaks and volumes simultaneously. Parametric modeling using multivariate
densities is tractable only in the few cases where multivariate distributions are known, or when
the variables are statistically independent of one another. The multivariate normal distribution
has been widely used, but it can be very difficult to put multivariate flood data into this form,
even through the use of normalizing transformations, and this has become a major stumbling
block in attempts to model more than one random variable at a time. Multivariate kernel-based
density estimators, like their univariate counterparts, are very flexible and can describe
the joint behavior of variables in a nonrestrictive way. Some applications are described by
Lall and Bosworth (1994) and Silverman (1986). Silverman also indicates that kernel-based
multivariate densities can be estimated with much less data than can multivariate histograms
or other characterizations of the joint behavior; this is particularly attractive in hydrologic
applications where there is often a paucity of data available.
18.2.3 Runoff modeling methods
The runoff modeling approach to flood frequency estimation has developed primarily as a
consequence of the continued development of computers and hydrologic simulation codes. To
some degree, however, estimates of flood quantiles were available through the use of runoff
models long before these modern accomplishments. A case in point is that of the use of the
rational method for peak runoff estimation. A fundamental assumption in that case is that the
rainfall and runoff rates have the same frequency of occurrence. It is known that this is not
generally true, but the rational method continues to be one of the most widely applied methods
in day-to-day engineering practice.
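For reference, the rational method computes the peak as Q = C i A. The sketch below uses SI units, where dividing by 3.6 converts (mm/h)(km^2) to m^3/s; the runoff coefficient, rainfall intensity, and catchment area are invented for illustration. The frequency assumption enters only through the choice of i: the T-year rainfall intensity is taken to produce the T-year peak.

```python
def rational_peak_si(runoff_coefficient, intensity_mm_per_h, area_km2):
    """Rational method peak discharge, Q = C * i * A / 3.6, in m^3/s.

    intensity_mm_per_h is the design rainfall intensity for the chosen
    return period; the resulting peak is assigned that same return period.
    """
    if not 0.0 <= runoff_coefficient <= 1.0:
        raise ValueError("runoff coefficient must lie in [0, 1]")
    return runoff_coefficient * intensity_mm_per_h * area_km2 / 3.6


# Hypothetical catchment: C = 0.4, 10-year intensity 50 mm/h, area 2 km^2.
q10 = rational_peak_si(0.4, 50.0, 2.0)
print(f"10-year peak = {q10:.2f} m^3/s")  # 11.11 m^3/s
```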
Analytical solutions for the derivation of flood frequency distributions from rainfall
distributions have also been applied. Eagleson (1972) was the pioneer in this area. A number
of other investigators have followed this path, but Moughamian, McLaughlin and Bras (1987)
have concluded that these methods do not perform very well. They suggest that fundamental
improvements are needed before any confidence can be assigned to these methods.
Rainfall-runoff simulation models may be classified as being either event-based or
continuous. For the purpose of simulating flood frequency relationships, however, models of
the continuous type are the most widely applied. This is due to the difficulty in practice of
specifying appropriate antecedent conditions for event-based models. Inputs to continuous
simulation models may consist of historical records if they are available and of sufficient
length, but they are probably more frequently obtained as the output of stochastic simulation
models. Peaks in the continuous streamflow hydrograph which are generated by the runoff
simulation model are subjected to statistical analyses as described in Section 18.2.2. Examples
of this approach are provided by Bras et al. (1985); and Franz, Kraeger and Linsley (1986).
An attractive aspect of the runoff modeling approach to flood frequency estimation, like
other approaches such as those afforded by the geomorphic instantaneous unit hydrograph
(Rodríguez-Iturbe and Valdés, 1979), is that it represents an attempt to understand and mimic
the physical processes that are important in the transformation of rainfall to runoff. It has been
suggested by the National Research Council (NRC, 1988, p. 56) that runoff models might be
useful for regionalization of flood frequency behavior, the thought being that differences in
flood frequency curves from one site to another are due only to differences in the catchments,
and not in the meteorology. That is, if meteorological variables could be regionalized, then
runoff models could be used to account for the runoff response differences due to the
catchment properties.
The goal of obtaining flood frequency estimates which are physically based is certainly
a laudable one, but it appears as though the runoff modeling approach is simply unable to
achieve the desired performance. The complexity of the runoff generation processes,
combined with the spatial and temporal heterogeneities and variabilities in the forcing and
catchment system variables, conspire to yield a runoff response behavior which is beyond the
abilities of models to reproduce. Indeed, when flood frequency curves developed using
rainfall-runoff models are compared with those based on actual historical data, the
inadequacies of models become quite apparent. Figures 18.3 and 18.4 present results obtained
by Thomas (1982) and Muzik (1994), and demonstrate that distributions generated from the
output of rainfall-runoff models display a variance that is smaller than that exhibited by
historical data. Thomas referred to this as a "loss of variance" problem; it is analogous to a
similar problem in time series synthesis and forecasting where highs and lows are consistently
under- and over-predicted, respectively.
Given the problems with runoff modeling and derived distribution methods that have been
highlighted above, and the objective of the present work to develop a physically meaningful
approach to some problems in flood frequency analysis, one is left to question what is being
offered here that would surmount the difficulties discussed. The answer lies in using the
derived distribution concept to derive flood frequency curves from other flood frequency
curves rather than from precipitation frequency curves. The physical linkage between the
streamflow discharges at different points along a river or stream is much better understood,
and is subject to much less intrinsic variability due to antecedent conditions and the like, than
is the linkage between precipitation and the resulting storm runoff. In other words, the
transformation of discharge from one site to another along a stream is much more deterministic
than is the transformation from rainfall to runoff, at least in terms of our current abilities to
describe these processes. Early work by Laurenson (1973, 1974) along the same lines as that
presented here demonstrates the promise of this idea.
Figure 18.3 Observed and simulated (synthetic) flood frequency curves. Source: Thomas (1982)
Figure 18.4 Comparison of observed and synthetic flood frequency curves for the Little
Red Deer River (horizontal axis: recurrence interval, years). Source: Muzik (1994)
18.2.4 Effects of regulation
Since one of the applications of the total probability methods that are presented in this paper
is directed to the determination of flood frequency curves at locations downstream of
regulating structures, it is appropriate before proceeding to review some of the statistical
characteristics of regulated flood peak sequences. Of particular interest are the qualities of
randomness, independence, homogeneity, and stationarity. The observations that are made
with respect to these qualities are employed in Section 18.3 to develop a modeling framework
which is consistent with them.
The first quality of concern is that of randomness. In a hydrologic context, it is generally
accepted that randomness means essentially that the fluctuations of the variable of interest arise
from natural causes. It is therefore generally considered by hydrologists that flood flows
which have been appreciably altered by the operation of a regulating structure are not random.
It is argued here, however, that this is not true. Because flood events occur randomly in time
(even though they tend to occur in particular seasons), and because of the randomness
associated with the stage (and other conditions) of a regulating reservoir when flood events
occur (due to the randomness of antecedent conditions), the regulated flood events downstream
of the regulating structure must also be random. This is true because a function of a random
variable is also a random variable, and it must be true even if the reservoir were operated in
exactly the same way every time a flood event occurred (which is not very likely).
The property of independence, in the context of the at-site approach to flood frequency
analysis, relates to whether the annual flood event in year t has any predictive ability with
respect to flood events in years t+1, t+2, and so on. That is, it refers to the lack of serial
correlation. In regional analyses, the effects of spatial correlation must be considered as well.
It is generally true in flood frequency analysis, especially when annual as opposed to partial
duration series are being modeled, that sequential flood events are independent of one another
in time. Exceptions to this may occur in cases where there is a significant amount of storage
present upstream of the location of interest. Lye (personal communication, 1993) has
considered such problems for Canadian rivers. It is assumed in the sequel that annual flood
events can be considered to be independent of one another; additional work is needed to
generalize the results that are presented.
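A simple screening check for serial correlation is the lag-1 sample autocorrelation of the annual series. The sketch below is an illustration only: the series is invented, and the limit of 1.96 divided by the square root of n is a rough large-sample approximation adopted here as an assumption, not a test prescribed by the text.

```python
import math
import statistics


def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation coefficient r1 of a sequence."""
    mean = statistics.mean(series)
    numerator = sum(
        (series[t] - mean) * (series[t + 1] - mean)
        for t in range(len(series) - 1)
    )
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator


# Hypothetical annual peak discharges in chronological order.
annual_peaks = [212.0, 340.0, 158.0, 275.0, 410.0,
                198.0, 305.0, 260.0, 187.0, 365.0]
r1 = lag1_autocorrelation(annual_peaks)
limit = 1.96 / math.sqrt(len(annual_peaks))  # approximate 95% limit (assumed)
print(f"r1 = {r1:.3f}, approximate limit = +/-{limit:.3f}")
```

A value of r1 well outside the limit would suggest that the independence assumption adopted in the sequel deserves a closer look for that record.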
Because of the effect of initial reservoir conditions when flood events occur, as well as
the effects of operating the reservoir in different ways, regulated flood events cannot be
considered to be homogeneous. That is, regulated flood peaks derive from different population
distributions, which may be indexed by the initial and boundary conditions pertinent to the
reservoir when flood events occur. A graphical depiction of this is provided by Figure 18.5,
which shows conditional regulated flood frequency distributions downstream of a reservoir.
That figure was generated for the same hypothetical reservoir discussed in Section 18.3.3 using
a Monte Carlo procedure. The dotted curve represents the unregulated flood frequency
distribution upstream of the regulating facility, and the solid curves show some of the
conditional distributions that result. The first of the two numbers shown for each conditional
distribution represents the (dimensionless) initial stage of the reservoir (0 = empty, 1 = full),
and the second represents the (dimensionless) outlet gate opening amount (0 = closed, 1 =
fully open). It has been assumed that the gate opening amount is held constant throughout the
duration of the flood event; this would be true for an unattended reservoir. For example,
the curve with the label (0.9; 0) represents the regulated flood frequency distribution that would
arise if, every time a flood event were to occur, the reservoir had an initial dimensionless stage
of 0.9 and a zero outlet gate opening amount. Of course, real reservoirs, because of the
effects of antecedent conditions and operating policies, have initial and boundary conditions
(gate openings) that vary from one time to another. For any possible combination of initial
and boundary conditions, there is a regulated flood frequency distribution that is conditional
on that combination.
Figure 18.5 Unregulated (dotted curve) and conditional regulated (solid curves) flood frequency
distributions based on simulation of a hypothetical reservoir (horizontal axis: exceedance
probability, percent)
It is clearly evident in Figure 18.5 that the population distribution from which a regulated
flood event derives is very much dependent on the conditions of the reservoir when the flood
event occurs. This observation is the basis for the use of the total probability theorem in the
integrated deterministic-stochastic approach presented in Section 18.3.2. A point which may
also be noted from Figure 18.5, however, is that it tends to yield rather nonsensical results on
the left-hand side of the diagram; i.e., when the exceedance probability is large. In particular,
it indicates flood magnitudes of zero over considerable portions of some of the conditional
distributions, particularly those in which the initial reservoir stage is considerably below the
crest of the emergency spillway. This behavior is apparent because the Monte Carlo
simulation was accomplished in an event-based rather than continuous manner. The results
make sense from a conservation of mass viewpoint, but they do not make sense from a
flooding viewpoint. This is true because even low-flow releases made during the year would
be greater than zero.
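The idea of conditional regulated distributions, and the zero peaks produced by an event-based simulation, can be reproduced with a deliberately crude sketch. It is not the reservoir or the routing procedure used to generate Figure 18.5: the storage rule in `regulated_peak`, the Gumbel inflow distribution, the constants, and the probabilities attached to each (stage, gate) condition are all assumptions made for illustration.

```python
import math
import random

CAPACITY = 400.0       # storage below the spillway crest (assumed)
VOLUME_PER_PEAK = 1.0  # flood volume per unit of inflow peak (assumed)
GATE_CAPACITY = 50.0   # release with the gate fully open (assumed)


def regulated_peak(q_in, stage, gate):
    """Toy deterministic rule: volume that does not fit in storage spills."""
    available = (1.0 - stage) * CAPACITY
    volume = VOLUME_PER_PEAK * q_in
    spill_fraction = max(0.0, 1.0 - available / volume) if volume > 0 else 0.0
    return min(q_in, gate * GATE_CAPACITY + spill_fraction * q_in)


def sample_inflow_peaks(n, loc=200.0, scale=80.0, seed=1):
    """Unregulated annual peaks drawn from an assumed Gumbel distribution."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        peaks.append(max(0.0, loc - scale * math.log(-math.log(u))))
    return peaks


def exceedance(peaks, q):
    """Fraction of simulated peaks greater than q."""
    return sum(1 for p in peaks if p > q) / len(peaks)


inflows = sample_inflow_peaks(20000)
# (initial stage, gate opening) -> assumed probability of that condition
conditions = {(0.9, 0.0): 0.5, (0.5, 0.5): 0.3, (0.2, 0.0): 0.2}
q = 150.0
conditional = {
    c: exceedance([regulated_peak(x, *c) for x in inflows], q)
    for c in conditions
}
# Total probability: weight each conditional exceedance probability.
total = sum(w * conditional[c] for c, w in conditions.items())
zero_share = sum(
    1 for x in inflows if regulated_peak(x, 0.2, 0.0) == 0.0
) / len(inflows)
print(conditional, total, zero_share)
```

With a low initial stage and a closed gate, most simulated events are absorbed entirely and yield a zero outflow peak. This is the same conservation-of-mass artifact discussed above for the left-hand side of the figure.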
Regardless of the behavior of the left-hand side of Figure 18.5, the right-hand side does
make sense and it is in that region that one is primarily interested anyway. The problems in
the left-hand side are therefore not believed to be of any significant concern, and this is
reinforced by the fact that the nonsensical results arise only when the initial reservoir stage is
very low. Since the likelihood of this occurring in real reservoirs is usually very small, except
perhaps in extremely arid regions, the limitations are not believed to be of too much concern.
The net consequence of all of this is that there will be an implicit assumption in the
developments to follow that annual floods upstream of the regulating reservoir cause annual
floods downstream of the regulating reservoir. This is certainly true for the most extreme
events, and such is evident in Figure 18.5. A POT, or partial duration series, approach to the
problem may be able to be applied to lift this assumption, and future work should address this
possibility.
The final statistical characteristic of interest is that of stationarity. In the present context,
flood sequences will be taken to be stationary if the reservoir operating policy is stable. A
stable operating policy, according to Loucks et al. (1981), is one in which the operating rules
are consistent from one year to the next, even though there are within-year variations due to
the annual streamflow cycle. Nonstationary regulated flood peak sequences arise when a stable
operating policy is not in effect; i.e., when there have been changes made in the way in which
the reservoir is operated.
In summary, regulated flood sequences may be considered to be random but they are not
homogeneous. Whether they are independent or stationary depends on the circumstances of
individual cases. For the purpose of this presentation, however, it will be assumed that they
are both independent and stationary. The issue of independence is an area in which additional
work is needed. Where flood sequences are nonstationary due to operational changes, the total
flood sequence should be subdivided into subsequences which are internally stationary. This
can be accomplished on the basis of recorded changes in the operating policy.
18.3 Frequency estimation for regulated sites
18.3.1 Overview
Section 18.2.4 provided an exposé of the fundamental statistical characteristics of regulated
flood peak sequences. It is the purpose of the next Section 18.3.2 to present a generalized
flood frequency modeling framework that is consistent with those characteristics, and which
preserves the physical linkage that must exist between flood frequency relationships at different
locations along a stream. Section 18.3.3 then provides a detailed example of an application
of the developed method, and Section 18.3.4 discusses some of its inherent attributes.
The same integrated deterministic-stochastic modeling framework that is presented in
Section 18.3.2 for treatment of regulated flood frequency problems can also be applied for
problems in regionalization. Discourse on this latter application area is contained in Section
18.4.
18.3.2 Integrated modeling framework
A number of previous investigators have presented methods for estimation of
regulated flood frequency relationships. What is believed to be a fairly comprehensive list is
Langbein (1958); Laurenson (1973, 1974); Sanders et al. (1990); and Bradley and Potter
(1992). All of these approaches have involved the theorem of total probability, though in
different ways. Other methods which may be used for regulated frequency estimation
derive from the theory of storage (Moran, 1959), as well as from the application of various
types of mathematical programming techniques (Loucks et al., 1981). These latter methods
yield the probability distributions of release volumes instead of peaks, however, and they are
therefore not as useful as methods that can yield the distribution of peaks directly. The method
presented by Bradley and Potter (1992) is also fundamentally based on modeling of flood
volumes, and obtains peaks on the basis of an observed relationship between the two variables.
As the title of this paper suggests, the theorem of total probability is also used here to
permit the modeling of regulated flood frequency behavior. The approach used here differs
from the previous approaches, however, in that it emphasizes the
physical properties of the regulating reservoir that are important determinants of the regulated
flood frequency behavior. An introduction to these physical effects, and the way in which they
induce heterogeneity into regulated flood sequences, was presented in Section 18.2.4.
In addition to the theorem of total probability (the stochastic component), the integrated
modeling approach presented here also involves a deterministic component. It is because of
the presence of this deterministic component, of course, that the modeling approach enjoys
some physical meaning, and it is also because of this component that the physical linkages
between flood frequency relationships are able to be preserved. In application, the
deterministic component amounts to no more than a hydrologic (or hydraulic) routing
algorithm.
The framework and example presented in this and the subsequent section are intended to
establish the regulated flood frequency relationship immediately downstream of a regulating
reservoir, based on knowledge of the unregulated flood frequency relationship upstream of the
reservoir. If the flood frequency relationship is needed some distance downstream of the
regulating reservoir, then the techniques presented in this section must be combined with those
presented in Section 18.4. As already noted, there are also several assumptions that are
intrinsic to the framework that is presented. Recapping, these assumptions are:
(1) regulated annual floods downstream of a dam are caused by the unregulated annual
floods occurring upstream of the dam;
(2) regulated floods are independent events; and
(3) the reservoir operating policy is stable.
Because of the need to route flood hydrographs through the regulating reservoir, which
involves volume as well as peak discharge considerations, it is necessary to treat flood analysis
in this work in a multivariate way. This need also arises because of the several different but
interrelated variables that must be considered in order to quantify the initial and boundary
conditions pertinent to the reservoir itself. Because of the need to work with multivariate
distributions, and because of the complications and inadequacies that arise with multivariate
normal modeling, the use of nonparametric methods is believed to be called for.
In the following, let $\mathbf{x} = [x_1\ x_2\ \cdots]^T$ denote a random vector of unregulated flood
characteristics. Also let $\mathbf{y} = [y_1\ y_2\ \cdots]^T$ denote a corresponding random vector of regulated
flood characteristics. The individual elements $x_i$ and $y_i$ of these vectors represent the
instantaneous peak flow, the flood volume, and possibly other but more difficult to quantify
hydrograph characteristics such as hydrograph shape (multi-peakedness, etc.). Define
$F_X(\mathbf{x}) = \Pr(X_1 \le x_1, X_2 \le x_2, \ldots)$ as the joint distribution function of the unregulated flood
characteristics, and define $F_Y(\mathbf{y})$ analogously as the desired unconditional joint distribution
function of regulated flood characteristics. In actuality, $F_Y(\mathbf{y})$ is dependent on the operating
policy that is in effect for the reservoir, but as long as the operating policy is stable that
distribution may be viewed as an unconditional one.
The random vectors $\mathbf{x}$ and $\mathbf{y}$ pertain to the flood variables of interest. To account for the
reservoir, one also needs to introduce a random vector $\boldsymbol{\lambda} = [\lambda_1\ \lambda_2\ \cdots]^T$ and corresponding
density $f_\Lambda(\boldsymbol{\lambda})$ of initial and boundary conditions relevant to the reservoir. The individual
elements $\lambda_i$ of this vector represent the initial reservoir stage (at the beginning of a flood
event), outlet gate opening amounts, and possibly other variables as well, such as the rate of
change of outlet gate openings during the passage of a flood event.
It is necessary in applications to quantify the distributions of the random vectors $\mathbf{x}$ and $\boldsymbol{\lambda}$,
and to judge whether they are correlated with one another. That is, one is required to develop
estimators for $F_X(\mathbf{x})$ and $f_\Lambda(\boldsymbol{\lambda})$ as well as the correlation matrix between $\mathbf{x}$ and $\boldsymbol{\lambda}$. If the
correlations are judged to be sufficiently large that a hypothesis of independence cannot be
supported, then one should develop an estimator for the joint distribution of $\mathbf{x}$ and $\boldsymbol{\lambda}$, which
will be denoted as $F_{X,\Lambda}(\mathbf{x},\boldsymbol{\lambda})$.
The deterministic component of the integrated procedure involves routing of flood
hydrographs through the regulating reservoir to develop a distribution function $F_{Y|\Lambda}(\mathbf{y}|\boldsymbol{\lambda})$ of
regulated flood characteristics conditioned on a particular combination $\boldsymbol{\lambda}$ of reservoir
conditions. This deterministic component of the procedure can be summarized in a general
form as

$$F_{Y|\Lambda}(\mathbf{y}|\boldsymbol{\lambda}) = G_1[F_X(\mathbf{x})] \tag{18.1a}$$

for the case where $\mathbf{x}$ and $\boldsymbol{\lambda}$ are independent, or as

$$F_{Y|\Lambda}(\mathbf{y}|\boldsymbol{\lambda}) = G_2[F_{X,\Lambda}(\mathbf{x},\boldsymbol{\lambda})] \tag{18.1b}$$

for the case where $\mathbf{x}$ and $\boldsymbol{\lambda}$ are correlated. In these expressions, $G_1$ and $G_2$ are functions which
map the unregulated flood frequency relationship into a conditional regulated one. In practice,
this mapping must be carried out using a Monte Carlo method. It is clear
from these expressions that the conditional regulated flood frequency distribution is a derived
distribution, but that it has been derived from another flood distribution rather than from a
rainfall distribution as is done by Eagleson (1972) and others.
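The theorem of total probability permits determination of the unconditional distribution
$F_Y(\mathbf{y})$ of regulated flood characteristics as

$$F_Y(\mathbf{y}) = \int F_{Y|\Lambda}(\mathbf{y}|\boldsymbol{\lambda})\, f_\Lambda(\boldsymbol{\lambda})\, d\boldsymbol{\lambda} \tag{18.2}$$

where the integration is performed over the complete space of feasible reservoir conditions.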
A discrete analogue of this application of total probability may be written in the form

$$F_Y(\mathbf{y}) = a_1 F_1(\mathbf{y}) + a_2 F_2(\mathbf{y}) + \cdots + a_n F_n(\mathbf{y}) \tag{18.3}$$

where $\{a_i\}$ is a set of weighting factors that sum to unity, and where $F_i(\mathbf{y})$ may be regarded as
a component distribution. In other words, $a_i$ is the probability of the reservoir conditions
being in the $i$-th of a total of $n$ discrete states, and $F_i(\mathbf{y})$ is the conditional regulated flood
distribution corresponding to that reservoir state.
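As a hedged illustration of equation (18.3), the sketch below mixes conditional empirical distributions of regulated peaks using state probabilities. The helper names, the two reservoir states, and the peak values are hypothetical and are not taken from the example of Section 18.3.3.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return a function y -> fraction of sample values <= y."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda y: bisect_right(ordered, y) / n

def mixture_cdf(weights, component_cdfs):
    """Combine conditional CDFs F_i(y) with state probabilities a_i (eq. 18.3)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights a_i must sum to one")
    return lambda y: sum(a * F(y) for a, F in zip(weights, component_cdfs))

# Hypothetical routed peaks (m^3/s) for two discrete reservoir states.
low_pool_peaks = [100.0, 150.0, 200.0, 250.0]
full_pool_peaks = [200.0, 300.0, 400.0, 500.0]
F_Y = mixture_cdf([0.25, 0.75],
                  [empirical_cdf(low_pool_peaks), empirical_cdf(full_pool_peaks)])
print(F_Y(250.0))  # 0.25 * 1.0 + 0.75 * 0.25 = 0.4375
```

In an application, each component sample would come from routing simulated floods through the reservoir with the conditions held at the corresponding discrete state.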
The dimensionalities of the vectors $\mathbf{x}$, $\mathbf{y}$ and $\boldsymbol{\lambda}$ are an issue that is certainly of some concern
in applications. Clearly, the smaller these dimensionalities are, the easier it will be to
determine the regulated flood frequency relationship. However, the unjustified use of
dimensionalities that are too small will obscure some of the important physical determinants
of the regulated flood frequency behavior and will lead to a result which may not accord with
reality. It is suggested that the minimum dimension of the vectors $\mathbf{x}$ and $\mathbf{y}$ be equal to 2, with
the elements representing the instantaneous peak and the flood volume. With respect to the
reservoir conditions, the required dimensionality of the vector $\boldsymbol{\lambda}$ will depend on the particulars
of each application. In the case of a reservoir with an uncontrolled outlet, only the initial
reservoir stage would need to be considered. In the case of a reservoir with a controllable
outlet, but in which the outlet gate settings are not adjusted during the passage of a flood (an
unattended reservoir), two dimensions would be necessary (see the discussion in Section
18.2.4). More complex reservoirs with multiple outlets and in which gate settings may be
modified during the passage of a flood will require correspondingly greater dimensionalities
in the vector $\boldsymbol{\lambda}$. A goal in practice should be to make the vector dimensionalities as small as
possible for computational reasons without adversely affecting the net result. This can be
accomplished in an iterative way by successively adding to the dimensionalities of the vectors
and checking to see whether the derived unconditional flood distribution appreciably changes.
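One possible way to judge whether the derived distribution "appreciably changes" between two iterations is to compare the two simulated samples of regulated peaks directly. The sketch below uses the largest vertical gap between their empirical distribution functions (the Kolmogorov-Smirnov distance); this particular measure, the function name, and any tolerance applied to it are assumptions, since no specific criterion is prescribed here.

```python
from bisect import bisect_right

def max_cdf_difference(sample_a, sample_b):
    """Largest vertical gap between two empirical CDFs (Kolmogorov-Smirnov distance)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # Both CDFs are step functions, so the largest gap occurs at a data point.
    for y in a + b:
        fa = bisect_right(a, y) / len(a)
        fb = bisect_right(b, y) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap

# Hypothetical regulated peaks from two successive model dimensionalities.
peaks_dim2 = [1.0, 2.0, 3.0, 4.0]
peaks_dim3 = [1.0, 2.0, 3.0, 5.0]
print(max_cdf_difference(peaks_dim2, peaks_dim3))  # 0.25
```

If the gap falls below a tolerance chosen by the analyst, the smaller dimensionality could be retained.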
18.3.3 Example application
An example application of the integrated modeling framework to develop a regulated flood
frequency curve downstream of a hypothetical reservoir is illustrated in this section. For
simplicity of presentation, the reservoir is assumed to have a controllable outlet gate, but the
gate settings are not modified during the passage of flood events; that is, the reservoir is an
unattended one. It is also assumed in this example that the vectors $\mathbf{x}$ and $\boldsymbol{\lambda}$ are independent.
It is not the intent of this section to solve an actual real-world problem, but rather to
demonstrate how the procedure may in fact be implemented, and to illustrate the various types
of information that are required. The overall procedure is presented in a number of
subsections, each of which deals with a specific aspect of the problem.
(i) Marginal distribution of unregulated flood peaks
It may be observed that the integrated deterministic-stochastic framework permitting the
estimation of regulated flood frequency curves that was described in Section 18.3.2 is
nonparametric in nature. That is, there are no assumptions made with respect to the forms of
either the unregulated or regulated flood frequency distributions, nor are there any assumptions
made as to the form of the distribution of reservoir conditions. For the purpose of this
illustrative example, however, it is assumed that the marginal distribution of unregulated flood
peaks is the Gumbel, or extreme value Type I (EV1) distribution. This assumption is an
expedient only, as it is a simple matter to draw random samples from that distribution using
methods of simulation. The EV1 distribution is also widely regarded as being reasonably
flood-like.
Denoting unregulated annual flood peaks by the random variable $X_1$, and expressing the
EV1 distribution function in inverse form, i.e. as a quantile function, one can generate
synthetic unregulated flood peaks for simulation purposes as

$$x_1 = m - a \ln(-\ln u) \tag{18.4}$$

where $u$ is a uniformly distributed random variable on the interval (0,1) and $a$ and $m$,
respectively, are scale and location parameters of the EV1 distribution.
In the present example, $E(X_1) = 300$ m$^3$ s$^{-1}$ and the coefficient of variation of $X_1$ is 0.3.
The parameters $a$ and $m$ in equation (18.4) are therefore equal to 70.2 and 260, respectively.
These assumptions make the probability of generation of negative values of $x_1$ extremely small.
Negative values, if and when generated in the simulations, were discarded and replaced by a
subsequently generated positive value.
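The parameter values and the sampling rule of equation (18.4) can be checked with a short sketch. It assumes that the parameters were obtained by the method of moments (standard deviation $a\pi/\sqrt{6}$ and mean $m + 0.5772a$ for the EV1 distribution); the function names and the random seed are illustrative only.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def gumbel_parameters(mean, cv):
    """Method-of-moments scale a and location m of the EV1 distribution."""
    std = cv * mean
    a = std * math.sqrt(6.0) / math.pi
    m = mean - EULER_GAMMA * a
    return a, m

def sample_peak(a, m, rng):
    """Draw one positive flood peak from equation (18.4); negatives are redrawn."""
    while True:
        u = rng.random()
        if u <= 0.0:  # guard: ln(-ln 0) is undefined
            continue
        x1 = m - a * math.log(-math.log(u))
        if x1 > 0.0:
            return x1

a, m = gumbel_parameters(300.0, 0.3)
rng = random.Random(1)  # arbitrary seed, for repeatability only
peaks = [sample_peak(a, m, rng) for _ in range(20000)]
print(round(a, 1), round(m, 1))  # 70.2 259.5
```

The computed location of about 259.5 agrees with the rounded value of 260 quoted above.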
In real-world applications, estimation of the unregulated flood frequency distribution must
be accomplished using streamflow data observed upstream of the reservoir. If a gaging site
is some distance upstream of the reservoir, then the procedures discussed in Section 18.4 can
be employed. In other cases, it may be possible to use data for the reservoir itself, such as
stages and releases, to derive the reservoir inflow hydrograph and hence the unregulated flood
frequency distribution as well.
(ii) Conditional distribution of unregulated flood volumes
Denote the random variable representative of unregulated flood volumes by $X_2$, and condition
the distribution of flood volumes on the magnitude of flood peaks. Rogers (1980, 1982),
Rogers and Zia (1982), Mimikou (1983), and Singh and Aminian (1986) have concluded that
a relationship between flood peaks and volumes can be expressed by

$$\log_{10}(Q_p/V^2) = b + r \log_{10} V \tag{18.5}$$

where $Q_p = x_1/A$ is the peak discharge rate per unit area, $V = x_2/A$ is the runoff volume per
unit area, and $A$ is the area of the drainage basin. Singh and Aminian (1986) considered $x_1$ and
$x_2$ as the peak and volume of the direct runoff hydrograph. Base flow needs to be added
separately, and has been assumed to be a constant 20 m$^3$ s$^{-1}$ in this illustrative example.
Equation (18.5) was originally established by means of a linear regression of $\log(Q_p/V^2)$
on $\log V$. Bradley and Potter (1992) have also used simulation and the nonparametric
LOWESS smoother (Cleveland, 1979) to develop a relationship between flood peak and
volume. The intent of these previous studies has been to predict flood peaks from flood
volumes. In the present example it is intended to do the opposite; that is, it is intended to
predict flood volumes from flood peaks. Because of the analytic form of equation (18.5), as
well as the desire to keep the example relatively simple, that expression will be employed here.
Strictly speaking, a relationship developed by regressing a variable $y$ on another variable
$x$ should not be inverted to develop a predictor for $x$ as a function of $y$. This is so because
there is not in general a "reverse causality", and also because the parameters in the functional
relationship will in general be different for the inverse relationship than for the forward one.
Equation (18.5), however, does not imply a causal relationship (flood peaks are not caused by
flood volumes); it is simply the consequence of an empirical observation. The linearity of the
logarithmic plot of the data would have been present regardless of which of the variables had
been taken as the predictor. It is for this reason that it is assumed here that equation (18.5)
can be inverted and rearranged, and that the resulting expression given as follows can be
interpreted as the expected value of $\log_{10} V$ given $\log_{10} Q_p$:

$$E(\log_{10} V) = \mu_{\log V} = (\log_{10} Q_p - b)/(r + 2) \tag{18.6}$$
It is also assumed for the present example that the conditional distribution of $\log_{10} V$ given
$\log_{10} Q_p$ is normal with a standard deviation of $\sigma_{\log V} = 0.1$, that the drainage basin has an area
of $A = 1300$ km$^2$, and that the values of the parameters in equation (18.6) are $b = -1.75$ and
$r = -1$. These values, based on the work of Singh and Aminian (1986), are reasonable, even
though their original relationship has been inverted.
Under the foregoing assumptions, an unregulated flood volume $x_2$ may be generated for
simulation purposes as

$$x_2 = A\,\mathrm{antilog}(\mu_{\log V} + z\,\sigma_{\log V}) \tag{18.7}$$

where $z$ is a standard normal variate with zero mean and unit variance. Note that because of
the log transformation, it is not possible to generate negative flood volumes using this
relationship.
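The generation step of equation (18.7) can be sketched as below, taking the conditional mean of the log-volume as an input. The function name is illustrative, the antilog is taken to base 10, and the numerical value used for the conditional mean is a hypothetical placeholder (it also presumes volume per unit area expressed in meters), not a value from this example.

```python
import math
import random

def sample_volume(mu_log_v, sigma_log_v, area, rng):
    """Equation (18.7): x2 = A * 10**(mu + z * sigma), with z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    return area * 10.0 ** (mu_log_v + z * sigma_log_v)

rng = random.Random(3)   # arbitrary seed
area = 1300.0e6          # 1300 km^2 expressed in m^2
mu = -1.3                # hypothetical conditional mean of log10 V
volumes = [sample_volume(mu, 0.1, area, rng) for _ in range(5001)]
median = sorted(volumes)[2500]
print(min(volumes) > 0.0, round(math.log10(median / area), 1))
```

Because the exponential is always positive, every generated volume is positive, which mirrors the remark made above about the log transformation.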
The unregulated flood peaks $x_1$ and flood volumes $x_2$ determined based on the procedures
discussed in this and the previous subsection are used in this example to quantify reservoir
inflow hydrographs. For purposes of illustration, some rather analytical expressions have been
used for these variables, but this should not be construed to imply that the assumptions made
to achieve those expressions are necessary. In actual applications, it may be preferable to
model the joint distribution of $x_1$ and $x_2$ using nonparametric multivariate kernel methods.
Silverman (1986) and Lall and Bosworth (1994) provide examples of this technique.
(iii) Reservoir inflow hydrographs
Based on the values of $x_1$ and $x_2$ generated as described in the previous subsections, one must
construct a synthetic direct runoff hydrograph. This hydrograph, when combined with the
base flow, may then be routed through the reservoir to obtain the outflow hydrograph
properties. Naturally, the outflow hydrograph properties will be conditional, based on the
initial and boundary conditions pertaining to the reservoir.
As an expedient, the U.S. Soil Conservation Service (SCS) dimensionless triangular
hydrograph (SCS, 1969) is used in this example as a standard shape to represent the direct
runoff component of the reservoir inflow hydrograph. The triangular hydrograph is
characterized by linear rising and receding limbs, with a hydrograph base time $T_b$ equal to 2.67
times the time to peak $T_p$. Since the peak of the direct runoff hydrograph is equal to $x_1$, and
since the volume of the direct runoff hydrograph must be equal to $x_2$, the direct runoff
hydrograph base time is $T_b = 2x_2/x_1$ and its time to peak is $T_p = 3x_2/(4x_1)$.
Use of the SCS triangular hydrograph in this way implies that flood hydrographs will
always have only a single peak. Should it be desired to permit the possibility of multiple
peaks, a greater dimensionality would need to be considered for the random vector x.
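The construction of the triangular inflow hydrograph can be sketched as follows, using the relations $T_b = 2x_2/x_1$ and $T_p = 3x_2/(4x_1)$ together with a constant base flow. The function name and the peak and volume values in the usage lines are hypothetical.

```python
def triangular_inflow(x1, x2, base_flow):
    """Return (I, Tp, Tb), where I(t) is base flow plus a triangular direct runoff.

    x1 is the direct runoff peak (m^3/s) and x2 the direct runoff volume (m^3).
    """
    Tb = 2.0 * x2 / x1          # base time, so that 0.5 * Tb * x1 = x2
    Tp = 3.0 * x2 / (4.0 * x1)  # time to peak, Tb = (8/3) * Tp

    def inflow(t):
        if t <= 0.0 or t >= Tb:
            direct = 0.0
        elif t <= Tp:
            direct = x1 * t / Tp                # linear rising limb
        else:
            direct = x1 * (Tb - t) / (Tb - Tp)  # linear receding limb
        return base_flow + direct

    return inflow, Tp, Tb

# Hypothetical peak and volume; base flow of 20 m^3/s as in this example.
inflow, Tp, Tb = triangular_inflow(280.0, 5.6e7, 20.0)
print(Tp, Tb, inflow(Tp))  # 150000.0 400000.0 300.0
```

The returned function can be evaluated at any time, which is convenient for the reservoir routing step.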
(iv) Reservoir initial and boundary conditions
The reservoir considered in this example is of a very simple nature, but is adequate to serve
the demonstration purposes of this presentation. The reservoir is considered to have vertical
sides, a single outlet gate whose opening is controllable, and an emergency overflow spillway
which is modeled as a weir. For ease of presentation, the outlet gate opening amount is
assumed to be fixed throughout the passage of a flood event through the reservoir. As noted
earlier, this is not a limitation of the method described in this paper as additional reservoir
variables could be included to account for the rates and/or times of change of gate opening
amounts.
For the case of this simple reservoir, the reservoir variables comprising the vector $\boldsymbol{\lambda}$ are
the initial reservoir depth (an initial condition), denoted by $\lambda_1$, and the outlet gate opening area
(a boundary condition), denoted by $\lambda_2$. Modeling of the distributions of these variables is
accomplished by defining a dimensionless initial reservoir depth $D_*$ and a dimensionless gate
opening area $A_*$ defined by

$$D_* = \lambda_1/D_f \tag{18.8}$$

and

$$A_* = \lambda_2/A_f \tag{18.9}$$

The terms $D_f$ and $A_f$ in these expressions denote, respectively, the full reservoir depth (to the
crest of the emergency spillway), and the full gate opening area. Reservoir depth is measured
with respect to the outlet gate opening, whose hydraulic behavior is modeled as an orifice.
Because both of the variables $D_*$ and $A_*$ are defined only on the interval [0,1], they are
modeled arbitrarily in this example using the beta distribution. For illustrative purposes, the
marginal density of the dimensionless depth is taken to be

$$f_D(D_*) = 3D_*^2 \tag{18.10}$$
It is clear from this definition of the marginal density that a full reservoir is the most probable
initial condition of the reservoir when a flood event occurs.
The distribution of the dimensionless gate opening amount is assumed in this example to
depend only on the dimensionless depth. Its conditional density function is assumed to have
the form

$$f_{A|D}(A_*|D_*) = A_*^{\alpha-1}(1-A_*)^{\beta-1}\,[\Gamma(\alpha+\beta)/\Gamma(\alpha)\Gamma(\beta)] \tag{18.11}$$

where

$$\alpha = 1 + 9D_* \tag{18.12}$$

$$\beta = 10 - 9D_* \tag{18.13}$$
This specification of the conditional distribution of outlet gate opening amounts states that
when the reservoir is empty, a zero gate opening amount is the most probable situation. When
the reservoir is full, a full gate opening amount is the most probable situation, and when the
reservoir is half full, the most probable gate opening amount is also one-half of the full
amount. Figure 18.6 is an illustration of a histogram that is representative of the joint
distribution of $D_*$ and $A_*$ as defined by equations (18.10) through (18.13). The heights of the
columns, i.e. the $\{a_i\}$ values for use in equation (18.3), were determined by numerical
integration. Modeling of the joint distribution of $D_*$ and $A_*$ in this way is again only an
expedient that has been employed for this illustrative example. In applications it would likely
be preferable to model the joint distribution using a nonparametric kernel method.
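Random reservoir conditions consistent with equations (18.10) through (18.13) can be drawn as sketched below. The depth is sampled by inverting the distribution function $D_*^3$ implied by equation (18.10), and the gate opening is then drawn from the conditional beta distribution. The function name and seed are illustrative assumptions.

```python
import random

def sample_reservoir_state(rng):
    """Draw (D*, A*) from equations (18.10) through (18.13)."""
    # f_D(D*) = 3 D*^2 has distribution function D*^3, so D* = u**(1/3).
    d_star = rng.random() ** (1.0 / 3.0)
    alpha = 1.0 + 9.0 * d_star   # equation (18.12)
    beta = 10.0 - 9.0 * d_star   # equation (18.13)
    a_star = rng.betavariate(alpha, beta)
    return d_star, a_star

rng = random.Random(7)  # arbitrary seed
states = [sample_reservoir_state(rng) for _ in range(20000)]
mean_depth = sum(d for d, _ in states) / len(states)
mean_gate = sum(a for _, a in states) / len(states)
print(round(mean_depth, 2), round(mean_gate, 2))
```

The dimensional variables follow from equations (18.8) and (18.9) by multiplying the sampled values by the full reservoir depth and the full gate opening area, respectively.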
It is clear that the distributions of the random variables $D_*$ and $A_*$, and hence of the
variables $\lambda_1$ and $\lambda_2$, will depend on the operating policy in effect for the reservoir. Changes
in the operating policy, if and when they occur, will result in changes in these distributions and
hence in changes in the downstream regulated flood frequency relationship. Where actual data
relevant to reservoir conditions are not available to permit the estimation of the joint
distribution of reservoir conditions, or in cases where one might be interested in predicting the
effects that would occur as a consequence of operational changes, one can resort to methods
of simulation to derive the necessary data. Note, however, that the inability of simulators
to accurately depict the whole range of streamflow responses is not an issue in this case. This
is so because reservoir conditions at the beginning of flood events are controlled by antecedent
conditions, and these in turn tend to be dominated by relatively average streamflow conditions.
Hydrologic simulators are quite good at being able to reproduce system behaviors in such
situations.

Figure 18.6 Joint density function of $A_*$ and $D_*$ for example problem
The discharge from an orifice with an opening area $\lambda_2$ and a discharge coefficient $C_o$ when
the head on the orifice, i.e. the reservoir depth, is equal to $h$ is

$$Q_o = C_o \lambda_2 \sqrt{2gh} \tag{18.14}$$

The discharge from a rectangular weir of length $L$ with a weir coefficient $C_w$ and a head $h_w$ is

$$Q_w = C_w L h_w^{3/2} \tag{18.15}$$
For the purposes of this example, $C_o = 0.6$ is used in the orifice equation (18.14) for a
representation of the reservoir's principal spillway (a conduit type of spillway), and the weir
equation (18.15) with $C_w = 3$ and $L = 50$ m is used for the overflow spillway. Weir flow is
assumed to occur only if the reservoir is surcharged during a flood event such that the depth
$h$ becomes greater than the full depth $D_f$. In such cases the head on the weir is taken to be
$h_w = h - D_f$. Other variables pertinent to the reservoir used in the simulations are presented in
table 18.1.
Table 18.1 Reservoir properties for example problem

Property                        Symbol   Value employed
Full reservoir depth            $D_f$    60 m
Full gate opening area          $A_f$    5 m$^2$
Reservoir surface area          $A_s$    1.05 x 10’ m$^2$
Overflow weir length            $L$      50 m
Orifice discharge coefficient   $C_o$    0.6
Weir coefficient                $C_w$    3.0
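The depth-dependent release implied by equations (18.14) and (18.15) can be sketched as a single function. The default coefficients, weir length and full depth are those quoted for this example; the function name and the value $g = 9.81$ m s$^{-2}$ are assumptions of the sketch.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def outflow(h, gate_area, full_depth=60.0, c_o=0.6, c_w=3.0, weir_length=50.0):
    """Total release Q(h): orifice flow (18.14) plus weir flow (18.15) if surcharged."""
    h = max(h, 0.0)
    q = c_o * gate_area * math.sqrt(2.0 * G * h)  # orifice under head h
    if h > full_depth:
        # the weir only flows when the pool rises above the spillway crest
        q += c_w * weir_length * (h - full_depth) ** 1.5
    return q

print(round(outflow(60.0, 5.0), 1))  # fully open gate, pool at the spillway crest
print(round(outflow(61.0, 5.0), 1))  # same gate with 1 m of surcharge over the weir
```

With the gate fully open and the pool at the crest, the orifice passes only about 100 m$^3$ s$^{-1}$, so appreciable surcharge and weir flow can be expected for the larger simulated floods.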
(v) Simulation procedure
The simulation procedure that should be employed to compute the regulated flood frequency
curve for a given reservoir operating policy depends on whether the random vectors $\mathbf{x}$ and $\boldsymbol{\lambda}$
are independent or correlated. Since it has been assumed throughout this example application
that they are independent, that procedure will be given first. The procedure for the case where
they are correlated will then be given.
A step-by-step procedure which may be followed when $\mathbf{x}$ and $\boldsymbol{\lambda}$ are independent is as
follows:
(1) Develop an estimator of the distribution $F_X(\mathbf{x})$ of the upstream unregulated floods.
Also develop an estimator of the density $f_\Lambda(\boldsymbol{\lambda})$ of reservoir initial and boundary
conditions. These estimators may be developed using either parametric or
nonparametric techniques.
(2) Randomly sample values of $x_1$ and $x_2$ from the distribution of unregulated flood
characteristics. Construct the direct runoff component of a synthetic reservoir inflow
hydrograph using these two values, and add base flow to obtain the total synthetic
inflow hydrograph.
(3) Randomly sample values of $D_*$ and $A_*$ from the distribution of dimensionless
reservoir conditions, and compute values of $\lambda_1$ and $\lambda_2$ using equations (18.8) and
(18.9).
(4) Route the inflow hydrograph through the reservoir using the continuity equation

$$dh/dt = [I(t) - Q(h)]/A_s \tag{18.16}$$

where $h$ is the reservoir depth at time $t$, $I(t)$ is the synthetic inflow hydrograph
developed in step (2), $Q(h)$ is the depth-dependent reservoir outflow rate, and $A_s$ is
the reservoir surface area. Integration of equation (18.16) was accomplished for this
example using a predictor-corrector, or Heun, method (Chapra and Canale, 1988)
with a time step of $\Delta t = T_p/10$.
(5) Repeat steps (2) through (4) many times (say $N$ times) to obtain $N$ outflow
hydrograph peaks. Rank and assign plotting positions to these values and use them
to empirically define the regulated flood frequency distribution $F_Y(\mathbf{y})$. The value of
$N$ should be chosen sufficiently large that the empirical distribution is not sensitive
to small variations in $N$; it is suggested that $N$ should be at least several thousand.
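Steps (4) and (5) can be sketched as follows. The routing function applies Heun's predictor-corrector scheme to equation (18.16) for any inflow and outflow functions supplied by the user. The function names are illustrative, and the Weibull plotting position $i/(N+1)$ is an assumption, since no particular plotting position formula is specified here. The usage lines employ a hypothetical linear reservoir rather than the reservoir of this example.

```python
def route_peak(inflow, outflow, h0, surface_area, dt, t_end):
    """Integrate dh/dt = [I(t) - Q(h)]/A_s by Heun's method; return the peak outflow."""
    h, t = h0, 0.0
    peak = outflow(h)
    while t < t_end:
        slope1 = (inflow(t) - outflow(h)) / surface_area
        h_pred = max(h + dt * slope1, 0.0)                # predictor (Euler step)
        slope2 = (inflow(t + dt) - outflow(h_pred)) / surface_area
        h = max(h + 0.5 * dt * (slope1 + slope2), 0.0)    # corrector (average slope)
        t += dt
        peak = max(peak, outflow(h))
    return peak

def weibull_plotting_positions(values):
    """Pair each ranked value with exceedance probability i/(N+1), largest first."""
    ranked = sorted(values, reverse=True)
    n = len(ranked)
    return [(v, (i + 1) / (n + 1)) for i, v in enumerate(ranked)]

# Hypothetical check: constant inflow of 50 m^3/s into a linear reservoir Q = 10 h.
peak = route_peak(lambda t: 50.0, lambda h: 10.0 * h, 0.0, 1000.0, 10.0, 2000.0)
print(round(peak, 3))  # approaches the steady-state outflow of 50
print(weibull_plotting_positions([3.0, 1.0, 2.0]))
```

In the full procedure, `route_peak` would be called once per simulated flood with the sampled inflow hydrograph and reservoir conditions, and the $N$ resulting peaks would be passed to the plotting position function.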
When performing steps (2) through (4) in the above procedure, one could also obtain $N$
regulated flood hydrograph volumes as well. One would then have the necessary information
to empirically quantify the joint distribution of both regulated flood peaks and volumes.
It may be noted that this procedure applies the theorem of total probability in a rather
implicit sort of way. An alternative and more explicit application of the theorem may be
accomplished through discretization of the joint density of reservoir conditions in a manner
similar to that shown in figure 18.6, and use of equation (18.3). This alternative procedure
was used to generate figure 18.5 as it yields the conditional distributions, which may be of
interest in some applications, as well as the final unconditional distribution.
The simulation procedure that should be used when the random vectors $\mathbf{x}$ and $\boldsymbol{\lambda}$ are
correlated is essentially the same as that given above for the independent case. The primary
difference is that one would first develop an estimator for the joint distribution $F_{X,\Lambda}(\mathbf{x},\boldsymbol{\lambda})$ of
both flood characteristics and reservoir conditions. The values of $x_1$, $x_2$, $\lambda_1$, and $\lambda_2$ would then
all be sampled from that distribution. The remaining steps of the procedure would be the same
as for the independent case.
18.3.4 Discussion
The result of the application of the step-by-step procedure discussed in the previous subsection
is shown in figure 18.7. The dotted curve shown there is the marginal distribution of
unregulated flood peaks upstream of the reservoir, and the solid curve is the marginal
distribution of regulated flood peaks immediately downstream of the reservoir. For reasons
discussed in Section 18.2.4, it is not clear that the regulated flood distribution shown is
meaningful in the left-hand portion of the figure. However, the right-hand portion of the figure,
which is the region of prime interest in applications, does make sense. Indeed, it may be
observed that the two frequency curves will converge to one another as the flood magnitude
increases, i.e. as the exceedance probability decreases. This must be so because of the
diminishing effect of a reservoir in flood peak attenuation as the flood magnitude increases.
The fact that this consistency is attained is made possible only because of the integrated nature
of the approach. In effect, the integrated approach is able to preserve the physical linkage that
must exist between the two flood frequency relationships.
An additional point worthy of note is that the simulation procedures described above are
very well suited to implementation in parallel processing environments. This is clearly
desirable because of the computational intensiveness of the required Monte Carlo simulations.
18.4 Regionalization of Frequency Information
18.4.1 Overview
Regionalization techniques in the field of flood frequency analysis are motivated by the
recognition that quantile estimates based only on at-site data, because of the shortness of
streamflow records and the need to extrapolate to long recurrence intervals, have large degrees
of variability, and hence uncertainty, because of sampling variations. Historical
data can be employed to ameliorate these problems to some degree, but the practice of
regionalizing flood frequency behavior is likely the more common approach. Where possible,
both historical data and regionalization should be employed.
To a certain degree, the use in hydrology of the term regionalization has come to refer to
two different but related techniques. This is rather unfortunate, and it has likely led to some
confusion among practitioners. In the first type of regionalization, one is interested in
predicting flood quantiles at ungaged sites. While this can be accomplished using rainfall and
runoff modeling methods (see Section 18.2.3), the term regionalization usually refers to the
use of multivariate regression models (Benson, 1962). The U.S. Geological Survey has
devoted a considerable amount of effort to develop such models for use throughout the United
States. The second type of regionalization, which is the more prominent one in the recent
flood frequency literature that has appeared in the archival journals, involves the use of
information at gaging sites remote from the one of primary interest to improve the statistical
properties of quantile estimators. The focus here is on improving estimates at gaged sites,
though it is recognized that this should ultimately enable improved estimates at ungaged sites
to be obtained as well. There are a number of methods that have been proposed for
accomplishment of this second type of regionalization. The most prominent among them are
the index flood method (Dalrymple, 1960) and regionalization of distributional parameters
(Houghton, 1978a,b) and statistics (namely skewness) (Hardison, 1974; Tasker, 1978), though
this latter method may be counterproductive (Landwehr, Matalas and Wallis, 1978).

Figure 18.7 Regulated (solid) and unregulated (dotted) flood frequency distributions
(horizontal axis: exceedance probability, percent)
It is the objective of this section of this paper to show how the integrated modeling
framework developed in Section 18.3.2 may be employed for regionalization. The issues
motivating this additional application area are discussed in the following Section 18.4.2, and
Section 18.4.3 provides an overview of the extension of the approach to the problem of
regionalization. It is noted here at the outset that the regionalization method suggested here
can be employed for both types of regionalization problems mentioned above. That is to say,
it can be employed for the estimation of flood frequency relationships at ungaged sites, and
it can also be employed to improve the estimates at gaged sites. Section 18.4.4 remarks on
the statistical estimation gains which may be realized in the latter type of regionalization.
18.4.2 Motivation
There are two primary issues that are motivating the extension of the integrated modeling
framework to permit it to be employed in a regionalization context as well. The first issue
motivating this discussion stems from the recognition made in Section 18.3.4 that the
integrated modeling approach can preserve the physical linkage that must exist between flood
frequency relationships at different spatial locations (in that case, at locations upstream and
downstream of a regulating reservoir). This leads one immediately to ponder whether the
same approach might be useful for regionalization of flood frequency information. It is
maintained by this author that the answer to this must be in the affirmative, and that the
integrated modeling framework which has been devised is essentially a “comprehensive
statistical model” as called for by the National Research Council (NRC, 1988).
The second motivating issue stems from some perceived shortcomings in the currently
applied regionalization procedures, most notably in the index flood method. This method is
purely statistical and makes use of some very rigid and ad hoc assumptions which tend to be
very difficult to rationalize and validate based on physical and hydrologic reasoning. In
particular, the index flood method presumes that the flood frequency distributions at all sites
in a homogeneous region are identical except for scale. In other words, it is assumed that all
sites in the region have the same coefficients of variation and skewness. Other statistical
methods of regionalization involve similar assumptions as to the spatial stationarity of one or
more statistical characteristics. Lettenmaier, Wallis and Wood (1987) and Hosking and Wallis
(1988) have shown that the index flood method is reasonably robust to departures from truly
homogeneous regions, but this is still not very comforting in view of the lack of any physical
or hydrologic reasoning to support it. In fact, it is argued shortly that physical reasoning
implies that the index flood method is not suitable for flood frequency regionalization, despite
the fact that it sometimes seems to work reasonably well.
An additional issue confounding statistical methods of regionalization, and one of the most
difficult to overcome in practice, is that of the need to identify homogeneous regions of gaging
sites. A number of methods have been presented in the literature for accomplishing this task,
but they again tend to be purely statistical in character. Most frequently, the pooling of sites
into homogeneous regions is based on whether significant differences can be discerned between
like statistics computed for different sites. Unfortunately, the statistics of interest in this
respect are usually the moment or L-moment ratios of relatively high order, and these tend to
have sufficiently large sampling variances that any tests for discrimination which might be
devised are necessarily not very powerful. In effect, subtle differences in statistics from one
site to another can be very difficult to detect. Such methods of pooling sites into regions are
based entirely on statistical considerations, and take no account of the physics of flood events.
The only ways in which the most common assumptions used in regionalization can be justified
are based on statistical arguments, and these must be considered to be weak because of the lack
of power of discriminating tests.
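The lack of power noted above can be illustrated with a small simulation. The following is a minimal sketch, not part of the original analysis: it assumes two hypothetical lognormal populations whose log-space standard deviations (0.50 and 0.45) differ slightly, a record length of 30 years, and the ordinary moment estimator of skewness. All of these choices are assumptions made here for illustration only.

```python
import math
import random

def sample_skewness(sample):
    """Moment estimator of the skewness coefficient (biased, population form)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    return m3 / m2 ** 1.5

def skewness_replicates(sigma, record_length, replicates, rng):
    """Sample skewness from many synthetic records of annual peaks."""
    return [
        sample_skewness([rng.lognormvariate(0.0, sigma) for _ in range(record_length)])
        for _ in range(replicates)
    ]

def mean_and_std(values):
    """Mean and sample standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(var)

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
# Two hypothetical sites with slightly different population skewness
skews_a = skewness_replicates(0.50, 30, 2000, rng)
skews_b = skewness_replicates(0.45, 30, 2000, rng)
mean_a, std_a = mean_and_std(skews_a)
mean_b, std_b = mean_and_std(skews_b)

print("site A: mean skew %.2f, std %.2f" % (mean_a, std_a))
print("site B: mean skew %.2f, std %.2f" % (mean_b, std_b))
```

Under these assumptions the spread of the sample skewness from record to record is larger than the difference between the two sites, which is the sense in which a test built on such a statistic has little power.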
To illustrate the type of problem that can arise, consider a gaging site for which the
random variable representing annual flood peaks is denoted as X. Consider also an additional
site downstream and along the same stream, and denote the same random variable there as Y.
Because these two sites are along the same stream, and therefore are nearly identical in terms
of their flood frequency behavior, most would agree that these two sites should be pooled
together into the same homogeneous region. Indeed, it is difficult to imagine a case where two
sites would be considered more homogeneous. Now, if the random variable Y at the
downstream site is a simple linear function of the random variable X at the upstream site, i.e.
if Y = cX, where c is a constant, then it is easy to show that the coefficients of variation and
skewness for the two sites are identical. In this case the common index flood assumption
would be justifiable, at least on statistical grounds. If, on the other hand, the
physical linkage that must exist between the two sites indicates that the relationship is more
likely nonlinear, such as Y = aX^b, b ≠ 0, 1, then the index flood assumption would be invalid.
Given that the hydrologic and hydraulic behaviors of real rivers and streams are generally
nonlinear, this observation casts some serious doubt on the suitability of the index flood
method.
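The contrast between the linear and power-law linkages can be checked numerically. The sketch below is an illustration under stated assumptions: the upstream peaks X are drawn from a hypothetical lognormal population, and the constants c, a and b are arbitrary values chosen here, not values from any real river reach.

```python
import math
import random

def cv_and_skew(sample):
    """Return (coefficient of variation, skewness coefficient) of a sample."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    return math.sqrt(m2) / mean, m3 / m2 ** 1.5

rng = random.Random(12345)  # fixed seed so the sketch is repeatable
# Hypothetical upstream annual peaks X (lognormal is an assumption)
x = [rng.lognormvariate(6.0, 0.5) for _ in range(20000)]

c, a, b = 1.3, 2.0, 0.8  # illustrative constants only
y_linear = [c * v for v in x]       # Y = cX
y_power = [a * v ** b for v in x]   # Y = aX^b

cv_x, sk_x = cv_and_skew(x)
cv_lin, sk_lin = cv_and_skew(y_linear)
cv_pow, sk_pow = cv_and_skew(y_power)

print("X      : CV %.4f, skew %.4f" % (cv_x, sk_x))
print("Y = cX : CV %.4f, skew %.4f" % (cv_lin, sk_lin))
print("Y = aX^b: CV %.4f, skew %.4f" % (cv_pow, sk_pow))
```

The scale factor c cancels from both ratios, so the linear case reproduces the upstream coefficients of variation and skewness to within rounding, whereas the power-law case does not.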
18.4.3 Extension of integrated modeling
As already noted in the previous section, the recognition of the ability of the integrated
deterministic-stochastic modeling framework to preserve the physical linkage between different
sites leads one to ponder its potential application for regionalization as well. In the present
section is considered an approach which may be employed for development of a flood
frequency relationship for an ungaged site. This is the first of the two types of regionalization
that were discussed in Section 18.4.1.
To accomplish this estimation at an ungaged site, it will be necessary (at least initially)
to consider sites only on streams on which there is also a gaged site. Denote the gaged
location as site X, and denote the flood frequency distribution which may be estimated from
the records for that site as F_X(x). Denote the ungaged location as site Y, and denote the desired
flood frequency distribution at that site as F_Y(y). Denote the joint density of initial and
boundary conditions relevant to the stream reach between the two sites as f_Λ(λ).
It is clear that this notation is virtually identical to that which was employed for
development of regulated flood frequency curves in Section 18.3. The elements of the random
vectors x and y will again refer to instantaneous flood peaks, flood volumes, and possibly other
flood hydrograph characteristics. In the present regionalization case, however, the elements
of the vector Λ of initial and boundary conditions will have somewhat different meanings. One
of the elements in this vector will be the initial stage or discharge in the river reach at the
beginning of flood events, and the remaining elements will correspond to both boundary
conditions and forcing relevant to the stream reach between the two sites. Boundary conditions
may exist within the reach or may exist somewhere outside of the reach, but they should be
chosen such that they do in fact have an effect on the hydraulic behavior of the reach. An
example of a boundary condition outside of the reach would be one in which the stream
discharges into a large lake, and in which the lake causes a backwater effect within the stream
reach of interest. Forcing that would be relevant to the reach would consist of lateral inflows
and/or outflows to and from the reach. These could be accounted for using a runoff model.
The procedure for deriving the desired flood frequency distribution F_Y(y) in this
regionalization, or information transfer, application would be essentially the same as that used
to derive a regulated flood frequency distribution in Section 18.3. The only real difference
is that channel routing would be used instead of reservoir routing. One could also choose
between hydrologic and hydraulic routing schemes (this is true as well for the reservoir case,
but there one would almost always choose a simple hydrologic router). If the ungaged site
were upstream of the gaged site, then inverse flood routing would need to be accomplished.
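The derivation just described can be sketched in a few lines for a deliberately simplified case. The sketch is illustrative only and rests on several assumptions that are not taken from this paper: the flood peak is the only hydrograph characteristic considered, F_X is a Gumbel distribution with made-up parameters, the vector of conditions is reduced to a single lateral inflow taking a few discrete values independent of X, and channel routing is replaced by the hypothetical monotone rule y = attenuation * x + lateral_inflow. With these simplifications the theorem of total probability reduces to a weighted sum of upstream non-exceedance probabilities.

```python
import math

def gumbel_cdf(x, loc, scale):
    """Gumbel (EV1) CDF, used here only as a stand-in for the fitted F_X."""
    return math.exp(-math.exp(-(x - loc) / scale))

def upstream_peak_for(y, lateral_inflow, attenuation):
    """Invert the hypothetical routing rule y = attenuation * x + lateral_inflow."""
    return (y - lateral_inflow) / attenuation

def derived_cdf(y, conditions, loc, scale, attenuation):
    """Total probability: F_Y(y) = sum_j P(lambda_j) * F_X(x_j(y)).

    `conditions` is a list of (lateral_inflow, probability) pairs whose
    probabilities sum to one; x_j(y) is the upstream peak that routes to y
    under condition j.
    """
    return sum(
        prob * gumbel_cdf(upstream_peak_for(y, lam, attenuation), loc, scale)
        for lam, prob in conditions
    )

# Hypothetical discrete distribution of lateral inflow and its probabilities
conditions = [(20.0, 0.5), (60.0, 0.3), (120.0, 0.2)]
loc, scale, attenuation = 500.0, 150.0, 0.9  # illustrative values only

discharges = list(range(200, 2001, 100))
cdf_values = [derived_cdf(y, conditions, loc, scale, attenuation) for y in discharges]

for y, p in zip(discharges, cdf_values):
    print("y = %5d   F_Y(y) = %.4f" % (y, p))
```

A real application would replace the one-line routing rule with a hydrologic or hydraulic channel router and the discrete sum with an integral over the joint density of conditions; the structure of the calculation would otherwise be the same.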
This information transfer idea for ungaged sites could also be extended
to ungaged sites at other locations within a drainage network. That is, it is not absolutely
necessary that the ungaged site be on the same link in the overall network as is the gaged site.
This type of an application would, however, require the consideration of the complicating
factors at confluences of streams. As shown by Dyhouse (1985), however, this is yet another
area in which the theorem of total probability finds application. In effect, the theorem of total
probability, when used in conjunction with other, deterministic hydrologic and hydraulic tools,
can be employed to facilitate the prediction of the flood frequency behavior almost anywhere
in a stream network based on knowledge of the behavior at one or more other locations in the
network. This modeling framework is therefore extremely powerful, but its potential is
currently limited by the loss of variance problem associated with the runoff modeling tools that
would be necessary to account for lateral inflows and outflows.
18.4.4 Variance reduction through optimal interpolation
The difficulties that arise in the regionalization, or information transfer, problem as a
consequence of the need to use runoff models can be overcome by considering gaged sites
only. That is, rather than employing one gaged site and one ungaged site in the modeling
effort, one can employ two gaged sites.
Without loss of generality, consider two sites X and Y on the same stream link, and
assume that site X is upstream of site Y. Because both sites are gaged and hence have
streamflow records that have been collected for some period of time, one can estimate their
respective flood frequency distributions F_X(x) and F_Y(y) using standard methods of statistical
analysis. One can also, because of the records available, quantify the joint density f_Λ(λ) of
streamflow conditions and incremental flows between the two sites. This would require some
use of a routing model to account for peak attenuation within the reach, but it would obviate
the requirement of a runoff simulation model.
Now, given the flood frequency distribution F_X(x) and the joint density f_Λ(λ), one could
again employ the integrated modeling framework to develop a flood frequency distribution at
site Y. Since this derived distribution will be different from the distribution determined from
the records at site Y, it will be denoted here as F_Z(z). The net result of this exercise is that one
will have two estimators for the flood frequency distribution at site Y. That is, one will have
redundant estimators for various flood quantiles. Based on the ideas of optimal interpolation
(Gelb, 1989), one could then combine the redundant estimators for any desired quantile in a
linear fashion so as to develop a quantile estimator with a smaller variance than that possessed
by either of the two original estimators. The improved flood frequency distribution at site Y
might then be employed with a reverse application of modeling to improve the distribution at
site X. This might then be used again to improve the estimator at site Y, and so on in an
iterative way.
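The linear combination of two redundant quantile estimators can be sketched as follows. This is the standard minimum-variance weighting of two unbiased estimators, offered here as an illustration rather than as the specific scheme intended above; the function name, the numerical values, and the treatment of the covariance between the two estimators as a known input (zero by default) are all assumptions.

```python
def combine_estimators(q1, var1, q2, var2, cov=0.0):
    """Minimum-variance linear combination w*q1 + (1 - w)*q2.

    Both estimators are assumed unbiased. `cov` is their error covariance;
    cov = 0 gives the familiar inverse-variance weighting.
    Returns (combined estimate, variance of the combination, weight w).
    """
    denom = var1 + var2 - 2.0 * cov  # variance of (q1 - q2)
    if denom <= 0.0:
        raise ValueError("the two estimators differ by a constant; nothing to combine")
    w = (var2 - cov) / denom
    combined = w * q1 + (1.0 - w) * q2
    combined_var = (var1 * var2 - cov ** 2) / denom
    return combined, combined_var, w

# Hypothetical 100-year flood estimates at site Y (values are illustrative)
q_derived, var_derived = 1450.0, 400.0   # from the derived distribution
q_at_site, var_at_site = 1520.0, 900.0   # from the at-site record

combined, combined_var, w = combine_estimators(
    q_derived, var_derived, q_at_site, var_at_site
)
print("weight on derived estimator: %.3f" % w)
print("combined quantile: %.1f, variance: %.1f" % (combined, combined_var))
```

In practice the derived and at-site estimators would share information and so would be correlated; the covariance argument is included so that this can be accounted for when it can be quantified.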
The net result of this application is that one can accomplish the most fundamental
objective of regionalization, namely that of improving the statistical properties of quantile
estimators by permitting information at sites remote from the one of immediate interest to have
some bearing on the estimation process. In contrast to purely statistical methods of
regionalization, however, the integrated modeling approach accomplishes the task in a
physically meaningful way.
18.5 Summary
It has been argued in this paper that an integrated deterministic-stochastic modeling framework
may be employed to consistently and effectively approach some of the more difficult and
elusive problems in the field of flood frequency analysis. In particular, it can be employed to
develop flood frequency curves at regulated sites downstream of dams and reservoirs, and it
can also be used for the transfer of information from one spatial location to another. It
combines the best features of both statistical and deterministic modeling tools, and moulds
them into a new tool whose power is arguably greater than that of the sum of its component
parts. In effect, it establishes a framework for a “comprehensive statistical model” (NRC,
1988) which can be employed to resolve the differences between the statistical and runoff
modeling approaches to flood frequency analysis.
It is important to recognize that the developmental aspects of the integrated modeling tool
are by no means complete. Several assumptions have been made in the discussions in this
paper, and more work is necessary to generalize the method even further. Of particular
relevance in this respect are the issue of independence of regulated annual floods, as well as
the treatment of partial duration series. Additional work related to regionalization (information
transfer) should also be given a high priority.
Flood frequency modeling with the integrated tool involves the use of multivariate
probability distributions. These distributions are considerably more difficult to work with than
are univariate models, and are therefore more exacting in terms of the educational background
required of model users. Multivariate modeling is also more demanding in
terms of data requirements (the amount of data needed), and this is certainly a cause for some
concern, particularly in an application area such as flood frequency analysis where there never
seems to be enough data. Planet Earth will continue to turn, however, and data will continue
to be collected. At the same time, more and more rivers and streams will become regulated,
and the need to be able to estimate regulated flood frequency relationships will become more
acute. But what are the most important types and quantities of data that will be needed to
accomplish this estimation? The integrated modeling approach presented here is a tool which
can be applied in a systematic way to answer this question. This modeling approach can
therefore be employed as a guide to point the way in future data collection and archival efforts.
Bibliography
Adamowski, K. (1985) Nonparametric kernel estimation of flood frequencies. Water Resour.
Res., 21, 1885-1890.
Adamowski, K., and W. Feluch. (1990) Nonparametric flood-frequency analysis with
historical information. ASCE Jour. Hydr. Engr., 116, 1035-1047.
Benson, M.A. (1962) Evolution of methods for evaluating the occurrence of floods. U.S.
Geological Survey Water Supply Paper 1580-A, Washington, D.C.
Bobée, B., and F. Ashkar. (1991) The gamma family and derived distributions in hydrology.
Water Resources Publications, Littleton, Colorado.
Bradley, A.A., and K.W. Potter. (1992) Flood frequency analysis of simulated flows. Water
Resour. Res., 28, 2375-2385.
Bras, R.L., D.R. Gaboury, D.S. Grossman, and G.J. Vicens. (1985) Spatially varying rainfall
and floodrisk analysis. ASCE Jour. Hydr. Engr., 111, 754-773.
Chapra, S.C., and R.P. Canale. (1988) Numerical methods for engineers. 2nd ed.
McGraw-Hill, New York.
Cleveland, W.S. (1979) Robust locally-weighted regression and smoothing scatterplots. Jour.
Amer. Stat. Assoc., 74, 829-836.
Dalrymple, T. (1960) Flood-frequency analyses. U.S. Geological Survey Water Supply Paper
1543-A, Washington, D.C.
Diehl, T., and K.W. Potter. (1986) Mixed flood distributions in Wisconsin, paper presented
at the International Symposium on Flood Frequency and Risk Analysis, Louisiana State
University, Baton Rouge, Louisiana.
Durrans, S.R. (1994) Bayesian approach to skewness bias correction for Pearson Type 3
populations. Jour. Hydrol., 161, 155-168.
Dyhouse, G.R. (1985) Stage-frequency analysis at a major river junction. ASCE Jour. Hydr.
Engr., 111, 565-583.
Eagleson, P.S. (1972) Dynamics of flood frequency. Water Resour. Res., 8, 878-898.
Franz, D.D., B.A. Kraeger, and R.K. Linsley. (1986) A system for generating long
streamflow records for study of floods of long return period, paper presented at the
International Symposium on Flood Frequency and Risk Analysis, Louisiana State
University, Baton Rouge, Louisiana.
Gelb, A. (ed.) (1989) Applied optimal estimation. The MIT Press, Cambridge, Mass.
Haan, C.T. (1977) Statistical methods in hydrology. Iowa State University Press, Ames.
Hardison, C.H. (1974) Generalized skew coefficients of annual floods in the United States and
their application. Water Resour. Res., 10, 745.
Hazen, A. (1930) Flood flows: A study of frequencies and magnitudes. John Wiley and Sons,
New York.
Hirschboeck, K.K. (1985) Hydroclimatology of flow events in the Gila River basin, central and
southern Arizona, Ph.D. Dissertation, University of Arizona, Tucson, Arizona.
Hirschboeck, K.K. (1986) Hydroclimatically-defined mixed distributions in partial duration
flood series, paper presented at the International Symposium on Flood Frequency and Risk
Analysis, Louisiana State University, Baton Rouge, Louisiana.
Hosking, J.R.M., and J.R. Wallis. (1988) The effect of intersite dependence on regional flood
frequency analysis. Water Resour. Res., 24, 588-600.
Houghton, J.C. (1978a) Birth of a parent: The Wakeby distribution for modeling flood
flows. Water Resour. Res., 14, 1105-1110.
Houghton, J.C. (1978b) The incomplete means estimation procedure applied to flood
frequency analysis. Water Resour. Res., 14, 1111-1115.
Hoyt, W.G., and W.B. Langbein. (1955) Floods. Princeton University Press, New Jersey.
Jarrett, R.D., and J.E. Costa. (1982) Multi-disciplinary approach to the flood hydrology of
foothill streams in Colorado, in International Symposium on Hydrometeorology, A.I.
Johnson and D.A. Clark (eds.), 565-569, Amer. Water Resour. Assoc., Bethesda,
Maryland.
Jennings, M.E., and M.A. Benson. (1969) Frequency curves for annual series with some zero
events or incomplete data. Water Resour. Res., 5, 276-280.
Kite, G.W. (1977) Frequency and risk analyses in hydrology. Water Resources Publications,
Littleton, Colorado.
Kuczera, G. (1982) Robust flood frequency models. Water Resour. Res., 18, 315-324.
Lall, U., and L.R. Beard. (1982) Estimation of Pearson Type 3 moments. Water Resour. Res.,
18, 1563-1569.
Lall, U., and K. Bosworth. (1994) Multivariate kernel estimation of functions of space and
time, in Time Series Analysis in Hydrology and Environmental Engineering, K.W. Hipel,
A.I. McLeod, U.S. Panu, and V.P. Singh (eds.), 301-315, Kluwer Academic Publishers,
Dordrecht, The Netherlands.
Landwehr, J.M., N.C. Matalas, and J.R. Wallis. (1978) Some comparisons of flood statistics
in real and log space. Water Resour. Res., 14, 902.
Langbein, W.B. (1958) Queuing theory and water storage. Jour. Hydr. Div., ASCE, 84,
1811-1 to 1811-24.
Laurenson, E.M. (1973) Effect of dams on flood frequency. Proc., International Symposium
on River Mechanics, 9-12 January, Bangkok, Thailand, International Association for
Hydraulic Research.
Laurenson, E.M. (1974) Modeling of stochastic-deterministic hydrologic systems. Water
Resour. Res., 10, 955-961.
Lettenmaier, D.P., J.R. Wallis, and E.F. Wood. (1987) Effect of regional heterogeneity on
flood frequency estimation. Water Resour. Res., 23, 313-323.
Loucks, D.P., J.R. Stedinger, and D.A. Haith. (1981) Water Resource SystemsPlanning and
Analysis. Prentice Hall, Englewood Cliffs, New Jersey.
Mimikou, M. (1983) A study of drainage basin linearity and nonlinearity. Jour. Hydrol., 64,
113-134.
Moon, Y.-I., and U. Lall. (1994) Kernel quantile function estimator for flood frequency
analysis. Water Resour. Res., 30, 3095-3103.
Moran, P.A.P. (1959) The theory of storage. Methuen, London.
Moughamian, M.S., D.B. McLaughlin, and R.L. Bras. (1987) Estimation of flood frequency:
an evaluation of two derived distribution procedures. Water Resour. Res., 23, 1309-1319.
Muzik, I. (1994) Understanding flood probabilities, in Extreme Values: Floods and Droughts,
K.W. Hipel (ed.), 199-207, Kluwer Academic Publishers, Dordrecht, The Netherlands.
NRC (1988) Estimating probabilities of extreme floods: methods and recommended research.
National Research Council, Washington, D.C.
Rodriguez-Iturbe, I., and J.B. Valdés. (1979) The geomorphologic structure of hydrologic
response. Water Resour. Res., 15, 1409-1420.
Rogers, W.F. (1980) A practical model for linear and nonlinear runoff. Jour. Hydrol., 46,
51-78.
Rogers, W.F. (1982) Some characteristics and implications of drainage basin linearity and
nonlinearity. Jour. Hydrol., 55, 247-265.
Rogers, W.F., and H.A. Zia. (1982) Linear and nonlinear runoff from large drainage basins.
Jour. Hydrol., 55, 267-278.
Sanders, C.L., Jr., H.E. Kubik, J.T. Hoke, Jr., and W.H. Kirby. (1990) Flood frequency of
the Savannah River at Augusta, Georgia. U.S. Geological Survey Water Resour. Invest.
Rpt. 90-4024.
SCS (1969) National Engineering Handbook, Section 4, Hydrology. U.S. Soil Conservation
Service, Washington, D.C.
Silverman, B.W. (1986) Density estimation for statistics and data analysis. Chapman and Hall,
New York.
Singh, K.P., and R.A. Sinclair. (1972) Two-distribution method for flood-frequency analysis.
ASCE Jour. Hydr. Engr., 98, 29-45.
Singh, V.P., and H. Aminian. (1986) An empirical relation between volume and peak of
direct runoff. Water Resour. Bull., 22, 725-730.
Stuart, A., and J.K. Ord. (1987) Kendall’s advanced theory of statistics. Oxford University
Press, New York.
Tasker, G.D. (1978) Flood frequency analysis with a generalized skew coefficient. Water
Resour. Res., 14, 373.
Thomas, W.O., Jr. (1982) An evaluation of flood frequency estimates based on rainfall/runoff
modeling. Water Resour. Bull., 18, 221-230.
Waylen, P., and M.-K. Woo. (1982) Prediction of annual floods generated by mixed
processes. Water Resour. Res., 18, 1283-1286.