Journal of Hydrology (2008) 360, 67–76
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/jhydrol

Homogeneity testing: How homogeneous do heterogeneous cross-correlated regions seem?

A. Castellarin a,*, D.H. Burn b, A. Brath a

a DISTART, School of Civil Engineering, Viale Risorgimento, 2, University of Bologna, I-40136 Bologna, Italy
b Department of Civil and Environmental Engineering, University of Waterloo, 200 University Avenue West, Waterloo, ON, Canada N2L 3G1

Received 30 November 2007; received in revised form 8 July 2008; accepted 11 July 2008

KEYWORDS Regional flood frequency analysis; Probability weighted moments (PWM); L-moments; Variance of sample estimators; Hosking and Wallis heterogeneity test

Summary The homogeneity of the flood frequency regime for a given pooling-group of sites is a fundamental assumption for many regional flood frequency analysis techniques. Assessing regional homogeneity is a critical step, which may be complicated by the presence of cross-correlation among flood sequences. The scientific literature proposes a number of statistical homogeneity tests and documents that inter-site correlation of floods is normally not negligible, but does not specifically address the impact of cross-correlation on such statistical tests. This paper analyzes the effectiveness of a well-known homogeneity test proposed in the scientific literature in the presence of inter-site cross-correlation through a series of Monte Carlo experiments. The numerical experiments enable us to comment on a possible theoretical correction for the test and to identify an empirical tool that accounts for the impact of inter-site cross-correlation of floods.
© 2008 Elsevier B.V. All rights reserved.
* Corresponding author. Tel.: +39 051 209 3365. E-mail address: [email protected] (A. Castellarin).
0022-1694/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jhydrol.2008.07.014

Introduction

A crucial task in designing, constructing and operating river engineering works or hydraulic structures is flood risk assessment, which is usually quantified for a given site as the flood magnitude associated with the recurrence interval T (the so-called T-year flood). Regional (or pooled) flood frequency analysis is widely employed in the estimation of the flooding potential to avoid unreliable extrapolation when dealing with data record lengths that are short compared to the recurrence interval of interest.

The traditional approach to regional flood frequency analysis involves the identification of regions, or pooling-groups, of sites that are homogeneous in terms of flood frequency regime (see e.g., Dalrymple, 1960; Burn, 1990). The homogeneity of the group of sites is a fundamental requirement in order to perform an effective regional estimation of the T-year quantile (e.g., Lettenmaier et al., 1987; Stedinger and Lu, 1995).

The literature proposes a number of homogeneity tests; a few examples are summarised here. Since the introduction of the first statistical tests for assessing the degree of homogeneity of a given group of sites, hydrologists have been very clear about the necessity of identifying effective measures of regional heterogeneity. Dalrymple (1949, 1960) proposed a test, described in several classical textbooks (e.g., Chow, 1964; Singh, 1992), that judges homogeneity by analysing the variability of the sample coefficients of variation (Cv) and/or skewness (Cs) among sites. The Dalrymple test became very popular among practitioners, until Wiltshire (1986a,b,c) highlighted its limited discriminatory power and proposed two alternatives.
Lu (1991) and Lu and Stedinger (1992) recommended a homogeneity test based upon the variability of normalised at-site Generalized Extreme Value (GEV, see e.g. Jenkinson, 1955) distribution flood quantiles. Hosking and Wallis (1993, 1997) proposed a heterogeneity measure that is a standardised measure of the intersite variability of L-moment ratios (see e.g., Hosking, 1990). The Hosking and Wallis (1993, 1997) heterogeneity measures are now routinely used by hydrologists to test regional homogeneity (see e.g., Viglione et al., 2007). Even though a plethora of homogeneity tests have been proposed, the literature presents only a few studies that compare systematically the power of tests (see e.g., Fill and Stedinger, 1995; Viglione et al., 2007). The recent analysis by Viglione et al. (2007) compares the performance of L-moment based tests (e.g., Hosking and Wallis, 1993) and non-parametric rank tests (Scholz and Stephens, 1987; Durbin and Knott, 1971) and shows that rank based tests outperform L-moment based tests for highly skewed flood frequency regimes. Classical studies document that intersite correlation among flood flows observed at different sites is typically not negligible (see e.g., Matalas and Langbein, 1962; Stedinger, 1983). The impact of cross-correlation on trend tests is clearly pointed out by Douglas et al. (2000). Research by Madsen and Rosbjerg (1997) and Madsen et al. (2002) indicates that cross-correlation can affect homogeneity testing. Recently it has been shown how a probabilistic statement can be attached to regional envelope curves (RECs) identified for homogeneous regions (Castellarin et al., 2005; Castellarin, 2007; Vogel et al., 2007). The authors present a probabilistic interpretation of RECs and propose an empirical estimator of the exceedance probability of a REC that takes into account the effect of intersite correlation. 
Regional homogeneity is a fundamental prerequisite for applying the estimator proposed by the authors, thus homogeneity testing in the presence of cross-correlation becomes a critical step. We assess how intersite correlation impacts the homogeneity test proposed by Hosking and Wallis (1993, 1997). This issue is discussed through theoretical considerations of the variance of a regional estimator of L-moments for cross-correlated annual sequences and a series of numerical experiments. The results of the study enable us to: (1) show and quantify the loss of performance of the test associated with the presence of intersite correlation; (2) comment on a possible theoretical correction of the test; and (3) identify an empirical tool for adjusting the original test for cross-correlated regions.

Hosking and Wallis homogeneity test

Hosking and Wallis (1993, 1997) proposed a statistical test for assessing the homogeneity of a group of basins at three different levels by focusing on three measures of dispersion for different orders of the sample L-moment ratios (see Hosking, 1990, for an explanation of L-moments).

1. A measure of dispersion for the L-Cv,

$$V_1 = \sum_{i=1}^{R} n_i \left(t_{2(i)} - \bar{t}_2\right)^2 \Big/ \sum_{i=1}^{R} n_i. \qquad (1)$$

2. A measure of dispersion for both the L-Cv and the L-Cs coefficients in the L-Cv–L-Cs space,

$$V_2 = \sum_{i=1}^{R} n_i \left[\left(t_{2(i)} - \bar{t}_2\right)^2 + \left(t_{3(i)} - \bar{t}_3\right)^2\right]^{1/2} \Big/ \sum_{i=1}^{R} n_i. \qquad (2)$$

3. A measure of dispersion for both the L-Cs and the L-kurtosis coefficients in the L-Cs–L-kurtosis space,

$$V_3 = \sum_{i=1}^{R} n_i \left[\left(t_{3(i)} - \bar{t}_3\right)^2 + \left(t_{4(i)} - \bar{t}_4\right)^2\right]^{1/2} \Big/ \sum_{i=1}^{R} n_i, \qquad (3)$$

where $\bar{t}_2$, $\bar{t}_3$, and $\bar{t}_4$ are the group means of L-Cv, L-Cs, and L-kurtosis, respectively; $t_{2(i)}$, $t_{3(i)}$, $t_{4(i)}$, and $n_i$ are the values of L-Cv, L-Cs, L-kurtosis and the sample size for site i; and R is the number of sites in the pooling group. The underlying concept of the test is to measure the sample variability of the L-moment ratios and compare it to the variation that would be expected in a homogeneous group.
The expected mean value and standard deviation of these dispersion measures for a homogeneous group, namely $\mu_{V_k}$ and $\sigma_{V_k}$, are assessed through repeated simulations, by generating homogeneous groups of basins having the same record lengths as those of the observed data. To avoid any undue commitment to a particular three-parameter distribution, the authors recommend the four-parameter kappa distribution to generate the synthetic groups of flood sequences. The kappa distribution includes as special cases several well known two- and three-parameter distributions (see e.g., Hosking and Wallis, 1997; Castellarin et al., 2007). The heterogeneity measures are then evaluated using the following expression

$$H_k = \frac{V_k - \mu_{V_k}}{\sigma_{V_k}}, \quad \text{for } k = 1, 2, 3. \qquad (4)$$

Hosking and Wallis suggest that a group of sites may be regarded as "acceptably homogeneous" if $H_k < 1$, "possibly heterogeneous" if $1 \le H_k < 2$, and "definitely heterogeneous" if $H_k \ge 2$. According to Hosking and Wallis (1993, pp. 277–278), "if H were used as a significance test, then the criterion for rejection of the hypothesis of homogeneity at significance level 10%, assuming normality for the distribution of V would be H = 1.28. In comparison, a criterion H = 1 may seem very strict, but as noted above, we do not seek to use H in a significance test". The authors instead regard the reference values as guidelines, taking for instance H = 1 as the borderline beyond which a redefinition of the region may lead to a meaningful increase in the accuracy of the quantile estimate.

Concerning the possible effects of cross-correlation on the test, the authors state that positive correlation among sites is the most likely cause for negative values of $H_k$, and large negative values, say $H_k < -2$, are likely to be associated with a large amount of cross-correlation (Hosking and Wallis, 1997, p. 71).
Since the synthetic sequences are uncorrelated by definition, the sample variability of L-moment ratios for the synthetic group of sites is expected to be higher than the sample variability for the original sequences when cross-correlation is present, even when real and synthetic regions are characterised by the same degree of heterogeneity. Therefore, cross-correlation may result in large negative values of Hk when the group of sites is homogeneous as suggested by the authors. More importantly, lower (rather than negative) values of Hk may cause a miscategorisation of the group of sites, suggesting to regard a heterogeneous group of sites as possibly homogeneous. Fig. 1 schematically illustrates three possible categorisation errors. The figure reports the Hk values computed as recommended by Hosking and Wallis for an un-correlated region on the x-axis, and the values of the same measure for a cross-correlated region with the same degree of heterogeneity in terms of L-moments on the y-axis. Practically speaking, the y-axis reports the Hk values that are actually obtained from the application of the test to the real group of sequences, the x-axis reports the Hk values for a hypothetical uncorrelated group of sites with the same degree of heterogeneity. The homogeneity testing should be based upon these latter Hk values but, unfortunately, they are unknown. Depending on the amount of cross-correlation of the real group of sequences, the scheme identifies the following categorisation errors (the darker the colour, the worse the error): Error 1 – possibly heterogeneous cross-correlated region categorized as acceptably homogeneous; Error 2 – definitely heterogeneous region categorized as possibly heterogeneous; Error 3 – definitely heterogeneous region categorized as acceptably homogeneous. The remainder of our paper analyzes the possibility of occurrence of these errors from both a theoretical and an empirical perspective. 
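The dispersion measure of Eq. (1), the standardisation of Eq. (4) and the three reference categories can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the group mean is assumed to be weighted by record length, and `mu_v` and `sigma_v` are assumed to come from the kappa-distribution simulations described above, which are not reproduced here.

```python
def dispersion_v1(t2, n):
    """Record-length-weighted dispersion of at-site L-Cv values, Eq. (1)."""
    total = sum(n)
    # group mean L-Cv, assumed here to be weighted by record length
    t2_bar = sum(ni * ti for ni, ti in zip(n, t2)) / total
    return sum(ni * (ti - t2_bar) ** 2 for ni, ti in zip(n, t2)) / total


def heterogeneity(v_obs, mu_v, sigma_v):
    """Heterogeneity measure of Eq. (4); mu_v and sigma_v come from simulation."""
    return (v_obs - mu_v) / sigma_v


def classify(h):
    """Reference categories suggested by Hosking and Wallis."""
    if h < 1:
        return "acceptably homogeneous"
    if h < 2:
        return "possibly heterogeneous"
    return "definitely heterogeneous"


if __name__ == "__main__":
    # illustrative numbers only: two sites with 10 years of record each
    v1 = dispersion_v1([0.2, 0.3], [10, 10])
    print(v1, classify(heterogeneity(3.0, 1.0, 2.0)))
```

Because the reference values `mu_v` and `sigma_v` are obtained from uncorrelated synthetic regions, any cross-correlation in the observed sequences lowers `v_obs` relative to them, which is the mechanism behind the categorisation errors discussed above.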
Information content of regional moments

It is well known that inter-site correlation is generally not negligible for flood sequences (see e.g., Matalas and Langbein, 1962; Stedinger, 1983; Troutman and Karlinger, 2003; Rosbjerg, 2007 and Fig. 2) and leads to increases in the variance of regional flood statistics (see for instance Hosking and Wallis, 1988). For the case of R spatially correlated flood series with constant population mean and variance, each with record length n, Yule (1945) and Matalas and Langbein (1962, Eq. (16)) document that the variance of a regional mean is inflated by a factor that depends on $\bar{\rho}$, the average correlation among the sites

$$\mathrm{Var}[\bar{x} \mid \bar{\rho}] = \frac{\sigma_X^2}{Rn}\,[1 + \bar{\rho}\,(R-1)], \qquad (5)$$

where $\bar{x}$ indicates the regional sample mean and $\sigma_X^2$ the population variance of the R series, each of length n; if $\bar{\rho} = 0$ the variance of $\bar{x}$ reduces to $\sigma_X^2/(Rn)$. Matalas and Langbein (1962) defined the relative information content of R spatially correlated flood series, each of length n, as the ratio,

$$I = \frac{\mathrm{Var}[\bar{x}]}{\mathrm{Var}[\bar{x} \mid \bar{\rho}]} = [1 + \bar{\rho}\,(R-1)]^{-1}. \qquad (6)$$

The information content of the mean, I, in Eq. (6), is measured relative to the variance of the mean associated with spatially and serially uncorrelated flows. Hence I = 1 when $\bar{\rho} = 0$ and I < 1 when $\bar{\rho} > 0$. Values of I < 1 reflect the fact that intersite correlation reduces the overall information content of the regional sample. The effective number of regional samples associated with estimation of the regional mean, $R_{\bar{x}}$, is then,

$$R_{\bar{x}} = R\,I. \qquad (7)$$

Figure 1 Possible categorization errors due to the presence of cross-correlation. (x-axis: H values, uncorrelated region; y-axis: H values, cross-correlated region; zones marked Err. 1, Err. 2 and Err. 3.)

Figure 2 Empirical cross-correlation coefficients for a group of 32 Italian and 226 US annual flood sequences (see Castellarin, 2007 and Vogel et al., 2007).

Stedinger (1983, Eq. (35)) derived the variance of an estimate of the regional variance of R cross-correlated and normally distributed series, each of length n, as,

$$\mathrm{Var}\left[s_X^2 \mid \overline{\rho^2}\right] = \frac{2\sigma_X^4}{R(n-1)}\left[1 + \overline{\rho^2}\,(R-1)\right] = \mathrm{Var}\left[s_X^2\right]\left[1 + \overline{\rho^2}\,(R-1)\right], \qquad (8)$$

where $s_X^2$ stands for the estimator of the regional variance that, for samples of different length, is a weighted average in which each sample variance is weighted proportionally to the record length of the corresponding site, and $\overline{\rho^2}$ is the average squared correlation of concurrent flows. Analogous to the effective number of regional samples (sites) for estimation of the regional mean given in Eq. (7), the effective number of regional samples (sites) for the estimation of the regional variance, $R_{s_X^2}$, can be computed as follows,

$$R_{s_X^2} = R\left[1 + \overline{\rho^2}\,(R-1)\right]^{-1}. \qquad (9)$$

In general, $\overline{\rho^2}$ is smaller than $\bar{\rho}$ and therefore Eq. (9) returns a number of effective sites that is higher than the number returned by Eq. (7).

Figure 3 Variance of regional average PWMs, L-moments and L-moment ratios for cross-correlated regions obtained from Monte Carlo simulation (20,000 replicates) and computed through the information content of the mean and conventional moments (see Matalas and Langbein, 1962; Stedinger, 1983). (Panels: Probability Weighted Moments, b0 to b3; L moments, l1 to l4; L moments ratios, t2 to t4. x-axis: Monte Carlo experiments; y-axis: information content of the mean for PWMs, information content of conventional moments for L-moments and L-moment ratios.)

Figure 4 Schematic of the simulation options: 1HETSITE, single discordant site, and BIMODAL, half discordant series (regular catchments: dark grey; discordant catchments: light grey).

The indexes $H_k$ (with k = 1, 2 and 3) proposed by Hosking and Wallis (1993) measure the inter-group variability of sample L-moments and therefore are impacted by the presence of intersite correlation, which inflates the variance of regional moments (as well as L-moments and L-moment ratios). The derivation of the relative information content of regional L-moments and L-moment ratios is critical to the quantification of the impact of cross-correlation on the Hosking and Wallis (1993) homogeneity test. To derive indications of the information content of regional L-moments and L-moment ratios one may refer to the Probability Weighted Moments (PWM, see Greenwood et al., 1979), of which L-moments are linear combinations. We considered the class of PWMs for which the moment of order r reads,

$$\beta_r = E\left\{X\,[F_X(x)]^r\right\}, \qquad (10)$$

where $F_X(x)$ is the cumulative distribution function (CDF) of the random variable X and $E\{\cdot\}$ is the expectation. The unbiased estimator of $\beta_r$ (see e.g., Greenwood et al., 1979) reads,

$$b_r = n^{-1}\binom{n-1}{r}^{-1}\sum_{j=r+1}^{n}\binom{j-1}{r}\,x_{j:n}, \qquad (11)$$

in which n is the length of the series and $x_{j:n}$ is the jth order statistic, that is the jth value of a sample of length n arranged in ascending order ($b_0$ is the sample mean). An unbiased sample estimator of the L-moment of order r + 1 is then defined as,

$$\ell_{r+1} = \sum_{k=0}^{r} p_{r,k}\,b_k, \quad \text{where } p_{r,k} = (-1)^{r-k}\binom{r}{k}\binom{r+k}{k}, \quad r = 0, 1, \ldots, n-1 \qquad (12)$$

($\ell_1$ is the sample mean). L-moment ratios of order 2 and r + 1 > 2 can then be estimated as,

$$t_2 = \ell_2/\ell_1 \quad \text{and} \quad t_{r+1} = \ell_{r+1}/\ell_2, \qquad (13)$$

where we adopt the same notation used in Eqs. (1)–(3), that is $t_2$ is the sample estimator of L-Cv, $t_3$ of L-Cs, and $t_4$ of L-kurtosis. The regional average PWM or L-moment can in general be written as (see e.g.
Hosking and Wallis, 1997),

$$M_s^{Reg.} = \frac{\sum_{i=1}^{R} n_i\,M_{s,i}}{\sum_{i=1}^{R} n_i}, \qquad (14)$$

where $M_s^{Reg.}$ is the sample estimator of the regional moment of order s (i.e., PWM, L-moment, L-moment ratio), $M_{s,i}$ is the at-site sample estimate of the same moment for site i, $n_i$ is the length of the sequence at site i and R is the number of sites in the region (or pooling group). For the sake of simplicity, the remainder of the paper considers regional samples consisting of concurrent annual sequences of equal length n.

Sampling properties of L-moments are analysed by Sankarasubramanian and Srinivasan (1999), whereas Elamir and Seheult (2004) present the exact variance structure of sample PWMs and L-moments, but the sampling properties of regional PWMs and L-moments for cross-correlated sequences have not been addressed yet. We analysed this issue through Monte Carlo simulation experiments. We generated 20,000 synthetic regional samples from the multivariate normal distribution with constant mean (i.e., 10) and variance (i.e., 1). Regional samples consist of R = 10, 20, 30 concurrent and cross-correlated sequences of length n = 10, 25, 50 years and cross-correlation $\rho$ ranging from 0 to 0.8 with step 0.2. We then computed the values of the empirical variance of regional moments (PWMs up to order 3, L-moments up to order 4 and corresponding L-moment ratios) for the 20,000 replicates of each set of R, n and $\rho$ values and we adopted the same form of the regional information content derived by Matalas and Langbein (1962) and Stedinger (1983) to express these values as a function of: (i) the empirical variance of the corresponding regional moment for the uncorrelated case (R, n, $\rho$ = 0); (ii) R; and (iii) $\bar{\rho}$ or $\overline{\rho^{r+1}}$, depending on the considered moment. The results can be summarised as follows,

$$\begin{aligned}
\widehat{\mathrm{Var}}\left[b_r^{Reg.} \mid \bar{\rho}\right] &\approx \widehat{\mathrm{Var}}\left[b_r^{Reg.}\right]\left[1 + \bar{\rho}\,(R-1)\right] && \text{for } r = 0, 1, 2 \text{ and } 3;\\
\widehat{\mathrm{Var}}\left[\ell_{r+1}^{Reg.} \mid \overline{\rho^{r+1}}\right] &\approx \widehat{\mathrm{Var}}\left[\ell_{r+1}^{Reg.}\right]\left[1 + \overline{\rho^{r+1}}\,(R-1)\right] && \text{for } r = 0, 1, 2 \text{ and } 3;\\
\widehat{\mathrm{Var}}\left[t_{r+1}^{Reg.} \mid \overline{\rho^{r+1}}\right] &\approx \widehat{\mathrm{Var}}\left[t_{r+1}^{Reg.}\right]\left[1 + \overline{\rho^{r+1}}\,(R-1)\right] && \text{for } r = 1, 2 \text{ and } 3;
\end{aligned} \qquad (15)$$

where $\widehat{\mathrm{Var}}[\cdot]$ indicates the empirical variance resulting from Monte Carlo simulations. The scatterplots of Fig. 3 report on the x-axis the left term of Eq. (15) and the right terms on the y-axis. Each point refers to a particular set of R, n, and $\rho$ values. This evidence can be summarised by saying that the information content of regional PWMs coincides with the information content of the mean, regardless of the order of the moment, whereas the information content of regional L-moments and L-moment ratios coincides with the information content of conventional moments of the same order. This result is consistent with the fact that sample PWMs of any order are linear combinations of observations, and therefore employing a first order Taylor series approximation to the variance of sample PWMs (see e.g., Castellarin et al., 2005) all covariance terms are equal to zero. The same consideration does not apply to sample L-moments, which are linear combinations of sample PWMs of different orders.

Figure 5 Monte Carlo simulations: average H1 values computed from 10,000 replicates with different degrees of cross-correlation and heterogeneity (1HETSITE: single discordant site with Cv = Cv*, BIMODAL: half discordant series with Cv = Cv*). (Panels: 1HETSITE and BIMODAL, 20 sites, 25 years; x-axis: Cv* (discordant series); y-axis: H value; series: uncorrelated and cross-correlation 0.2, 0.4, 0.6, 0.8.)
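The estimators of Eqs. (11) and (12) and the inflation factors summarised in Eq. (15) can be checked with a small simulation. The sketch below is illustrative only: the function names are hypothetical, equicorrelated normal sequences are generated with a shared yearly factor (an assumption made here for brevity, equivalent to a constant correlation $\rho$ between sites), and far fewer replicates are used than the 20,000 of the study.

```python
from math import comb

import numpy as np


def sample_pwm(x, r):
    """Unbiased sample PWM b_r of Eq. (11), computed along the last axis of x."""
    xs = np.sort(x, axis=-1)                      # order statistics x_{j:n}
    n = xs.shape[-1]
    # weights C(j-1, r) / C(n-1, r) for j = 1..n (zero whenever j - 1 < r)
    w = np.array([comb(j - 1, r) for j in range(1, n + 1)]) / comb(n - 1, r)
    return (xs * w).sum(axis=-1) / n


def sample_lmoment(x, order):
    """Sample L-moment of the given order (order = r + 1 in Eq. (12))."""
    r = order - 1
    return sum(
        (-1) ** (r - k) * comb(r, k) * comb(r + k, k) * sample_pwm(x, k)
        for k in range(r + 1)
    )


def regional_variance(stat, n_sites, n_years, rho, reps, rng):
    """Empirical variance of the regional average of `stat` over `reps` regions."""
    common = rng.standard_normal((reps, 1, n_years))        # shared yearly factor
    local = rng.standard_normal((reps, n_sites, n_years))   # site-specific noise
    x = 10.0 + np.sqrt(rho) * common + np.sqrt(1.0 - rho) * local
    return stat(x).mean(axis=1).var(ddof=1)                 # equal record lengths


def inflation(stat, n_sites, n_years, rho, reps, rng):
    """Ratio of the regional variances with and without cross-correlation."""
    correlated = regional_variance(stat, n_sites, n_years, rho, reps, rng)
    independent = regional_variance(stat, n_sites, n_years, 0.0, reps, rng)
    return correlated / independent


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R, n, rho = 10, 25, 0.6
    cases = [("b0", lambda x: sample_pwm(x, 0), 1),
             ("l2", lambda x: sample_lmoment(x, 2), 2)]
    for name, stat, power in cases:
        ratio = inflation(stat, R, n, rho, 4000, rng)
        print(name, round(ratio, 2), "vs", round(1 + rho**power * (R - 1), 2))
```

For the regional mean the factor $1 + \rho\,(R-1)$ is exact for equicorrelated normal data, so the first printed ratio should sit close to it; for the second L-moment the agreement with $1 + \rho^2\,(R-1)$ is the approximate, empirical relationship reported in Eq. (15).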
Figure 6 Monte Carlo simulations, average H1 values: uncorrelated vs. cross-correlated regions for different degrees of cross-correlation and heterogeneity. (Panels: 1HETSITE and BIMODAL for 10, 20 and 30 sites, with 10, 25 and 50 years; x-axis: H values, uncorrelated region; y-axis: H values, cross-correlated region; series: uncorrelated and cross-correlation 0.2, 0.4, 0.6, 0.8.)

The relative information content of regional L-moment ratios is critical to the interpretation of how cross-correlation among flood sequences may impact Hosking and Wallis (1993) heterogeneity measures. For example, H1 measures the dispersion of sample L-Cv values for the group of series. If cross-correlation is present, the expected variance of regional L-Cv for the original R series corresponds to the variance of $R_{s_X^2} < R$ independent sequences, with $R_{s_X^2}$ expressed by Eq. (9). Using as reference in Eq. (4) $\mu_{V_1}$ and $\sigma_{V_1}$ values estimated for R synthetic and independent series may severely diminish the significance of the test. Analogous considerations hold for measures H2 and H3.

Monte Carlo experiments

We assessed the sensitivity of H1 to cross-correlation by adopting a Monte Carlo simulation algorithm similar to the algorithm used by Hosking and Wallis (1988) and by Castellarin et al. (2005) (see Appendix). We repeatedly generated 10,000 synthetic regions with given degrees of regional heterogeneity and cross-correlation.
Each synthetic region consists of R spatially correlated flood series of length n, with cross-correlation among the sequences equal to $\rho$. The following values were adopted for the experiments: R = 10, 20, 30 series; n = 10, 25, 50 years and $\rho$ = 0.2, 0.4, 0.6, 0.8 and 0.0 (uncorrelated case). We selected as regional parent distribution for generating the sequences of regular sites the Gumbel distribution with unit mean and Cv equal to 0.4, EV1(1, 0.4) (see e.g., Hosking and Wallis, 1988), whereas we generated the discordant series from an EV1(1, Cv*), with Cv* = 0.1, 0.2, . . . , 0.7.

The study adopts two different generation options. The first option (1HETSITE) considers synthetic regions in which all series but one are regular, and only one discordant series is present. The second option (BIMODAL) generates R/2 sequences from an EV1(1, 0.4) and R/2 from an EV1(1, Cv*). Fig. 4 illustrates a schematic of the two options, reporting regular (dark grey) and discordant (light grey) catchments.

Figure 7 Monte Carlo simulations, average H1 values: uncorrelated vs. cross-correlated regions for different degrees of cross-correlation, heterogeneity and parent distributions (black markers: GEV; white markers: EV1; average correlation: 0.2 triangles; 0.4 circles; 0.6 diamonds; length of the series 10 and 50 years). (Panels: 10 sites and 30 sites; x-axis: H values, uncorrelated region; y-axis: H values, cross-correlated region.)

For both options, 1HETSITE and BIMODAL, and all considered cases (R, n, $\rho$ and Cv*), we averaged the 10,000 H1 values. Fig. 5 illustrates for R = 20 series and n = 25 years the relationship between average H1 and Cv* for different degrees of cross-correlation, showing the strong control of cross-correlation on H1. Fig. 6 reports the results of the simulation experiments for 1HETSITE and BIMODAL options.
The figure illustrates the relationship between average H1 values for uncorrelated and cross-correlated regions with the same degree of heterogeneity, identified by Cv*. Since the results presented in Figs. 5 and 6 may be dependent on the particular parent distribution adopted for the numerical experiment, we repeated the 1HETSITE simulations for R = 10, 30 series, n = 10, 50 years and $\rho$ = 0.0, 0.2, 0.4, 0.6 using the GEV distribution instead of the EV1 distribution. The parameters of the distribution were set to obtain a skewness coefficient $\gamma$ = 2.25, which is almost twice as high as $\gamma$ for an EV1 distribution, a Cv equal to 0.4 for regular series and Cv* = 0.2, 0.4, 0.6 for discordant series and a unit mean. Fig. 7 compares the results obtained for the EV1 and GEV regional parents in terms of average H1 values.

Discussion

The results reported in Figs. 5–7 lead to a number of considerations. We present below the most significant ones.

The analysis of Figs. 5–7 suggests that the effects of cross-correlation on the discriminatory power of the Hosking and Wallis (1993) homogeneity test (i.e., H1 heterogeneity measure) are not negligible for plausible values of the cross-correlation coefficient among annual flood sequences ($\rho$ = 0.2–0.6, see e.g., Matalas and Langbein, 1962; Stedinger, 1983; Hosking and Wallis, 1988; Vogel et al., 2001; Castellarin, 2007). In particular, Figs. 6 and 7 show that, due to the presence of cross-correlation, categorization errors of type 1 and 2 (see Fig. 1) occur frequently, and categorization errors of type 3 may also occur. Fig. 6 clearly shows that the impact of cross-correlation in terms of the relationship between H1 values for correlated and uncorrelated regions is associated with the degree of cross-correlation and the number of sequences. Results for 1HETSITE and BIMODAL options show nearly coincident patterns, regardless of the length of the series n. Additionally, Fig. 7 points out that the results are independent of the regional parent distribution (EV1 or GEV) as long as the heterogeneity degree is of the same nature, as for the simulations performed here in which parent distributions of regular and discordant series differ in terms of Cv (or equivalently L-Cv) only. Finally, it is interesting to observe that the results reported in Figs. 6 and 7 for a given degree of cross-correlation among series show a significant linearity between average H1 values for cross-correlated and uncorrelated synthetic regions, at least for the range of H1 values considered in the figures.

Figure 8 Average H1 values for uncorrelated regions having the same heterogeneity degree of original cross-correlated regions obtained from simulation (Monte Carlo) and by applying Eq. (16) (black markers: GEV; white markers: EV1). (Panels: 1HETSITE and BIMODAL; x-axis: Monte Carlo; y-axis: Empirical Corrector.)

These considerations suggest identifying an empirical corrector of the homogeneity test based upon H1 for providing an approximate indication of the actual degree of heterogeneity of the region in the presence of cross-correlation. The form of the selected empirical corrector reads as follows,

$$H_{1,adj} = H_1 + C\,\overline{\rho^2}\,(R-1), \qquad (16)$$

where $H_{1,adj}$ is the adjusted value of the heterogeneity measure, $H_1$ is the value resulting from the homogeneity test, C is the empirical coefficient of the corrector that is assumed to be constant, $\overline{\rho^2}$ is the average squared correlation of concurrent flows and R the number of sequences in the region.
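As a hedged illustration of how Eq. (16) might be applied, the sketch below computes the average squared correlation from a correlation matrix and adjusts an H1 value. The function names are hypothetical; the default coefficient is the value C = 0.122 that the study identifies from its 1HETSITE – EV1 simulations, and it is tied to those experiments rather than being a general constant.

```python
def mean_squared_correlation(corr):
    """Average of the squared off-diagonal entries of a square correlation matrix."""
    r = len(corr)
    total = sum(corr[i][j] ** 2 for i in range(r) for j in range(r) if i != j)
    return total / (r * (r - 1))


def adjusted_h1(h1, rho_sq_mean, n_sites, c=0.122):
    """Eq. (16): H1 adjusted for cross-correlation; c is the study's fitted value."""
    return h1 + c * rho_sq_mean * (n_sites - 1)


if __name__ == "__main__":
    # illustrative region: 20 sites with a constant cross-correlation of 0.4
    n_sites, rho = 20, 0.4
    corr = [[1.0 if i == j else rho for j in range(n_sites)] for i in range(n_sites)]
    rho_sq = mean_squared_correlation(corr)      # 0.16 for a constant rho of 0.4
    print(adjusted_h1(0.8, rho_sq, n_sites))     # about 1.17
```

In this made-up example a region that the unadjusted test would label acceptably homogeneous (H1 = 0.8) moves into the possibly heterogeneous range once the correction is applied, which is an Error 1 case in the scheme of Fig. 1.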
The expression selected for the corrector reflects: (1) the discussion of Section 3 on the information content of regional L-Cv; and (2) the marked linearity between average H1 values for cross-correlated and uncorrelated synthetic regions obtained via Monte Carlo simulation. We identified the value of C in Eq. (16) from the average H1 values from simulation 1HETSITE – EV1 (i.e., option – regional parent distribution) through an ordinary least squares regression procedure. The identified value, C = 0.122, is associated with a Nash and Sutcliffe (1970) efficiency measure E = 0.981, with $E \in (-\infty, 1]$ and E = 1 for a perfect fit. The same C value applied to the average H1 values obtained for BIMODAL – EV1 returns E = 0.997, and the application for 1HETSITE – GEV results in E = 0.961. Fig. 8 shows the scatterplots resulting from the application of Eq. (16) for 1HETSITE – EV1 and 1HETSITE – GEV (left panel) and for BIMODAL – EV1 (right panel), highlighting the reduction of categorization errors due to the application of the empirical corrector.

It is important to note that the nature of Eq. (16) is empirical and the proposed value of the coefficient C is inevitably associated with the Monte Carlo simulation experiments performed in this study. Also, all simulations performed here refer to hypothetical regions (i.e., group of flood sequences) with constant cross-correlation coefficients among sequences and concurrent series of equal length. Generalisation of these aspects (see e.g., Castellarin, 2007) is an open problem for future analyses.

Conclusions

The main objective of the present study is to show that the presence of cross-correlation among annual flood sequences, which generally cannot be ignored in practice, may significantly reduce the power of statistical homogeneity tests.
The study refers explicitly to the heterogeneity measure proposed by Hosking and Wallis (1993) based on L-moments, but analogous considerations hold for all parametric and nonparametric tests proposed in the scientific literature.

First, the limitations of the considered test are examined from a theoretical viewpoint by considering the effects of sampling properties of L-moments for cross-correlated regions. We adopted the concept of information content of regional moments (see e.g. Matalas and Langbein, 1962; Stedinger, 1983) to express the impact of cross-correlation on the variance of regional L-moments, and we showed via a numerical experiment that the information content of regional L-moments corresponds to the information content of regional estimators of conventional moments.

Second, we assessed the effects of cross-correlation on the power of the homogeneity test through a series of Monte Carlo experiments. In particular, we quantified the impact of cross-correlation on one of the heterogeneity measures proposed by Hosking and Wallis, the measure that quantifies the regional heterogeneity in terms of the dispersion of sample L-Cv. We considered a number of different hypotheses on the size of the regional sample, heterogeneity degree, and regional parent distribution. The results indicate that cross-correlation exerts a strong control on the considered homogeneity test and may lead to mis-categorization of the groups of sequences (e.g., the group of sequences may be regarded as possibly homogeneous, when it should be regarded as heterogeneous).

Finally, we proposed an empirical corrector of the test that provides an approximate indication of the actual degree of heterogeneity of the region (or group of sequences) in the presence of cross-correlation.

Our study is approximate and represents a preliminary effort at a comprehensive insight into the effects of cross-correlation on homogeneity tests.
We are persuaded that this is an important issue, fundamental to regional frequency analysis, which has unjustifiably received very limited attention from the scientific community. Further aspects should be addressed by future numerical analysis for a generalisation of our study. Some relevant examples are: (i) the utilization of a plausible intersite correlation model instead of using a constant theoretical correlation coefficient among all sequences; (ii) a realistic variability of the series lengths within the region (e.g., missing data at some gauges, installation of new gauges, dismantlement of obsolete gauges, etc.) instead of considering concurrent sequences of equal length; and (iii) the assessment of the effects of cross-correlation on other homogeneity tests proposed in the scientific literature, parametric and non-parametric.

We refer in our study to one particular homogeneity test, in virtue of its notoriety and widespread utilization. Nevertheless, the conclusions we draw are general and suggest a scrupulous revision, or at least an informed application, of all homogeneity tests in the presence of significant intersite dependence.

Acknowledgements

The research was partially supported by the Italian MIUR (Ministry of Education, University and Research) through the research Grant titled "Characterisation of average and extreme flows in ungaged basins by integrated use of data-based methods and hydrological modelling". The suggestions and comments of two anonymous reviewers are gratefully acknowledged.

Appendix A. Simulation algorithm

The study adopts a simulation algorithm analogous to the algorithm introduced by Hosking and Wallis (1988) to generate a large number of cross-correlated synthetic regions.
The algorithm assumes that if each site's flood frequency distribution were transformed to normality using the transformation $F$, then the joint distribution for all sites in the region would be multivariate normal. The simulation algorithm involves two main steps: (1) generation of a vector $y$ having a multivariate normal distribution with a given correlation matrix $P$; (2) application of the inverse transformation $F^{-1}$ to obtain data with the required marginal distribution. A brief description of the simulation algorithm is given below:
1. Assume a region with $R$ sites having record length $n$ and parent distribution of the annual flood denoted as $F_X$ for the $R - R^*$ regular sites, and $F_X^*$ for the $R^*$ discordant sites. Also assume that the cross-correlation coefficient among all normalized floods is $\rho$, that is, the diagonal elements of the $R$-by-$R$ matrix $P$ are equal to one while the off-diagonal elements are all equal to $\rho$.
2. Generate the regional sample:
(1) Generate a matrix $z = [z_1, z_2, \ldots, z_R]$ with $R$ columns and $n$ rows. Each column $z_j$, $j = 1, 2, \ldots, R$, contains $n$ normal deviates with zero mean and unit variance, and each row of $z$ is a realization from the multivariate normal distribution with covariance matrix $P$.
(2) Transform the elements of the matrix $z$ belonging to regular sites into a realization from the correct marginal distribution by setting $x_{ij} = F_X^{-1}(\Phi(z_j^i))$, with $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, R - R^*$, where $z_j^i$ is the $i$-th element of $z_j$ and $\Phi$ is the cdf of the standard normal distribution.
(3) Analogously, transform the elements of the matrix $z$ belonging to discordant sites into a realization from their marginal distribution by setting $x_{ik} = (F_X^*)^{-1}(\Phi(z_k^i))$, where $i = 1, 2, \ldots, n$ and $k$ varies from $R - R^* + 1$ to $R$.
3. Calculate the heterogeneity measure $H_1$ for the synthetic region as indicated by Hosking and Wallis (1993, 1997).
4. Repeat steps 2 and 3 10,000 times and calculate the average of all $H_1$ values.
The algorithm described above was applied by considering two different options that we termed 1HETSITE and BIMODAL.
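The generation steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names (`gumbel_ppf`, `generate_region`, `sample_lcv`, `lcv_dispersion`) are hypothetical, the marginals are assumed to be Gumbel distributions parameterised by mean and Cv through the standard moment relations, and only the dispersion of the sample L-Cv (the quantity underlying $H_1$) is computed; the full $H_1$ of Hosking and Wallis additionally requires simulating from a fitted regional distribution, which is not reproduced here.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329
_erf = np.vectorize(math.erf)  # elementwise error function (avoids a SciPy dependency)


def gumbel_ppf(u, mean=1.0, cv=0.4):
    """Quantile function of a Gumbel (EV1) distribution with given mean and Cv."""
    scale = cv * mean * math.sqrt(6.0) / math.pi  # std = pi * scale / sqrt(6)
    loc = mean - EULER_GAMMA * scale              # mean = loc + gamma * scale
    return loc - scale * np.log(-np.log(u))


def generate_region(R, n, rho, n_discordant=1, cv=0.4, cv_discordant=0.6, seed=None):
    """Return an n-by-R array of cross-correlated synthetic annual floods.

    Columns 0 .. R - n_discordant - 1 are regular sites; the rest are discordant.
    """
    rng = np.random.default_rng(seed)
    # Step 1: equicorrelation matrix P (ones on the diagonal, rho elsewhere).
    P = np.full((R, R), float(rho))
    np.fill_diagonal(P, 1.0)
    # Step 2(1): n rows, each a multivariate normal vector with covariance P.
    z = rng.multivariate_normal(np.zeros(R), P, size=n)
    # Standard normal cdf, clipped so the Gumbel quantile stays finite.
    u = np.clip(0.5 * (1.0 + _erf(z / math.sqrt(2.0))), 1e-12, 1.0 - 1e-12)
    # Steps 2(2) and 2(3): map to the regular and discordant marginals.
    x = np.empty_like(z)
    k = R - n_discordant
    x[:, :k] = gumbel_ppf(u[:, :k], cv=cv)
    x[:, k:] = gumbel_ppf(u[:, k:], cv=cv_discordant)
    return x


def sample_lcv(x):
    """Sample L-Cv (l2 / l1) of each column, from unbiased probability weighted moments."""
    n = x.shape[0]
    xs = np.sort(x, axis=0)
    b0 = xs.mean(axis=0)
    w = np.arange(n) / (n - 1.0)          # (i - 1) / (n - 1) for i = 1..n
    b1 = (w[:, None] * xs).mean(axis=0)
    return (2.0 * b1 - b0) / b0


def lcv_dispersion(t):
    """Standard deviation of at-site L-Cv about their mean (equal record lengths assumed)."""
    return float(np.sqrt(np.mean((t - t.mean()) ** 2)))


if __name__ == "__main__":
    x = generate_region(R=10, n=50, rho=0.4, n_discordant=1, seed=42)
    print(lcv_dispersion(sample_lcv(x)))
```

Averaging `lcv_dispersion` over many synthetic regions for increasing values of `rho` gives a rough feel for how cross-correlation changes the observed dispersion of L-Cv; the equal-weight average used here matches the record-length-weighted form only because all sites share the same `n`.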
Option 1HETSITE uses a single discordant site, $R^* = 1$, while option BIMODAL uses $R^* = R/2$. $R$ was arbitrarily set to 10, 20 and 30 sites, $n$ was set to 10, 25 and 50 years, and $\rho$ = 0.0, 0.2, ..., 0.8. A Gumbel distribution with unit mean and Cv = 0.4, EV1(1, 0.4), was used as parent distribution of regular annual floods, $F_X$. The same distribution was used for generating cross-correlated annual floods in homogeneous regions by Hosking and Wallis (1988; see Section 4, p. 591). The discordant sequences were generated from an EV1(1, Cv*) regional parent with Cv* = 0.1, 0.2, ..., 0.7. The algorithm was applied for option 1HETSITE a second time by referring to a different set of regional parents. In particular, a Generalized Extreme Value (GEV) distribution with skewness coefficient $\gamma$ = 2.25, Cv = 0.4 and unit mean, GEV(1, 0.4, 2.25), was considered as parent distribution for regular sites. A GEV(1, Cv*, 2.25) with Cv* = 0.2, 0.4, 0.6 was used for discordant sites. This second group of 1HETSITE simulations considered $R$ = 10, 30 sites and $n$ = 10, 50 years. It should be noted that the transformation from the multivariate normal deviates to a set of realizations from the correct marginal distribution does not exactly preserve the cross-correlation structure. We performed a series of test runs generating a number of cross-correlated pairs of sequences of length $10^6$, under various hypotheses for the marginal distributions. We found that the differences between the theoretical cross-correlation coefficient for the multivariate normal deviates and the empirical cross-correlation coefficient after the transformation were always negligible for practical purposes (maximum absolute difference 0.012).
References
Burn, D.H., 1990. Evaluation of regional flood frequency analysis with a region of influence approach. Water Resources Research 26 (10), 2257–2265.
Castellarin, A., 2007. Probabilistic envelope curves for design flood estimation at ungauged sites.
Water Resources Research 43, W04406. doi:10.1029/2005WR004384.
Castellarin, A., Camorani, G., Brath, A., 2007. Predicting annual and long-term flow-duration curves in ungauged basins. Advances in Water Resources 30 (4), 937–953. doi:10.1016/j.advwatres.2006.08.006.
Castellarin, A., Vogel, R.M., Matalas, N.C., 2005. Probabilistic behavior of a regional envelope curve. Water Resources Research 41, W06018. doi:10.1029/2004WR003042.
Chow, V.T. (Editor in Chief), 1964. Handbook of Applied Hydrology, Section 8-1. McGraw Hill, New York.
Dalrymple, T., 1949. In: Regional Flood Frequency: Presentation at the 29th Annual Meeting of the Highway Research Board, Washington, DC, 22 p., December 13.
Dalrymple, T., 1960. Flood frequency analyses. US Geological Survey Water-Supply Paper 1543-A, Reston, VA.
Douglas, E.M., Vogel, R.M., Kroll, C.N., 2000. Trends in floods and low flows in the United States: impact of spatial correlation. Journal of Hydrology 240 (1–2), 90–105.
Durbin, J., Knott, M., 1971. Components of Cramér–von Mises Statistics. London School of Economics and Political Science, UK.
Elamir, E.A.H., Seheult, A.H., 2004. Exact variance structure of sample L-moments. Journal of Statistical Planning and Inference 124, 337–359.
Fill, H.D., Stedinger, J.R., 1995. Homogeneity tests based upon Gumbel distribution and a critical appraisal of Dalrymple's test. Journal of Hydrology 166, 81–105.
Greenwood, J.A., Landwehr, J.M., Matalas, N.C., Wallis, J.R., 1979. Probability weighted moments: definition and relation to parameters of several distributions expressible in inverse form. Water Resources Research 15, 1049–1054.
Hosking, J.R.M., 1990. L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society, Series B 52 (1), 105–124.
Hosking, J.R.M., Wallis, J.R., 1988. The effect of intersite dependence on regional flood frequency analysis. Water Resources Research 24 (4), 588–600.
Hosking, J.R.M., Wallis, J.R., 1993. Some useful statistics in regional frequency analysis. Water Resources Research 29 (2), 271–281.
Hosking, J.R.M., Wallis, J.R., 1997. Regional frequency analysis – an approach based on L-moments. Cambridge University Press, New York, p. 224.
Jenkinson, A.F., 1955. The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quarterly Journal of the Royal Meteorological Society 81, 158–171.
Lettenmaier, D.P., Wallis, J.R., Wood, E.F., 1987. Effect of regional heterogeneity on flood frequency estimation. Water Resources Research 23 (2), 313–323.
Lu, L.H., 1991. Statistical Methods for Regional Flood Frequency Investigations. Ph.D. Dissertation, Cornell University, Ithaca, NY.
Lu, L., Stedinger, J.R., 1992. Sampling variance of normalized GEV/PWM quantile estimators and a regional homogeneity test. Journal of Hydrology 138, 223–245.
Madsen, H., Rosbjerg, D., 1997. Generalized least squares and empirical Bayes estimation in regional partial duration series index-flood modeling. Water Resources Research 33 (4), 771–781.
Madsen, H., Mikkelsen, P.S., Rosbjerg, D., Harremoes, P., 2002. Regional estimation of rainfall intensity-duration-frequency curves using generalized least squares regression of partial duration series statistics. Water Resources Research 38 (11), 1239. doi:10.1029/2001WR001125.
Matalas, N.C., Langbein, W.B., 1962. Information content of the mean. Journal of Geophysical Research 67 (9), 3441–3448.
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models, Part I: A discussion of principles. Journal of Hydrology 10 (3), 282–290.
Rosbjerg, D., 2007. Regional flood frequency analysis. In: Vasiliev, O.F. et al. (Eds.), Extreme Hydrological Events: New Concepts for Security, pp. 151–171.
Sankarasubramanian, A., Srinivasan, K., 1999. Investigation and comparison of sampling properties of L-moments and conventional moments. Journal of Hydrology 218, 13–34.
Scholz, F.W., Stephens, M.A., 1987. K-sample Anderson–Darling tests. Journal of the American Statistical Association 82, 918–924.
Singh, V.P., 1992. Elementary Hydrology. Prentice-Hall, Englewood Cliffs, NJ, pp. 824–829.
Stedinger, J.R., 1983. Estimating a regional flood frequency distribution. Water Resources Research 19, 503–510.
Stedinger, J.R., Lu, L., 1995. Appraisal of regional and index flood quantile estimators. Stochastic Hydrology and Hydraulics 9 (1), 49–75.
Troutman, B.M., Karlinger, M.R., 2003. Regional flood probabilities. Water Resources Research 39 (4), 1095. doi:10.1029/2001WR001140.
Viglione, A., Laio, F., Claps, P., 2007. A comparison of homogeneity tests for regional frequency analysis. Water Resources Research 43, W03428. doi:10.1029/2006WR005095.
Vogel, R.M., Zafirakou-Koulouris, A., Matalas, N.C., 2001. Frequency of record breaking floods in the United States. Water Resources Research 37 (6), 1723–1731.
Vogel, R.M., Matalas, N.C., England, J.F., Castellarin, A., 2007. An assessment of exceedance probabilities of envelope curves. Water Resources Research 43, W07403. doi:10.1029/2006WR005586.
Wiltshire, S.E., 1986a. Regional flood frequency analysis I: homogeneity statistics. Hydrological Sciences Journal 31, 321–333.
Wiltshire, S.E., 1986b. Regional flood frequency analysis II: multivariate classification of drainage basins in Britain. Hydrological Sciences Journal 31, 335–346.
Wiltshire, S.E., 1986c. Identification of homogeneous regions for flood frequency analysis. Journal of Hydrology 84, 287–302.
Yule, G.U., 1945. A method of studying time series based on their internal correlations. Journal of the Royal Statistical Society 108, 208.