Statistics as a means to construct knowledge in climate and related sciences -- a discourse --

Hans von Storch
Institute for Coastal Research, GKSS, Germany
9IMSC, Cape Town, 24-28 May 2004

The basic approach …

… is to combine systematically empirical knowledge ("data") with dynamical knowledge ("models") in order to determine
• characteristic parameters ("inference")
• consistency of models and data ("testing")

The knowledge represented by data and the knowledge represented by models are both uncertain. This uncertainty makes us resort to statistical concepts.

The resulting additional knowledge is
• best guesses of numbers (ideally together with confidence intervals)
• evaluation of the consistency of theoretical concepts with observational evidence.

These new knowledge claims depend on the amount of available data. In general: if more data are available, the confidence in the numbers increases, but the consistency of the concepts with the data decreases, because a test rejects an imperfect model more readily as the sample grows.

In general, the problem may be conceptualized by the state space formalism, with
- a state space equation, e.g.,
  (M)  $\Psi_{t+1} = F(\Psi_t, \alpha, \eta) + \varepsilon$
  with the state variable $\Psi_t$, external parameters $\eta$ and internal parameters $\alpha$. The term $\varepsilon$ is a random component, which supposedly represents the uncertainty of the model M.
- an observation equation
  (B)  $x_t = B(\Psi_t) + \delta$
  with the observable $x$ and the random component $\delta$.

Examples:
1. Goodness of fit
2. Extreme values
3. PIPs and POPs
4. Downscaling
5. Detection and attribution
6. Determination of parameters
7. Analysis

1. Goodness of fit

(M)  $u \sim W(a, b)$, the Weibull distribution with shape parameter $a$ and scale parameter $b$:
     $f_W(x) = \frac{a}{b} \left( \frac{x}{b} \right)^{a-1} e^{-(x/b)^{a}}$
(B)  in the case of wind in the extratropics

OWS M: good fit with a fitted parameter value of 2.64 (JJA) or 3.04 (DJF)

2. Extreme values

Long memory?

(M)  $\gamma(\tau) \sim \tau^{-k}$ with $0 < k < 1$ ($\gamma$ being the autocovariance function and $\tau$ the lag)
(B)  $P_q(r) \sim \frac{1}{R_q} e^{-(r/R_q)^{k}}$ with $r > 0$,
     the probability density function of the waiting time $r$ between two events of exceeding the level $q$. $R_q$ is the expected waiting time, $R_q = \mathrm{E}[r]$ under $P_q$.
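The waiting times between exceedances can be extracted from any record with a few lines of code. The following is a minimal sketch, not the procedure of Bunde et al.: the helper names `return_intervals`, `empirical_quantile` and `stretched_exponential` are hypothetical, the test series is memoryless Gaussian white noise (so it shows the Poisson-like baseline, not long memory), and the level is set at the empirical 98% quantile purely for illustration.

```python
import math
import random


def return_intervals(series, threshold):
    """Waiting times r between consecutive exceedances of the threshold."""
    times = [t for t, value in enumerate(series) if value > threshold]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def empirical_quantile(series, q):
    """Simple order-statistic quantile (no interpolation)."""
    ordered = sorted(series)
    return ordered[int(q * (len(ordered) - 1))]


def stretched_exponential(r, mean_interval, k):
    """Shape (1/R_q) * exp(-(r/R_q)**k); not normalised for k != 1."""
    return math.exp(-((r / mean_interval) ** k)) / mean_interval


# Assumption: white noise stands in for a record without memory.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(50_000)]

threshold = empirical_quantile(white_noise, 0.98)
intervals = return_intervals(white_noise, threshold)
mean_interval = sum(intervals) / len(intervals)

# For a memoryless series the mean waiting time is close to 1 / (1 - 0.98) = 50.
print(f"exceedances: {len(intervals) + 1}, mean waiting time: {mean_interval:.1f}")
print(f"shape at r = R_q, k = 0.4: {stretched_exponential(mean_interval, mean_interval, 0.4):.5f}")
```

Replacing the white noise by a long-memory series would be the natural next step; clustering would then show up as a dependence between successive entries of `intervals`.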
Figure: Distribution $P_q(r)$ of return times $r$ between consecutive extreme values; $R_q$ is the expected value. Panels: synthetic example with $k = 0.4$; annual water levels of the Nile, 722-1284. From Bunde et al., 2004: Return intervals of rare events in records with long-term persistence …

Significance: Extremes are not distributed uniformly in time, as a Poisson process would describe them, but appear in clusters.

Figure: Expected waiting time for the next exceedance event, conditional upon the length $r_0$ of the previous waiting time. Panels: synthetic examples with $k = 0.4$; annual water levels of the Nile, 722-1284. From Bunde et al., 2004.

3. PIPs …

State space equation in a low-dimensional subspace:
(M)  $\Psi_{t+1} = F(\Psi_t, \alpha, \eta)$
Observation equation in the high-dimensional space:
(B)  $x_t = P \Psi_t$
The parameters $P$ and $\alpha$ are determined such that
$\mathrm{E}\left( \| x_{t+1} - P F(\Psi_t, \alpha, \eta) \|^2 \right) = \min$

… and POPs

Special form:
(M)  $\Psi_{t+1} = \lambda \Psi_t$
(B)  $x_t = P \Psi_t$
$\Psi$ and $\lambda$ are complex numbers; (M) describes the damped rotation in a 2-dimensional space spanned by complex eigenvectors of $\mathrm{E}(x_{t+1} x_t^T) \, \mathrm{E}(x_t x_t^T)^{-1}$. All eigenvectors form $P^T$.

Example: POP of the MJO
Figure: Real and imaginary part of the spatial pattern in equatorial velocity potential at 200 hPa.
Figure: 10-day forecast using the state space equation in 2-d space.
von Storch, H. and J.S. Xu, 1990: Principal Oscillation Pattern Analysis of the Tropical 30- to 60-day Oscillation: Part I: Definition of an Index and its Prediction. - Climate Dyn. 4, 175-190

4. Downscaling

The state space is simulated by "reality" or by GCMs.
The observation equation relates large-scale variables, which are supposedly well observed (analysed) or simulated, to variables with relevant impact for clients.

Example: snow drops
Large-scale state: JFM mean temperature anomaly
Observable: flowering date anomaly of the snow drop (Galanthus nivalis)
Maak, K. and H. von Storch, 1997: Statistical downscaling of monthly mean air temperature to the beginning of the flowering of Galanthus nivalis L. in Northern Germany. - Intern. J. Biometeor.
41, 5-12

5. Detection and attribution

The state space dynamics are given by the assumption that the complete state of the atmosphere may be written as
(M)  $\Psi_t = \sum_k a_k(t) \, g_k + \varepsilon$
The "patterns" $g_k$ represent the effect of a series of external influences, while $\varepsilon$ represents the internal variability of the climate system. $\Psi$ describes the full 3-d dynamics of the climate system.

The observation equation is formulated in a parameter space ($A$), and the state variable is projected on a space of observed variables ($L[\Psi]$):
(B)  $A_k = L(\Psi_t) \cdot g_k^{r,ad}$
Here, $L$ is the projector of the full space on the space of observed (and considered) variables, and $g_k^{r,ad}$ is the adjoint pattern of $g_k$ in the reduced space.

Detection means testing the null hypothesis
$H_0: A_k = 0$
while attribution means the assessment that $A_k$ is consistent with $a_k$ (i.e., $A_k$ lies in a suitably small confidence "interval" of $a_k$).

Figure: Attribution diagram for observed 50-year trends in JJA mean temperature. The ellipsoids enclose non-rejection regions for testing the null hypothesis that the 2-dimensional vector of signal amplitudes estimated from observations has the same distribution as the corresponding signal amplitudes estimated from the simulated 1946-95 trends in the greenhouse gas, greenhouse gas plus aerosol, and solar forcing experiments. Courtesy G. Hegerl.

Zwiers, F.W., 1999: The detection of climate change. In: H. von Storch and G. Flöser (Eds.): Anthropogenic Climate Change. Springer Verlag, 163-209, ISBN 3-540-65033-4

6. Determination of parameters

In general, when many observations are available, optimal parameters $\alpha$ may be determined by finding those $\alpha$ which minimize the functional
$\mathrm{E}\left( \| x_{t+1} - B(F(\Psi_t, \alpha, \eta)) \|^2 \right)$

Example: Determination of parameters: oceanic dissipation
Figure: M2 tidal dissipation rates, estimated by combining Topex/Poseidon altimeter data with a hydrodynamical tide model.
The solid lines encircle high dissipation areas in the deep ocean. From Egbert, G.D. and R.D. Ray, 2000: Significant dissipation of tidal energy in the deep ocean inferred from satellite altimeter data. - Nature 405, 775-778

7. Analysis

Skillful estimates of the unknown field $\Psi_t$ are obtained by integrating the state space equation and the observation equation forward in time:
$\Psi^*_{t+1} = F(\Psi_t, \alpha, \eta)$
$x^*_{t+1} = B(\Psi^*_{t+1})$
and, as best guess,
$\Psi_{t+1} = \Psi^*_{t+1} + K (x^*_{t+1} - x_{t+1})$
where $K$ weights the mismatch between the predicted and the actual observation.

Example: spectral nudging in RCMs
State space equation: RCM
Observable $x_t$: large-scale features, provided by analyses or GCM output.
Correction step: nudging the large scales in the spectral domain
Figure: Percentile-percentile diagram of local wind at an ocean location, as recorded by a local buoy and as simulated in an RCM constrained by lateral control only, and constrained by spectral nudging.

The purpose of statistics is …
• to specify pre-defined "models" of reality by fitting characteristic numbers to observational evidence.
  → developing and extending models and theories
• to analyze states and changes by interpreting empirical evidence in light of a pre-specified model.
  → monitoring weather (and climate)
• to test theories and models as to whether they are valid in light of the empirical evidence.
  → falsifying theories and models

Potential of "professional statisticians"

The specification of the models is usually not a statistical problem, but needs guidance by dynamical knowledge.
Therefore, when applying advanced methods in climate science, "professional" statisticians often fail to achieve significant knowledge gains.
We need market places, where
a) method-driven mathematical (and theoretical physics) statisticians meet problem-driven people from climate science
b) other problem-driven scientists (e.g., geostatisticians, econometricians) take part, to allow the export of methods to climate science.

So what?
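The forecast and correction cycle described under 7. Analysis can be tried out on a toy problem. The sketch below is only an illustration under stated assumptions: the state is a scalar, $F$ is a linear damping with a hypothetical coefficient `a`, $B$ is the identity, and the gain is a fixed number rather than an optimally derived operator. The correction is written as observation minus prediction, so the `gain` here has the opposite sign to $K$ in the formula if that formula is read literally.

```python
import math
import random


def analysis_cycle(observations, a, gain, first_guess=0.0):
    """Alternate forecast and correction steps for a scalar state."""
    state = first_guess
    analysed = []
    for x_obs in observations:
        state_star = a * state   # forecast step: Psi* = F(Psi)
        x_star = state_star      # predicted observation: x* = B(Psi*), B = identity
        state = state_star + gain * (x_obs - x_star)  # correction step
        analysed.append(state)
    return analysed


def rmse(estimates, truth):
    """Root mean square difference between two equally long series."""
    squared = sum((e - t) ** 2 for e, t in zip(estimates, truth))
    return math.sqrt(squared / len(truth))


# Assumption: synthetic "truth" from a damped scalar model with unit noise,
# observed with unit observation noise.
rng = random.Random(1)
a = 0.9
truth, observations = [], []
psi = 0.0
for _ in range(5000):
    psi = a * psi + rng.gauss(0.0, 1.0)
    truth.append(psi)
    observations.append(psi + rng.gauss(0.0, 1.0))

analysed = analysis_cycle(observations, a, gain=0.6)   # model plus data
free_run = analysis_cycle(observations, a, gain=0.0)   # model only, no correction

print(f"model only:        {rmse(free_run, truth):.2f}")
print(f"observations only: {rmse(observations, truth):.2f}")
print(f"analysis:          {rmse(analysed, truth):.2f}")
```

In this setting the combined estimate has a smaller error than either the uncorrected model or the raw observations, which is the point of combining the state space equation with the observation equation.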