Estimation of positive sum-to-one constrained parameters with ensemble-based Kalman filters: application to an ocean ecosystem model Ehouarn Simon1 , Annette Samuelsen1 , Laurent Bertino1 Dany Dumont2 1 Nansen 2 Institut Environmental and Remote Sensing Center, Norway des sciences de la mer, Université du Québec à Rimouski, Canada 16 November 2012 1 Outline 1 Estimation of positive sum-to-one constrained parameters with ensemble-based Kalman filters 2 Numerical results in GOTM-NORWECOM Assimilation of ocean color data CHLA: forecast mean (13-08-2008) CHLA: observations (13-08-2008) Time evolution of the errors RMS error and standard deviations (mg/m3) 2.5 RMS Ensemble STD Ensemble 2 1.5 1 0.5 0 01−Jan−2007 01−Jan−2008 01−Jan−2009 01−Jan−2010 Difficulties Nonlinear dynamics and positive variables (tracer concentration). Numerous poorly known parameters. Seasonal lack of observations (Arctic) and large error (satellite), few in-situ data. Diet of the herbivorous / predators ? Zooplankton grazing preferences (relative diet). B No prior knowledge, one set of values for the whole domain. Adaptation of zooplankton to their local environment? Non-Gaussianity and parameter estimation with the EnKF Truncations of negative values Potential depletion of components of the ensemble B Large corrections on parameters in erroneous directions. Grazing efficiency f Grazing efficiency f 1.2 Mean and standard deviations (d!1) Mean and standard deviations (d!1) 1.2 1 0.8 0.6 0.4 0.2 0 0 500 1000 Time (d) 1500 1 0.8 0.6 0.4 0.2 0 0 500 1000 Time (d) 1500 Gaussian anamorphosis extension of ensemble-based Kalman filters Analysis with transformed variables and observations (Bertino et al., 2003) Applicability of the methods in large systems Effectiveness of the Gaussian anamorphosis extension of the EnKF to estimate 4 parameters in nonlinear frameworks of positive variables. Outline 1 Estimation of positive sum-to-one constrained parameters with ensemble-based Kalman filters 2 Numerical results in GOTM-NORWECOM Positive sum-to-one constrained parameters (πi )i∈NN Estimation Constraints: ∀i = 1 : N, πi > 0 and N X πi = 1 i =1 B Cannot be handled by the EnKF in that form. Dirichlet distribution of order N ∀i = 1 : N, πi = φi N X φk with φi ∼ Γ(θi , 1) with φi ∼ N (θi , Σi ) k=1 Gelman’s formulation (1995) ∀i = 1 : N, πi = e φi N X k=1 e φk Positive sum-to-one constrained parameters (πi )i∈NN Hyperspherical coordinates: N − 1 parameters 8 π1 > > > > > > > ∀i = 2 : N − 1, > > > > > < πi > > > > > > > > > > > > πN : = = cos2 ( π2 φ1 ) iY −1 k=1 = π π sin2 ( φk ) cos2 ( φi ) 2 2 N−2 Y k=1 π π sin2 ( φk ) sin2 ( φN−1 ) 2 2 with (φi )i=1=N−1 distributed on the segment line [0, 1] Inversion of the hyperspherical coordinate system Recursively. Information on the distribution of the (πi )i=1:N or available samples: B Prior values for the (φi )i=1:N−1 . 7 Expected values and variances of the (πi )i=1:N Tuning of the parameters of the distributions of the (φi )i=1:N−1 ∀i = 1 : N − 1, E [πi ] = mi , var(πi ) = σi2 Assumption: (φi )i =1:N−1 ∼ (Di ([0, 1], Θi ))i =1:N−1 are independent. Expected values of the (πi )i=1:N−1 8 1 1 > > (E [e jπφ1 ] + E [e −jπφ1 ]) = m1 − > > 4 2 > > > < mi 1 1 − ∀i = 2 : N − 1, (E [e jπφi ] + E [e −jπφi ]) = > > iX −1 4 2 > > > > 1− mk > : k=1 Variances of the (πi )i=1:N−1 8 1 3 1 > (E [e 2jπφ1 ] + E [e −2jπφ1 ]) = − + σ12 + m12 − (E [e jπφ1 ] + E [e −jπφ1 ]) > > 16 8 4 > > > > > > > < ∀i = 2 : N − 1, 1 (E [e 2jπφi ] + E [e −2jπφi ]) = − 3 − 1 (E [e jπφi ] + E [e −jπφi ]) 16 8 4 > σi2 + mi2 > > > + > > k−1 i > X Y 1 > > > (−2)k−1 (σi2−k + mi2−k ) (E [e jπφi −l ] + E [e −jπφi −l ]) : 4 8 k=1 l =1 Example with the triangular distribution Triangular distribution: φi ∼ T (0, 1, ci ) Tuning of the mode ci . Characteristic function: ∀t ∈ R, E [e jtφi ] = −2 Equal preferences: E [πi ] = (1 − ci ) − e jci t + ci e jt π 2 ci (1 − ci ) 1 N A system of N − 1 nonlinear equations: ∀i = 1 : N − 1, cos(πci ) + 2ci − 1 N −i −1 + = 0. π 2 ci (1 − ci ) 2(N − i + 1) N > 3: non existence of solutions. Outline 1 Estimation of positive sum-to-one constrained parameters with ensemble-based Kalman filters 2 Numerical results in GOTM-NORWECOM GOTM-NORWECOM-ASSIM Assimilation Deterministic ensemble Kalman filter (DEnKF, Sakov and Oke, 2008). State vector: biogeochemical state vector and parameters. 100 members. Gaussian anamorphosis (Bertino et al., 2003) for the biogeochemical state variables (and parameters for the spherical formulation) . Reference solution xt Deterministic simulation with different preferences: B Mesozooplankton: πdia = 0.6, πmic = 0.15, πden = 0.25. B Microzooplankton: πfla = 0.6, πden = 0.15, πdia = 0.25. Chlorophyll Nitrate The data assimilation system Configuration of the experiments One year to warm up the ensemble, then four years with assimilation. Observations: y = xtP ∗ G , with G ∼ Γ( 1 , σ 2 ), σo2 o σo = 0.3. B Surface chlorophyll (2 first layers) every seven days. Truncated-Gaussian perturbations in the whole water column every twelve hours: phytoplankton and zooplankton only. Robustness of the estimation: experiments repeated 20 times. Layer 1 Reference Reference (observation time) Observations 7 6 5 4 3 2 1 0 Jan00 12 Layer 2 Jan01 Jan02 Jan03 Time (d) Jan04 8 Chlorophyll concentration (mg/m3) Chlorophyll concentration (mg/m3) 8 7 6 5 4 3 2 1 0 Jan00 Jan01 Jan02 Jan03 Time (d) Jan04 e φi π i = PN Preferences: Gelman’s formulation k=1 MesoZ πdia 0.6 0.4 0.2 0 Jan02 Jan03 Time (d) 0.6 0.4 0.2 0 !0.2 Jan00 Jan04 MicroZ πfla 0.6 0.4 0.2 0 13 Jan02 Jan03 Time (d) 0.4 0.2 0 Jan04 Jan04 0.8 0.6 0.4 0.2 0 !0.2 Jan00 Jan01 Jan02 Jan03 Time (d) Jan01 Jan02 Jan03 Time (d) Jan04 MicroZ πdia MicroZooplankton: detritus (N) Mean and standard deviations Mean and standard deviations 0.8 Jan01 Jan02 Jan03 Time (d) 0.6 MicroZ πden MicroZooplankton: flagellates !0.2 Jan00 Jan01 0.8 !0.2 Jan00 MicroZooplankton: diatoms Mean and standard deviations Jan01 0.8 MesoZooplankton: detritus (N) Mean and standard deviations Mean and standard deviations Mean and standard deviations MesoZooplankton: MicroZooplankton 0.8 !0.2 Jan00 MesoZ πden MesoZ πmic MesoZooplankton: diatoms e φk Jan04 0.8 0.6 0.4 0.2 0 !0.2 Jan00 Jan01 Jan02 Jan03 Time (d) Jan04 Preferences: spherical formulation MesoZ πdia MesoZ πmic 0.6 0.4 0.2 0 Jan02 Jan03 Time (d) 0.6 0.4 0.2 0 !0.2 Jan00 Jan04 MicroZ πfla 0.6 0.4 0.2 0 14 Jan02 Jan03 Time (d) 0.4 0.2 0 Jan04 Jan04 0.8 0.6 0.4 0.2 0 !0.2 Jan00 Jan01 Jan02 Jan03 Time (d) Jan01 Jan02 Jan03 Time (d) Jan04 MicroZ πdia MicroZooplankton: detritus (N) Mean and standard deviations Mean and standard deviations 0.8 Jan01 Jan02 Jan03 Time (d) 0.6 MicroZ πden MicroZooplankton: flagellates !0.2 Jan00 Jan01 0.8 !0.2 Jan00 MicroZooplankton: diatoms Mean and standard deviations Jan01 0.8 MesoZooplankton: detritus (N) Mean and standard deviations 0.8 !0.2 Jan00 MesoZ πden MesoZooplankton: MicroZooplankton Mean and standard deviations Mean and standard deviations MesoZooplankton: diatoms Jan04 0.8 0.6 0.4 0.2 0 !0.2 Jan00 Jan01 Jan02 Jan03 Time (d) Jan04 Preferences: ternary plots of the final estimates Microzooplankton Gelman’s formulation Mesozooplankton Detritus Diatoms True Initial DEnKF Diatoms MicroZ Flagellates Spherical formulation Mesozooplankton Detritus Diatoms 15 Detritus Microzooplankton Diatoms MicroZ Flagellates Detritus Spherical formulation: asymmetry of the transformation Robustness of the results to the matching preferences/transformed parameters Same 20 experiments (observation, prior ensembles) Microzooplankton: permutation in the assignment preferences/transformed parameters every four experiments. 8 > < > : Mesozooplankton 2 π φ1 ) 2 2 π 2 π πMIC = sin ( φ1 ) cos ( φ2 ) 2π 2 2 2 π πDET = sin ( φ1 ) sin ( φ2 ) 2 2 πDIA = cos ( 8 > < > : Microzooplankton 2 π πDET = cos ( φ1 ) 2 2 π 2 π πFLA = sin ( φ1 ) cos ( φ2 ) 2 2 π 2 2 π πDIA = sin ( φ1 ) sin ( φ2 ) 2 2 Microzooplankton: marginal improvement of the estimates of the preferences. Mesozooplankton: slight degradation of the estimation. Mesozooplankton RMS error (Mesozooplankton.) Diet Prior (%) Gelman Spherical Spher. + perm. 16 Diatoms 45 35 33.3 36.7 MicroZ 120 146.6 86.7 133.3 Detritus Detritus 32 44 48 52 Diatoms Initial DEnKF True MicroZ Conclusion and perspectives Two approaches for estimating positive sum-to-one constrained parameters. B Gelman’s formulation: Gaussian parameters, symmetry of the transformation, N transformed parameters. B Spherical formulation: N − 1 parameters to estimate, asymmetry of the transformation, distribution of the transformed parameters? Preliminary experiments indicate that the zooplankton grazing preferences could be estimated with both approaches. Quite simple framework. Further investigations have to be done. B Distribution. B Additional parameters to estimate. B Real observations (Mike station). 17 Thank you! 18 Bibliography Bertino L., Evensen G. and Wackernagel H.: Sequential Data Assimilation Techniques in Oceanography, International Statistical Reviews, 71, 223-241, 2003. Bocquet M., Pires C.A. and Wu L.: Beyond Gaussian statistical modeling in geophysical data assimilation, Monthly Weather Review, 138, 8, 2997-3023, 2010. Doron M., Brasseur P. and Brankart J.-M.: Stochastic estimation of biogeochemical parameters of a 3D ocean coupled physical-biogeochemical model: Twin experiments. Journal of Marine Systems, 87 (3-4), 194-207, 2011. Gelman A.: Method of Moments Using Monte Carlo Simulation. Journal of Computational and Graphical Statistics, 4, 1, 36-54, 1995. Gelman A., Bois F. and Jiang J.: Physiological Pharmacokinetic Analysis Using Population Modeling and Informative Prior Distributions. Journal of the American Statistical Association, 91, 436, 1400-1412, 1996. Simon E. and Bertino L: Gaussian anamorphosis extension of the DEnKF for combined state and parameter estimation: application to a 1D ocean ecosystem model, Journal of Marine Systems, 89, 1-18, 2012. Simon E., Samuelsen A., Bertino L. and Dumont D.: Estimation of positive sum-to-one constrained zooplankton grazing preferences with the DEnKF: a twin experiments, Ocean Science, 8, 587-602, 2012. Construction of a monovariate anamorphosis: Z (x) = ψ(Y (x)) Empirical anamorphosis Empirical marginal distribution of Z (x): sample (zi )i =1:N . Cumulative distribution function of Y (x): G . N X ψ(y ) = zi 1]G −1 ( i −1 ),G −1 ( N i =1 )] 8 3- Definition of the tails 30 25 4 2 0 !10 !5 Gaussian values 0 5 3 3 Physical values (mg/m ) 8 Physical values (mg/m ) 10 6 (y ) 2- Interpolation 10 3 Physical values (mg/m ) 1- Empirical anamorphosis i N 6 4 2 0 !10 20 15 10 5 !5 Gaussian values 0 5 0 !10 !5 0 5 Gaussian values 10 Major issues Choice of the physical data set: specification of the likely and unlikely values. Dependance on a reference model run (model bias, extreme events). 15
© Copyright 2026 Paperzz