Zupanski, M., 2004 - Ensemble

2004 SIAM Annual Meeting
Minisymposium on Data Assimilation and Predictability for
Atmospheric and Oceanographic Modeling
July 15, 2004, Portland, Oregon
ISSUES IN FURTHER DEVELOPMENT OF
ENSEMBLE DATA ASSIMILATION
Milija Zupanski
Cooperative Institute for Research in the Atmosphere
Colorado State University
Fort Collins, CO 80523-1375
Outline
• Probabilistic analysis-prediction
- ensemble framework
• Gaussian Probability Density Function (PDF) framework
- non-Gaussian PDFs
- nonlinearity
• Maximum Likelihood Ensemble Filter (MLEF)
• Model Errors
Why ensemble data assimilation?
• Analysis-prediction problem is probabilistic
  • Inherent uncertainties in observed and predicted values:
    - Observation errors
    - Model errors
    - Turbulence, convection
  • Kolmogorov equation:
    - Transport of Probability Density Function (PDF)
    - General mathematical framework for analysis-prediction
  • Chaotic atmosphere/ocean/land processes:
    - Existence of a low-dimensional attractor subspace suggests the need for ‘likelihood’, rather than deterministic knowledge, of prediction
• Highly nonlinear processes and interactions in the real atmosphere/ocean
  - Ensemble DA methodologies are best equipped to handle nonlinearities
• Practical aspects
  - Parallel computing, code development
Forward Kolmogorov Equation
$$\frac{\partial p(x,t)}{\partial t} + \frac{\partial [p(x,t) f(x,t)]}{\partial x} = \frac{1}{2} \frac{\partial^2 [p(x,t) g^2(x,t)]}{\partial x^2}$$
p – probability density function (PDF); f – dynamical model; g – stochastic forcing (model error)
• Prediction: estimate of the forecast PDF
• Data Assimilation: estimate of the initial PDF
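In an ensemble framework, the PDF transport described by the forward Kolmogorov equation is approximated by integrating a finite sample of states. The following is a minimal sketch, assuming a scalar model with linear drift f(x) = -x and constant stochastic forcing g (both hypothetical choices, picked only because the stationary variance g²/2 is then known analytically). The function name `propagate_ensemble` and all numerical values are illustrative.

```python
import numpy as np

def propagate_ensemble(x0, f, g, dt, n_steps, rng):
    """Euler-Maruyama integration of dx = f(x) dt + g dW for every ensemble member."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + f(x) * dt + g * np.sqrt(dt) * noise
    return x

rng = np.random.default_rng(0)
g = 0.5                                                  # assumed stochastic forcing amplitude
ensemble0 = rng.normal(loc=2.0, scale=0.1, size=20000)   # sample of the initial PDF
ensemble = propagate_ensemble(ensemble0, lambda x: -x, g,
                              dt=0.01, n_steps=1000, rng=rng)

# Sample estimates of two parameters of the forecast PDF.
forecast_mean = ensemble.mean()
forecast_var = ensemble.var()
```

For this toy model the sample variance should approach g²/2 and the sample mean should decay toward zero, which gives a simple check that the ensemble is following the PDF evolution.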
Implications of Kolmogorov Equation Framework
• THERE IS A SINGLE PROBABILISTIC ANALYSIS-PREDICTION SYSTEM
Current systems:
- only weak coupling between analysis and prediction
- modeled forecast PDF information in data assimilation
- practical DA algorithms estimate only a single PDF parameter (e.g., PDF mode)
- analysis PDF estimate is commonly NOT produced
New systems:
- fully coupled: complete feedback between prediction and analysis
- estimate of: (i) analysis PDF, and (ii) forecast PDF
- possibility to estimate various PDF parameters: mode, mean, covariance, . . .
What do we want from PDF?
• Likelihood of an event occurring
- optimal PDF parameter estimate
- uncertainty of the estimate
• PDF parameters
- conditional mean
- conditional mode
- covariance
- ...
[Figure: examples of a Gaussian PDF and a Maxwell PDF]
• Conditional probability using Bayes formula:
$$\Pr(x = x_{true} \mid y = y_{obs}) \propto \Pr(y = y_{obs} \mid x = x_{true}) \, \Pr(x = x_{true})$$
Event A: $x = x_{true}$
Event B: $y = y_{obs}$
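As an illustration of the Bayes formula, the conditional PDF of a single scalar variable can be evaluated directly on a grid. This is a sketch under assumed values: a Gaussian first guess, a Gaussian observation error, and an observation of the state itself (H(x) = x). All numbers and variable names are hypothetical.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 2001)      # grid of candidate states
dx = x[1] - x[0]
xb, sigma_b = 1.0, 1.0                # assumed first guess and its error std
y_obs, sigma_o = 2.0, 0.5             # assumed observation and its error std

prior = np.exp(-0.5 * ((x - xb) / sigma_b) ** 2)           # Pr(x = x_true)
likelihood = np.exp(-0.5 * ((y_obs - x) / sigma_o) ** 2)   # Pr(y = y_obs | x = x_true)
posterior = prior * likelihood                             # Bayes formula, unnormalized
posterior /= posterior.sum() * dx                          # normalize to unit integral

# PDF parameters: conditional mean, conditional mode, variance.
post_mean = (x * posterior).sum() * dx
post_mode = x[np.argmax(posterior)]
post_var = ((x - post_mean) ** 2 * posterior).sum() * dx
```

With these Gaussian inputs the conditional mean and mode coincide at 1.8 and the variance is 0.2, matching the closed-form Gaussian result.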
Practical limitations of PDF parameter estimation
• LARGE NUMBER OF DEGREES OF FREEDOM (DIMENSIONS)
- computational burden: memory allocation, efficiency
• REDUCING THE NUMBER OF DEGREES OF FREEDOM
- statistical sampling of PDF
- ensemble framework: span dynamically important (e.g., unstable) subspace
• Statistical PDF parameter estimation methods:
Minimum variance: Ensemble mean
- Monte Carlo (ensemble) Kalman Filter (EnKF) – stochastic filters
- Ensemble Square-Root filters (EnSRF) – deterministic filters
Maximum likelihood: Ensemble mode (deterministic control)
- variational data assimilation
- Maximum Likelihood Ensemble Filter (MLEF)
EKF/EnKF/EnSRF as a quadratic optimization process
Consider a Gaussian conditional PDF
$$\Pr(x = x_{true} \mid y = y_{obs}) \propto \exp\left\{-\frac{1}{2}[y - H(x)]^T R^{-1} [y - H(x)]\right\} \exp\left\{-\frac{1}{2}(x - x_b)^T P_f^{-1} (x - x_b)\right\}$$
Subject to
$$H(x) = H(x_b) + \mathbf{H}(x - x_b), \qquad \mathbf{H} = \left(\frac{\partial H}{\partial x}\right)$$
Form a quadratic cost function: J = -ln(Pr)
$$J = \frac{1}{2}(x - x_b)^T P_f^{-1}(x - x_b) + \frac{1}{2}[y - H(x)]^T R^{-1}[y - H(x)]$$
$P_f$ - forecast error covariance
$R$ - observation error covariance
$H$ - nonlinear observation operator
$\mathbf{H}$ - linearized observation operator (Jacobian)
$y$ - observation vector
$x$ - analysis vector
$x_b$ - first-guess vector
• Search for x (e.g., analysis) which maximizes the conditional probability (i.e., minimizes the cost function)
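The cost function J can be written as a short routine that accepts any observation operator, linear or not. This is a minimal sketch: the function name `cost_function`, the two-variable state, and all numerical values are hypothetical and chosen only so the result can be checked by hand.

```python
import numpy as np

def cost_function(x, xb, Pf, y, R, H):
    """J(x) = 1/2 (x-xb)^T Pf^{-1} (x-xb) + 1/2 [y-H(x)]^T R^{-1} [y-H(x)]."""
    dx = x - xb
    dy = y - H(x)
    background_term = 0.5 * dx @ np.linalg.solve(Pf, dx)
    observation_term = 0.5 * dy @ np.linalg.solve(R, dy)
    return background_term + observation_term

# Hypothetical example: two state variables, only the first one is observed.
xb = np.array([1.0, 2.0])                  # first guess
Pf = np.array([[1.0, 0.5], [0.5, 1.0]])    # forecast error covariance
R = np.array([[0.25]])                     # observation error covariance
y = np.array([2.0])                        # observation

def H(x):
    return x[:1]                           # observe the first variable

J_at_first_guess = cost_function(xb, xb, Pf, y, R, H)
J_at_obs_fit = cost_function(np.array([2.0, 2.0]), xb, Pf, y, R, H)
```

At the first guess only the observation term contributes (J = 2.0); at a state that fits the observation exactly only the background term contributes (J = 2/3). The analysis lies between these two extremes.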
Linear KF analysis solution
(with Gaussian PDF assumption)
Maximum likelihood and minimum variance estimates are identical for a Gaussian PDF
(1) One-step solution of the quadratic optimization problem (linear H => step-length = 1):
$$x_a = x_b + P_f \mathbf{H}^T (\mathbf{H} P_f \mathbf{H}^T + R)^{-1} [y - H(x_b)]$$
(2) Direct solution of EKF/EnKF/EnSRF:
$$x_a = x_b + P_f \mathbf{H}^T (\mathbf{H} P_f \mathbf{H}^T + R)^{-1} [y - H(x_b)]$$
$$P_f \mathbf{H}^T \approx [x_b - \bar{x}_b][H(x_b) - \overline{H(x_b)}]^T$$
$$\mathbf{H} P_f \mathbf{H}^T \approx [H(x_b) - \overline{H(x_b)}][H(x_b) - \overline{H(x_b)}]^T$$
(the overbar denotes the ensemble mean)
Linear solution framework:
EKF, EnKF, EnSRF solution form obtained by assuming linear observation operators
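The direct solution can be sketched with ensemble estimates of the two covariance products. The example below assumes a small random background ensemble, a hypothetical linear observation operator `Hmat`, and the common 1/(N-1) sample normalization, which the slide does not state. It is an illustration of the linear solution form, not any particular EnKF or EnSRF implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_state, n_obs, n_ens = 4, 2, 50

Hmat = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0]])    # hypothetical linear observation operator

def H(x):
    return Hmat @ x

R = 0.1 * np.eye(n_obs)                    # assumed observation error covariance
y = np.array([0.5, -0.3])                  # assumed observations

# Background ensemble: each column is one member.
Xb = rng.standard_normal((n_state, n_ens))
xb_mean = Xb.mean(axis=1)

# Perturbations in state space and in observation space.
Xp = Xb - xb_mean[:, None]
HX = np.column_stack([H(Xb[:, i]) for i in range(n_ens)])
HXp = HX - HX.mean(axis=1, keepdims=True)

PfHT = Xp @ HXp.T / (n_ens - 1)            # ensemble estimate of Pf H^T
HPfHT = HXp @ HXp.T / (n_ens - 1)          # ensemble estimate of H Pf H^T

# xa = xb + Pf H^T (H Pf H^T + R)^{-1} [y - H(xb)]
xa = xb_mean + PfHT @ np.linalg.solve(HPfHT + R, y - H(xb_mean))

# Explicit sample covariance, used only to check the ensemble products.
Pf = Xp @ Xp.T / (n_ens - 1)
```

Because `H` is linear here, the ensemble products equal `Pf @ Hmat.T` and `Hmat @ Pf @ Hmat.T` exactly, and `xa` is the minimizer of the quadratic cost function. With a nonlinear `H` the same code still runs, but the solution keeps its linear form.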
Nonlinearity
Issue 1: Observation and model operators are highly nonlinear
- Nonlinear prediction model M used in $P_f$
- Nonlinear observation operator H used in $P_f \mathbf{H}^T$ and $\mathbf{H} P_f \mathbf{H}^T$
• Options:
(1) Use linear form of the solution, combined with nonlinear models in covariance calculation
- current EnKF, EnSRF algorithms
(2) Directly search for nonlinear solution by minimizing non-quadratic cost function
- Maximum Likelihood Ensemble Filter (MLEF)
Remaining questions:
- How restrictive is the linear form of the KF, EnKF solution?
- Should nonlinearity of H be included in a more consistent manner?
Non-Gaussian PDF assumption
• Fundamental problem: Inconsistent PDF assumption
- Operators are nonlinear (observation, model), Gaussian assumption violated
- Gaussian assumption known to be incorrect for some variables (e.g., precipitation, clouds, etc.)
- Current mathematical framework used in realistic data assimilation relies heavily on Gaussian PDF assumption (e.g., cost function, PDF)
• Need general mathematical framework: Non-Gaussian PDFs
A solution: Within the Maximum Likelihood (MLEF) approach, optimize arbitrary non-Gaussian conditional PDF
• Remaining problem: Multi-modal PDFs
Statistical PDF parameters
[Figure: two panels of PDF versus dynamical state x. Uni-modal PDF: mean (x_mean) and mode (x_mode) marked. Bi-modal PDF: mean (x_mean) and two modes (x_mode) marked.]
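The difference between mean and mode for a bi-modal PDF can be reproduced numerically. This sketch assumes a Gaussian prior centered at zero and a single observation of H(x) = x², so the sign of x is not determined by the observation. All values are hypothetical.

```python
import numpy as np

x = np.linspace(-4.0, 4.0, 4001)      # grid of candidate states
dx = x[1] - x[0]
xb, sigma_b = 0.0, 1.0                # assumed Gaussian prior centered at zero
y_obs, sigma_o = 1.0, 0.2             # assumed observation of H(x) = x**2

prior = np.exp(-0.5 * ((x - xb) / sigma_b) ** 2)
likelihood = np.exp(-0.5 * ((y_obs - x ** 2) / sigma_o) ** 2)
posterior = prior * likelihood
posterior /= posterior.sum() * dx     # normalize to unit integral

post_mean = (x * posterior).sum() * dx
post_mode = x[np.argmax(posterior)]   # one of the two symmetric modes
density_at_mean = np.interp(post_mean, x, posterior)
```

The conditional mean falls near x = 0, between the two modes near x = ±0.99, where the conditional PDF is close to zero. In such a case the mean is a very unlikely state, while each mode is a likely one.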
Maximum Likelihood Ensemble Filter (MLEF):
MLEF developed using ideas from:
• Variational data assimilation (3DVAR, 4DVAR)
• Iterated Kalman Filters
• Ensemble Transform Kalman Filter (ETKF)
Algorithm specifics:
• Nonlinear cost function minimization – as in 3DVAR, 4DVAR
• Unconstrained minimization, well suited for larger residuals (C-G, LBFGS)
• Hessian preconditioning using the ETKF transformation
• Major assumption: Inverse Hessian = Analysis error covariance
=> satisfactory if solution is close to the minimum
References
Zupanski, D., and M. Zupanski, 2004: Model error estimation employing ensemble data assimilation approach. Submitted to Mon. Wea. Rev. [Available at ftp://ftp.cira.colostate.edu/milija/papers/MLEF_model_err.pdf]
Zupanski, M., 2004: The Maximum Likelihood Ensemble Filter. Theoretical aspects. Submitted to Mon. Wea. Rev. [Available at ftp://ftp.cira.colostate.edu/milija/papers/MLEF_MWR.pdf]
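The algorithm specifics listed above (nonlinear cost function minimization, a Hessian built in ensemble space, inverse Hessian taken as analysis error covariance) can be illustrated with a simplified sketch. It is not the MLEF algorithm as published: the preconditioned C-G or LBFGS minimization is replaced here by plain Gauss-Newton steps, the square-root forecast covariance `Pf_sqrt` is a hypothetical diagonal matrix, and all names and values are assumptions.

```python
import numpy as np

def H(x):
    """Quadratic observation operator, H(x) = x**2, applied to every variable."""
    return x ** 2

def cost(w, xb, Pf_sqrt, y, obs_var):
    """Cost function in the ensemble-space control variable w, with x = xb + Pf_sqrt @ w."""
    d = y - H(xb + Pf_sqrt @ w)
    return 0.5 * w @ w + 0.5 * d @ d / obs_var

def analysis(xb, Pf_sqrt, y, obs_var, n_iter=3):
    n_ens = Pf_sqrt.shape[1]
    w = np.zeros(n_ens)
    for _ in range(n_iter):
        x = xb + Pf_sqrt @ w
        # Columns z_i = R^{-1/2} [H(x + p_i) - H(x)], built from nonlinear H only.
        Z = np.column_stack([H(x + Pf_sqrt[:, i]) - H(x) for i in range(n_ens)])
        Z = Z / np.sqrt(obs_var)
        grad = w - Z.T @ (y - H(x)) / np.sqrt(obs_var)
        hessian = np.eye(n_ens) + Z.T @ Z      # Gauss-Newton Hessian in ensemble space
        w = w - np.linalg.solve(hessian, grad)
    xa = xb + Pf_sqrt @ w
    # Assumption from the slide: inverse Hessian = analysis error covariance
    # (here the Hessian from the last iteration is used).
    Pa = Pf_sqrt @ np.linalg.solve(hessian, Pf_sqrt.T)
    return xa, w, Pa

truth = np.array([1.2, 1.4, 2.1])      # hypothetical true state
xb = np.array([1.0, 1.5, 2.0])         # first guess
Pf_sqrt = 0.3 * np.eye(3)              # hypothetical square-root forecast covariance
obs_var = 0.05 ** 2
y = H(truth)                           # error-free observations, for illustration

xa, w, Pa = analysis(xb, Pf_sqrt, y, obs_var, n_iter=3)
J_initial = cost(np.zeros(3), xb, Pf_sqrt, y, obs_var)
J_final = cost(w, xb, Pf_sqrt, y, obs_var)
```

In this toy setting three iterations bring the analysis close to the true state and reduce the cost by about two orders of magnitude. A single step (the linear solution form) leaves a visibly larger residual.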
Maximum Likelihood Ensemble Filter (MLEF)
- conditional PDF mode by minimization of cost function
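Korteweg-de Vries-Burgers (KdVB) Equation
$$\frac{\partial u}{\partial t} + 6u\frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = \nu \frac{\partial^2 u}{\partial x^2}$$
(ν – diffusion coefficient)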
Experiment:
• Nonlinear advection, dispersion, diffusion
• Periodic boundary conditions
• Two solitary waves (solitons)
• Model domain: 101 grid-points
• Observation error: 0.05 units
• 10 observations (perfect model + perturbation)
• 3 minimization iterations in each MLEF analysis cycle
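For orientation, the right-hand side of a KdVB model on a periodic 101-point grid can be sketched with centered finite differences. This assumes the standard form u_t + 6 u u_x + u_xxx = ν u_xx. The domain length, the value of ν, and the discretization are all hypothetical choices, not the configuration used in the experiment.

```python
import numpy as np

def kdvb_tendency(u, dx, nu):
    """du/dt = -6 u u_x - u_xxx + nu u_xx, with periodic centered differences."""
    up1, um1 = np.roll(u, -1), np.roll(u, 1)     # u[i+1], u[i-1]
    up2, um2 = np.roll(u, -2), np.roll(u, 2)     # u[i+2], u[i-2]
    u_x = (up1 - um1) / (2.0 * dx)                                   # advection
    u_xx = (up1 - 2.0 * u + um1) / dx ** 2                           # diffusion
    u_xxx = (up2 - 2.0 * up1 + 2.0 * um1 - um2) / (2.0 * dx ** 3)    # dispersion
    return -6.0 * u * u_x - u_xxx + nu * u_xx

n_grid = 101                   # number of grid points, as in the experiment
length = 2.0 * np.pi           # assumed domain length
dx = length / n_grid
x = dx * np.arange(n_grid)
nu = 0.07                      # assumed diffusion coefficient

u = np.sin(x)                  # smooth periodic test state
tendency = kdvb_tendency(u, dx, nu)
exact = -6.0 * np.sin(x) * np.cos(x) + np.cos(x) - nu * np.sin(x)
max_error = np.max(np.abs(tendency - exact))
```

The test state sin(x) has known derivatives, so the discrete tendency can be compared against the analytic one. A time-stepping scheme would be needed on top of this, and explicit schemes require a small time step because of the third-derivative term.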
MLEF data assimilation with KdVB model
(quadratic obs operator H(x) = x^2, 10 ensembles, 10 obs)
[Figure: RMS error and analysis error covariance panels, labeled Cycle 1 and Cycle 4]
[Figure: IMPACT OF MLEF ASSIMILATION (quadratic observation operator, 10 obs): RMS error versus analysis cycle (1 to 91) for NO OBS and MLEF; vertical axis 0.00E+00 to 2.00E-01]
[Figure: IMPACT OF MINIMIZATION (quadratic observation operator, 10 obs): RMS error versus analysis cycle (1 to 91) for NO MIN and MLEF; vertical axis 0.00E+00 to 7.00E-02]
Model dynamics helps in localization of analysis error covariance!
Model errors in Ensemble Data Assimilation (EnsDA)
• More important than ever before!
- Forecast error covariance information relies on model forecasts: if the forecasts are incorrect, the forecast error covariance is incorrect!
• Model bias, empirical parameters, physics, truncation errors, …
• Improve the spread of ensemble forecasts
• Optimal estimate of model error
• Optimal estimate of model error covariance
• Can be used to learn about the sources of model error
Model error estimation
State augmentation approach:
- adopted in MLEF (and NCEP’s Eta 4DVAR)
$$\begin{pmatrix} x_n \\ \Phi_n \\ \gamma \end{pmatrix} = F_{0,n}\begin{pmatrix} x_0 \\ b \\ \gamma \end{pmatrix} = \begin{pmatrix} M(x_{n-1}, \gamma) + \alpha\Phi_{n-1} + (1-\alpha)b \\ \alpha\Phi_{n-1} + (1-\alpha)b \\ I\gamma \end{pmatrix}$$
$x_0$ – initial conditions; $b$ – model bias; $\gamma$ – empirical parameters
Augmented control variable:
$$z = (x_0, b, \gamma)$$
Augmented error covariance:
$$P = \begin{pmatrix} P_{x_0,x_0} & P_{x_0,b} & P_{x_0,\gamma} \\ P_{b,x_0} & P_{b,b} & P_{b,\gamma} \\ P_{\gamma,x_0} & P_{\gamma,b} & P_{\gamma,\gamma} \end{pmatrix}$$
Model error estimation – cont.
State augmentation approach:
- initial conditions + model bias
$$\begin{pmatrix} x_n \\ \Phi_n \end{pmatrix} = F_{0,n}\begin{pmatrix} x_0 \\ b \end{pmatrix} = \begin{pmatrix} M(x_{n-1}) + \alpha\Phi_{n-1} + (1-\alpha)b \\ \alpha\Phi_{n-1} + (1-\alpha)b \end{pmatrix}$$
$x_0$ – initial conditions; $b$ – model bias
Augmented control variable:
$$z = (x_0, b)$$
Augmented error covariance:
$$P = \begin{pmatrix} P_{x_0,x_0} & P_{x_0,b} \\ P_{b,x_0} & P_{b,b} \end{pmatrix}$$
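The state augmentation with initial conditions and model bias can be sketched as follows. This assumes the model error Φ follows a first-order recursion with a weight `alpha`, as the (1 - α) factor in the slide suggests. The stand-in model `model`, the ensemble values, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_state, n_ens = 5, 20
alpha = 0.8                       # assumed weight in the model error recursion

def model(x):
    """Hypothetical stand-in for the prediction model M."""
    return 0.95 * np.roll(x, 1)

def augmented_step(x_prev, phi_prev, b):
    """One step of the augmented model: returns (x_n, Phi_n)."""
    phi = alpha * phi_prev + (1.0 - alpha) * b
    x = model(x_prev) + phi
    return x, phi

# Ensemble of augmented control variables z = (x0, b): one column per member.
X0 = rng.standard_normal((n_state, n_ens))
B = 0.1 * rng.standard_normal((n_state, n_ens))
Z = np.vstack([X0, B])

# Augmented sample covariance and its blocks.
Zp = Z - Z.mean(axis=1, keepdims=True)
P = Zp @ Zp.T / (n_ens - 1)
P_x0x0 = P[:n_state, :n_state]    # auto-covariance of initial conditions
P_x0b = P[:n_state, n_state:]     # cross-covariance, initial conditions and bias
P_bx0 = P[n_state:, :n_state]
P_bb = P[n_state:, n_state:]      # auto-covariance of model bias

x1, phi1 = augmented_step(np.ones(n_state), np.zeros(n_state), np.ones(n_state))
```

The cross-covariance block `P_x0b` is what allows observations of the state to update the bias estimate. It is the transpose of `P_bx0` by construction.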
MLEF data assimilation with KdVB model
[Figure: Augmented analysis error covariance matrix, with blocks labeled: cross-covariance between model bias and initial conditions, P_{x0,b}; auto-covariance for initial conditions, P_{x0,x0}; auto-covariance for model bias, P_{b,b}]
From Zupanski and Zupanski 2004, MWR [Available at ftp://ftp.cira.colostate.edu/milija/MLEF_model_err.pdf]
Conclusions
• Unified probabilistic analysis-prediction system is important in addressing the atmospheric and oceanographic issues:
- sampling of analysis-prediction PDF (ensemble framework)
- complete feedback between ensemble data assimilation and ensemble forecasting
• Treatment of nonlinearities of prediction model and observation operator can be improved with cost function minimization (MLEF)
• Model errors (bias, empirical parameters) need to be included in realistic ensemble data assimilation applications
• Need non-Gaussian PDF framework
Future development
• Non-Gaussian PDF framework within MLEF approach
- Control theory application
- Direct optimization of non-Gaussian conditional PDFs
- Nonlinear observation and model operators
- Global shallow-water model on geodesic grid
- Optimization algorithms, Hessian preconditioning
• Applications with NCEP’s Global Forecasting System
- Comparison between the conditional mean and conditional mode ensemble data assimilation
- Real measurements, operational prediction model
- Practical aspects: fine resolution control, coarse resolution ensembles