Sampling strategies and SQRT analysis
schemes for the EnKF
Geir Evensen
Norsk Hydro Research Centre, Bergen, Norway
Based on Evensen 2004, submitted to Ocean Dynamics
– p.1
Background
EnKF/EnKS in native formulation are pure Monte Carlo
methods.
Uses Monte Carlo sampling for initial ensemble, model noise
and measurement perturbations.
Uses stochastic equation for ensemble integration.
Computes analysis based on ensemble perturbations and
measurement perturbations.
Review by Evensen (2003), Ocean Dynamics, 53, 343–367.
– p.2
Part one: Outline
Sampling errors can be reduced by:
Wise sampling of initial ensemble and model noise as
motivated by
Pham (2001), MWR.
Nerger et al (to appear 2004), MWR.
Elimination of measurement perturbations in the analysis
scheme is possible by use of “square root” algorithms as
shown by
Anderson (2001), MWR.
Bishop et al (2001), MWR.
Whitaker and Hamill (2002), MWR.
Tippett et al (2003), MWR.
– p.3
EnKF: Ensemble representation
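Define the ensemble matrix
$A = (\psi_1, \psi_2, \ldots, \psi_N) \in \Re^{n \times N}$
The ensemble mean is (defining $1_N \in \Re^{N \times N}$ as the matrix with every element equal to $1/N$)
$\bar{A} = A 1_N$
The ensemble perturbations become
$A' = A - \bar{A} = A (I - 1_N)$
The ensemble covariance matrix becomes
$P_e = \frac{A' (A')^T}{N - 1}$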
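The ensemble statistics on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function name `ensemble_statistics`, the test dimensions and the random test ensemble are assumptions, not part of the original routines.

```python
import numpy as np

def ensemble_statistics(A):
    """Return the mean matrix, perturbations and covariance of an ensemble A (n x N)."""
    n, N = A.shape
    ones_N = np.full((N, N), 1.0 / N)   # the matrix 1_N: every element equals 1/N
    A_mean = A @ ones_N                 # each column holds the ensemble mean
    A_pert = A - A_mean                 # A' = A (I - 1_N)
    P_e = A_pert @ A_pert.T / (N - 1)   # ensemble covariance matrix
    return A_mean, A_pert, P_e

# Hypothetical test ensemble: 5 state variables, 20 members.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
A_mean, A_pert, P_e = ensemble_statistics(A)
```

For a large state dimension one would normally avoid forming the full n x n matrix `P_e` and work with `A_pert` directly, which is what the analysis equation does.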
– p.4
EnKF: Measurement perturbations
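Given a vector of measurements $d \in \Re^{m}$, define
$d_j = d + \epsilon_j, \quad j = 1, \ldots, N$
stored in
$D = (d_1, d_2, \ldots, d_N) \in \Re^{m \times N}$
The ensemble of perturbations is stored in
$E = (\epsilon_1, \epsilon_2, \ldots, \epsilon_N) \in \Re^{m \times N}$
thus, the measurement error covariance matrix becomes
$R_e = \frac{E E^T}{N - 1}$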
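A small NumPy sketch of the perturbed measurements follows. It assumes independent Gaussian measurement errors with a common standard deviation and removes the sample mean of the perturbations so that the ensemble of measurements is centred on the measured values; both choices, and the name `perturb_measurements`, are illustrative assumptions.

```python
import numpy as np

def perturb_measurements(d, std, N, rng):
    """Build the perturbed measurement matrix, the perturbations and their covariance."""
    m = d.size
    E = std * rng.standard_normal((m, N))   # one column of perturbations per member
    E -= E.mean(axis=1, keepdims=True)      # assumption: enforce zero sample mean
    D = d[:, None] + E                      # columns are d_j = d + eps_j
    R_e = E @ E.T / (N - 1)                 # ensemble measurement error covariance
    return D, E, R_e

# Hypothetical measurement values, for illustration only.
rng = np.random.default_rng(1)
d = np.array([0.3, -1.2, 0.8, 2.0])
D, E, R_e = perturb_measurements(d, std=0.1, N=100, rng=rng)
```

With a finite ensemble `R_e` only approximates the prescribed diagonal covariance; this sampling error is one motivation for the square root scheme later in the talk.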
– p.5
EnKF: Analysis equation
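With $H$ the measurement operator, the analysis is
$A^a = A + P_e H^T (H P_e H^T + R_e)^{-1} (D - H A)$
Defining the innovations
$D' = D - H A$
and using previous definitions:
$A^a = A + A' A'^T H^T (H A' A'^T H^T + E E^T)^{-1} D'$
The analysis equation can now be written
$A^a = A + A' S^T C^{-1} D'$
where
$S = H A'$ and $C = S S^T + E E^T$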
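The analysis in the form $A^a = A + A' S^T C^{-1} D'$, with $S = H A'$ and $C = S S^T + E E^T$, can be sketched as below. The function name `enkf_analysis`, the Gaussian measurement perturbations and the small test problem are assumptions made for illustration; the sketch also assumes $C$ is well conditioned so that a direct solve is adequate.

```python
import numpy as np

def enkf_analysis(A, H, d, std, rng):
    """EnKF analysis with perturbed measurements (illustrative sketch)."""
    n, N = A.shape
    A_pert = A - A.mean(axis=1, keepdims=True)       # A'
    E = std * rng.standard_normal((d.size, N))       # measurement perturbations
    E -= E.mean(axis=1, keepdims=True)               # assumption: zero sample mean
    D = d[:, None] + E                               # perturbed measurements
    D_innov = D - H @ A                              # innovations D'
    S = H @ A_pert                                   # S = H A'
    C = S @ S.T + E @ E.T                            # C = S S^T + E E^T
    return A + A_pert @ S.T @ np.linalg.solve(C, D_innov)

# Hypothetical test problem: 10 state variables, 50 members, 3 direct measurements.
rng = np.random.default_rng(2)
n, N = 10, 50
A = rng.standard_normal((n, N))
H = np.zeros((3, n))
H[0, 1] = H[1, 4] = H[2, 8] = 1.0
d = np.array([2.0, -1.0, 0.5])
A_a = enkf_analysis(A, H, d, std=0.1, rng=rng)
```

Only the m x m matrix `C` is inverted, so the cost is governed by the number of measurements and the ensemble size rather than by the state dimension.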
– p.6
EnKF with linear exact model
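Linear noise free model
$\psi_{k+1} = F \psi_k$
$A_{k+1} = F A_k$
EnKF with linear noise free model
$A_k = F^k A_0 \prod_{i=1}^{k} X_i$
where the analysis at step $i$ is written $A^a = A X_i$.
Since rank($A_k$) $\le$ rank($A_0$), the quality of the EnKF solution is dependent on the rank and conditioning of the initial ensemble $A_0$.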
– p.7
Improved sampling: Introduction
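Full covariance
$P = Z \Lambda Z^T$
Ensemble covariance
$P_e = \frac{A' A'^T}{N - 1}$
and, with the SVD $A' = U \Sigma V^T$,
$P_e = \frac{U \Sigma \Sigma^T U^T}{N - 1}$
$P_e$ approximates $P$ when $U$ approximates the dominant eigenvectors in $Z$ and $\Sigma \Sigma^T / (N - 1)$ approximates the corresponding eigenvalues in $\Lambda$.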
– p.8
Improved sampling: Approach
Generate an ensemble which best possibly represents $P$.
Should be constructed using the first $N$ eigenvectors of $P$, but these are too expensive to compute.
Approximate eigenvectors by singular vectors computed from a large ensemble of perturbations:
$\hat{A}' = \hat{U} \hat{\Sigma} \hat{V}^T$
Store first $N$ singular vectors of $\hat{U}$ in $U$.
Store $N$ dominant singular values of $\hat{\Sigma}$ in $\Sigma$.
Generate a random orthogonal matrix $V_1 \in \Re^{N \times N}$.
Compute $A' = U \Sigma V_1^T$.
– p.9
Improved sampling: Conditioning
[Figure: Singular value spectra, normalized to the range 0 to 1, for n = 100, 200, 300, 400, 500, 600, 700 and 800; horizontal axis: singular value (0 to 100).]
– p.10
Example: Model
Linear one-dimensional “exact” advection model on periodic domain.
Advection speed is 1.0, grid spacing is $\Delta x = 1$ and time step is $\Delta t = 1$.
True initial condition sampled from the distribution $\mathcal{N}$, which has mean equal to zero, variance equal to one and spatial decorrelation length 20.
First guess is true state plus another sample drawn from $\mathcal{N}$, thus initial variance is assumed to be one.
Initial ensemble is generated by adding samples drawn from $\mathcal{N}$ to the first guess.
Four measurements every 5th time step with std. dev. 0.1.
Integration length is 300 time units.
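A rough sketch of this experimental setup is given below. Several details are assumptions rather than facts from the slides: a grid of 1000 points with unit grid spacing and unit time step (so that advection is an exact shift of one cell per step), the FFT-based sampler standing in for the distribution $\mathcal{N}$, an ensemble of 100 members, and the four measurement locations.

```python
import numpy as np

def sample_fields(n, count, length_scale, rng):
    """Draw periodic random fields with roughly unit variance (assumed Gaussian-shaped spectrum)."""
    k = 2.0 * np.pi * np.fft.rfftfreq(n)
    amp = np.exp(-((k * length_scale) ** 2) / 8.0)
    noise = rng.standard_normal((count, k.size)) + 1j * rng.standard_normal((count, k.size))
    fields = np.fft.irfft(amp * noise, n=n, axis=1)
    fields /= fields.std(axis=1, keepdims=True)
    return fields.T                                   # shape (n, count)

def advect(state, steps=1):
    """Exact advection for unit speed, grid spacing and time step: a periodic shift."""
    return np.roll(state, steps, axis=0)

rng = np.random.default_rng(4)
n, N = 1000, 100                                      # assumed grid size and ensemble size
truth = sample_fields(n, 1, 20.0, rng)[:, 0]
first_guess = truth + sample_fields(n, 1, 20.0, rng)[:, 0]
ensemble = first_guess[:, None] + sample_fields(n, N, 20.0, rng)
obs_idx = np.array([125, 375, 625, 875])              # hypothetical measurement locations

observations = []
for step in range(1, 301):                            # 300 time units
    truth = advect(truth)
    ensemble = advect(ensemble)
    if step % 5 == 0:                                 # four measurements every 5th step
        d = truth[obs_idx] + 0.1 * rng.standard_normal(obs_idx.size)
        observations.append(d)                        # an analysis step would go here
```

Because the model is exact and noise free, any loss of quality in the estimate comes from the sampling of the initial ensemble and of the measurement perturbations, which is what the experiments compare.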
– p.11
Example (Time t=3)
[Figure: reference solution, estimate, measurements and standard deviation plotted along the x-axis (0 to 1000); vertical axis from 0 to 7.]
– p.12
Example (Time t=121)
[Figure: reference solution, estimate, measurements and standard deviation plotted along the x-axis (0 to 1000); vertical axis from 0 to 7.]
– p.13
Example (Time t=241)
[Figure: reference solution, estimate, measurements and standard deviation plotted along the x-axis (0 to 1000); vertical axis from 0 to 7.]
– p.14
Improved sampling: Initial ensemble
[Figure: Residuals: Impact of initial sampling. Curves for Exp. A (100), Exp. B (100), Exp. C (200), Exp. D (400) and Exp. E (600); horizontal axis: simulation number (0 to 50); vertical axis: residual (0.5 to 0.9).]
– p.15
Improved sampling: Measurements
[Figure: Residuals: Impact of improved sampling of measurement perturbations. Curves for Exp. B, Exp. I, Exp. E and Exp. H; horizontal axis: simulation number (0 to 50); vertical axis: residual (0.5 to 0.9).]
– p.16
A square root analysis scheme (1)
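The ensemble mean can be updated from
$\bar{\psi}^a = \bar{\psi} + A' S^T C^{-1} (d - H \bar{\psi})$
The analysis covariance is defined as
$P_e^a = P_e - P_e H^T (H P_e H^T + R)^{-1} H P_e$
or, in ensemble notation
$A^{a\prime} A^{a\prime T} = A' (I - S^T C^{-1} S) A'^T$
with $S = H A'$ and $C = S S^T + (N - 1) R$.
Inverse of $C$ from the eigenvalue decomposition $C = Z \Lambda Z^T$:
$C^{-1} = Z \Lambda^{-1} Z^T$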
– p.17
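A square root analysis scheme (2)
We get, using $C^{-1} = Z \Lambda^{-1} Z^T$,
$A^{a\prime} A^{a\prime T} = A' (I - S^T Z \Lambda^{-1} Z^T S) A'^T = A' (I - X_2^T X_2) A'^T$
with $X_2$ defined as
$X_2 = \Lambda^{-1/2} Z^T S$
and with the SVD of $X_2$
$X_2 = U_2 \Sigma_2 V_2^T$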
– p.18
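A square root analysis scheme (3)
We then get
$A^{a\prime} A^{a\prime T} = A' (I - V_2 \Sigma_2^T \Sigma_2 V_2^T) A'^T = A' V_2 (I - \Sigma_2^T \Sigma_2) V_2^T A'^T = (A' V_2 \sqrt{I - \Sigma_2^T \Sigma_2}) (A' V_2 \sqrt{I - \Sigma_2^T \Sigma_2})^T$
The analysis equation becomes
$A^{a\prime} = A' V_2 \sqrt{I - \Sigma_2^T \Sigma_2} \, \Theta^T$
where $\Theta$ is an orthogonal matrix, which leaves the analysis covariance unchanged.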
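The square root update can be sketched in NumPy as below, following the factorization $A^{a\prime} = A' V_2 \sqrt{I - \Sigma_2^T \Sigma_2}\, \Theta^T$ with $S = H A'$ and $C = S S^T + (N - 1) R$. The function name `sqrt_analysis`, the test problem and the use of a QR factorization to draw the orthogonal matrix $\Theta$ are assumptions for illustration; the sketch returns the updated mean and the updated perturbations separately.

```python
import numpy as np

def sqrt_analysis(A, H, d, R, rng):
    """Square root analysis without measurement perturbations (illustrative sketch)."""
    n, N = A.shape
    mean = A.mean(axis=1, keepdims=True)
    A_pert = A - mean                                   # A'
    S = H @ A_pert                                      # S = H A'
    C = S @ S.T + (N - 1) * R                           # C = S S^T + (N - 1) R
    lam, Z = np.linalg.eigh(C)                          # C = Z Lambda Z^T
    innov = d[:, None] - H @ mean
    mean_a = mean + A_pert @ S.T @ (Z @ ((Z.T @ innov) / lam[:, None]))
    X2 = (Z.T @ S) / np.sqrt(lam)[:, None]              # X2 = Lambda^{-1/2} Z^T S
    _, sig2, V2t = np.linalg.svd(X2, full_matrices=True)
    scale = np.ones(N)
    scale[:sig2.size] = np.sqrt(np.clip(1.0 - sig2 ** 2, 0.0, None))
    A_pert_a = (A_pert @ V2t.T) * scale                 # A' V2 sqrt(I - Sigma2^T Sigma2)
    Theta, _ = np.linalg.qr(rng.standard_normal((N, N)))
    return mean_a, A_pert_a @ Theta.T                   # multiply by Theta^T

# Hypothetical test problem: 10 state variables, 30 members, 3 direct measurements.
rng = np.random.default_rng(5)
n, N, m = 10, 30, 3
A = rng.standard_normal((n, N))
H = np.zeros((m, n))
H[0, 1] = H[1, 4] = H[2, 8] = 1.0
R = 0.01 * np.eye(m)
d = np.array([2.0, -1.0, 0.5])
mean_a, A_pert_a = sqrt_analysis(A, H, d, R, rng)

# Reference: the Kalman filter analysis covariance computed from the forecast ensemble.
A_pert = A - A.mean(axis=1, keepdims=True)
P = A_pert @ A_pert.T / (N - 1)
P_a = P - P @ H.T @ np.linalg.solve(H @ P @ H.T + R, H @ P)
```

Comparing `A_pert_a @ A_pert_a.T / (N - 1)` with `P_a` shows that the updated perturbations reproduce the analysis covariance without any sampled measurement perturbations.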
– p.19
A square root analysis scheme (4)
[Figure: Residuals: Impact of SQRT analysis. Curves for Exp. B, Exp. E, Exp. H and Exp. G; horizontal axis: simulation number (0 to 50); vertical axis: residual (0.5 to 0.9).]
– p.20
Impact of ensemble size (1)
[Figure: Residuals: Impact of ensemble size. Curves for Exp. B, Exp. E, Exp. B150, Exp. B200, Exp. B250 and Exp. G; horizontal axis: simulation number (0 to 50); vertical axis: residual (0.5 to 0.9).]
– p.21
Impact of ensemble size (2)
[Figure: Residuals: Impact of ensemble size. Curves for Exp. B, Exp. G50, Exp. G52, Exp. G55, Exp. G60, Exp. G75 and Exp. G; horizontal axis: simulation number (0 to 50); vertical axis: residual (0.5 to 0.9).]
– p.22
Summary: Part one
Size matters!
With the right technique size is not all!
Sampling of initial ensemble (and model noise).
Square root formulation for analysis scheme.
Details in Evensen 2004, Ocean Dynamics.
F90 code for new routines available from
http://www.nersc.no/~geir/EnKF
– p.23