On-line estimation of observation error covariance for ensemble-based filters
(Data assimilation into a coupled atmosphere-ocean model with the ensemble Kalman filter)

Genta Ueno
The Institute of Statistical Mathematics
Covariance matrix in DA
State space model
$$x_t = f_t(x_{t-1}) + G_t v_t, \qquad y_t = h_t(x_t) + w_t$$
$$x_1 \sim N(x_b, B), \quad v_t \sim N(0, Q_t), \quad w_t \sim N(0, R_t)$$

Cost function

$$J(x_1, v_{2:T} \mid y_{1:T}) = \frac{1}{2}(x_1 - x_b)^T B^{-1}(x_1 - x_b) + \frac{1}{2}\sum_{t=2}^{T} v_t^T Q_t^{-1} v_t + \frac{1}{2}\sum_{t=1}^{T} \left(y_t - h_t(x_t)\right)^T R_t^{-1} \left(y_t - h_t(x_t)\right),$$
where $x_t = f_t(x_{t-1}) + G_t v_t$.
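A minimal Python/NumPy sketch (not part of the original slides) of how this cost function can be evaluated for a toy model; the functions `f`, `h`, the dimensions, and the covariances in the usage example are illustrative assumptions.

```python
# Sketch: evaluating J(x_1, v_{2:T} | y_{1:T}) for a given trajectory.
import numpy as np

def cost_function(x1, v, y, x_b, B, Q, R, f, h, G):
    """J for the state-space model x_t = f(x_{t-1}) + G v_t, y_t = h(x_t) + w_t."""
    def quad(d, C):
        return 0.5 * d @ np.linalg.solve(C, d)

    J = quad(x1 - x_b, B)                 # background term
    x = x1
    xs = [x]
    for t in range(1, len(y)):            # t = 2, ..., T
        J += quad(v[t - 1], Q)            # system-noise penalty
        x = f(x) + G @ v[t - 1]           # propagate the state
        xs.append(x)
    for t, x in enumerate(xs):            # observation misfit, t = 1, ..., T
        J += quad(y[t] - h(x), R)
    return J

# Example with a 1-D random-walk toy model (illustrative only):
T, rng = 10, np.random.default_rng(0)
y = rng.normal(size=(T, 1))
v = rng.normal(size=(T - 1, 1))
J = cost_function(np.zeros(1), v, y, np.zeros(1), np.eye(1),
                  np.eye(1), np.eye(1), lambda x: x, lambda x: x, np.eye(1))
```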
Filtered estimates with different θ
[Figure: filtered estimates for two settings]
Large Q (large σ_h)
Large R (large α)
Which one should be chosen?
Ensemble approx. of distribution
Gaussian dist.: exactly represented by $N(x_{t|j}, V_{t|j})$ → Kalman filter (KF)
Non-Gaussian dist.: ensemble / particle approximation of $p(x_t)$ → ensemble Kalman filter (EnKF), particle filter (PF)
Kalman filter (KF)
Filtered dist. at t-1: $N(x_{t-1|t-1}, V_{t-1|t-1})$
Predicted dist. at t: $N(x_{t|t-1}, V_{t|t-1})$
Filtered dist. at t (given $y_t$): $N(x_{t|t}, V_{t|t})$

Prediction (simulation):
$$x_{t|t-1} = F_t x_{t-1|t-1}, \qquad V_{t|t-1} = F_t V_{t-1|t-1} F_t^T + G_t Q_t G_t^T$$

Kalman gain:
$$K_t = V_{t|t-1} H_t^T \left( H_t V_{t|t-1} H_t^T + R_t \right)^{-1}$$

Filtering:
$$x_{t|t} = x_{t|t-1} + K_t \left( y_t - H_t x_{t|t-1} \right), \qquad V_{t|t} = (I - K_t H_t) V_{t|t-1}$$
EnKF and PF
EnKF
Propagate each member to get the predicted ensemble $x_{t|t-1}^{(n)}$, $n = 1, \dots, N$.
Approximate the Kalman gain with the ensemble covariance $\hat{V}_{t|t-1}$:
$$\hat{K}_t = \hat{V}_{t|t-1} H_t^T \left( H_t \hat{V}_{t|t-1} H_t^T + R_t \right)^{-1}$$
Update each member with a perturbed observation:
$$x_{t|t}^{(n)} = x_{t|t-1}^{(n)} + \hat{K}_t \left( y_t + w_t^{(n)} - H_t x_{t|t-1}^{(n)} \right)$$

PF
Weight each member by its likelihood $p(y_t \mid x_{t|t-1}^{(n)})$, then resample.
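A minimal sketch (illustrative assumptions) of the EnKF analysis step with perturbed observations, matching the update written above; the function name and array layout are not from the slides.

```python
import numpy as np

def enkf_analysis(X_pred, y, H, R, rng):
    """X_pred: (N, dim_x) predicted ensemble -> (N, dim_x) filtered ensemble."""
    N = X_pred.shape[0]
    x_mean = X_pred.mean(axis=0)
    A = X_pred - x_mean                              # ensemble anomalies
    V_pred = A.T @ A / (N - 1)                       # ensemble covariance
    S = H @ V_pred @ H.T + R                         # innovation covariance
    K = V_pred @ H.T @ np.linalg.inv(S)              # approx. Kalman gain
    X_filt = np.empty_like(X_pred)
    for n in range(N):
        w = rng.multivariate_normal(np.zeros(len(y)), R)   # perturbed observation
        X_filt[n] = X_pred[n] + K @ (y + w - H @ X_pred[n])
    return X_filt
```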
Likelihood

Which is the most likely distribution to have produced the observation $y_{obs}$?
[Figure: three candidate densities $p(y \mid \theta_1)$, $p(y \mid \theta_2)$, $p(y \mid \theta_3)$ evaluated at $y_{obs}$]
Likelihood: $L(\theta) = p(y_{obs} \mid \theta)$
In this example, $\theta_3$ is the most likely.
Likelihood of time series
L q  
p  y1, y 2,
, yT | q 


p  y1 | q   p  y 2 | y1, q   p y 3 | y1, y 2, q 
T
  p y t | y1: t  1, q
t 1



Find θ that maximizes L(θ).
In practice, log-likelihood is easy to handle:
q   log p  y1: T | q 

T
  log p y | y
,q
t 1: t  1
t 1


 p y N | y1, y 2,
,y
T  1, q

Likelihood of time series


$$\ell(\theta) = \sum_{t=1}^{T} \log p(y_t \mid y_{1:t-1}, \theta) = \sum_{t=1}^{T} \log \int p(y_t \mid x_t, \theta)\, p(x_t \mid y_{1:t-1}, \theta)\, dx_t$$

Observation model (likelihood): $y_t = H_t x_t + w_t$, $w_t \sim N(0, R_t)$, so $p(y_t \mid x_t, \theta) = N(H_t x_t, R_t)$.
Predicted dist. $p(x_t \mid y_{1:t-1}, \theta)$: non-Gaussian [due to the nonlinear model]. If it were Gaussian, it would be $N(x_t^f, P_t^f)$.
Estimation of covariance matrix
Minimizing innovation [predicted error]
• Covariance matching
Maximum likelihood
1. With assumption of a Gaussian state distribution
   • Naive
   • Ensemble mean and covariance of state
   • Adjustment according to cost function
   • Matching with innovation covariance
2. Without assumption of a Gaussian state distribution
   • Ensemble mean of likelihood ← this study
Bayes estimation

Ueno et al., Q. J. R. Met. Soc. (2010)
Ensemble approx. of likelihood
$$\ell(\theta) = \log p(y_{1:T} \mid \theta) = \sum_{t=1}^{T} \log p(y_t \mid y_{1:t-1}, \theta) = \sum_{t=1}^{T} \log \int p(y_t \mid x_t, \theta)\, p(x_t \mid y_{1:t-1}, \theta)\, dx_t$$
$$\approx \sum_{t=1}^{T} \log \int p(y_t \mid x_t, \theta)\, \frac{1}{N} \sum_{n=1}^{N} \delta\!\left(x_t - x_{t|t-1}^{(n)}\right) dx_t = \sum_{t=1}^{T} \log \frac{1}{N} \sum_{n=1}^{N} p\!\left(y_t \mid x_{t|t-1}^{(n)}, \theta\right)$$
$$= -\sum_{t=1}^{T} \left[ \frac{\dim y_t}{2} \log 2\pi + \frac{1}{2} \log |R_t| - \log \sum_{n=1}^{N} \exp\!\left( -\frac{1}{2}\left(y_t - H_t x_{t|t-1}^{(n)}\right)^T R_t^{-1} \left(y_t - H_t x_{t|t-1}^{(n)}\right) \right) + \log N \right]$$

Observation model: $y_t = h_t(x_t) + w_t$, $w_t \sim N(0, R_t)$
Ensemble mean of the likelihood of each member $x_{t|t-1}^{(n)}$
• Find θ that maximizes the ensemble approx. log-likelihood.
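A minimal sketch (illustrative assumptions) of the ensemble approximation of the log-likelihood at a single time t, using log-sum-exp for numerical stability; the total $\ell(\theta)$ is the sum of these terms over $t = 1, \dots, T$.

```python
import numpy as np
from scipy.special import logsumexp

def ensemble_loglik_t(y, X_pred, H, R):
    """log[(1/N) sum_n N(y; H x^{(n)}, R)] for the predicted ensemble X_pred (N, dim_x)."""
    N = X_pred.shape[0]
    d = y - X_pred @ H.T                         # innovations, shape (N, dim_y)
    Rinv_d = np.linalg.solve(R, d.T).T           # R^{-1} d_n for each member
    quad = -0.5 * np.einsum('ni,ni->n', d, Rinv_d)
    sign, logdet = np.linalg.slogdet(R)
    return (-0.5 * len(y) * np.log(2 * np.pi) - 0.5 * logdet
            + logsumexp(quad) - np.log(N))
```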
Regularization of Rt
T  dim y
t log 2  1 log
l q    
Rt
2
2
t  1


N
 1

 
 





n
n

1
 log  exp   y t  H t xt | t  1 R t  y t  H t xt | t  1  log N 




 2 
n 1

Regularization with
Gaussian graphical model
12 neighborhood
Sample covariance
(singular due to n<<p)
12
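A sketch of the regularization idea (illustrative only): the sample covariance from a small ensemble is rank-deficient when n << p. As a simple stand-in for the neighborhood-restricted Gaussian graphical model used on the slide, the example below merely tapers the sample covariance so that only entries within a local neighborhood are retained.

```python
import numpy as np

def tapered_covariance(X, width):
    """X: (N, p) samples; keep covariances only within `width` grid points."""
    S = np.cov(X, rowvar=False)                               # rank-deficient if N << p
    p = S.shape[0]
    i, j = np.indices((p, p))
    taper = np.clip(1.0 - np.abs(i - j) / width, 0.0, 1.0)    # linear taper to zero
    return S * taper
```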
Maximum likelihood


$$\ell(\theta) = -\sum_{t=1}^{T} \left[ \frac{\dim y_t}{2} \log 2\pi + \frac{1}{2} \log |R_t| - \log \sum_{n=1}^{N} \exp\!\left( -\frac{1}{2}\left(y_t - H_t x_{t|t-1}^{(n)}\right)^T R_t^{-1} \left(y_t - H_t x_{t|t-1}^{(n)}\right) \right) + \log N \right]$$

Parameters: $\theta = (\sigma_h, L_x, L_y, \alpha)$
$$Q = (q_{ij}), \quad q_{ij} = \sigma_h^2 \exp\!\left( -\frac{(x_i - x_j)^2}{2 L_x^2} - \frac{(y_i - y_j)^2}{2 L_y^2} \right), \qquad R = \alpha\,\Sigma$$
(Σ denotes a prescribed, fixed base covariance matrix.)

Candidate values:
σ_h = 0.1, 0.2, 0.5, 1, 2, 5, 10
L_x = 4, 8, 20, 40
L_y = 1, 2, 5, 10
α = 1, 2, 5, 10, 20, 50, 100, 200, 500
Data and Model
[Figure: time-longitude plot (year vs. longitude); the color shows SSH anomalies.]
Filtered estimates with different θ
[Figure: filtered estimates for two settings]
Large Q (large σ_h)
Large R (large α)
Which one should be chosen?
System noise: magnitude
Profile likelihood:
$$\ell_p(\sigma_h, \alpha) = \max_{L_x, L_y} \ell(\sigma_h, L_x, L_y, \alpha)$$
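A minimal sketch (illustrative assumptions) of profiling the log-likelihood over $(L_x, L_y)$ on the candidate grid for each $(\sigma_h, \alpha)$ pair; `loglik` is a hypothetical function that runs the ensemble filter and returns the ensemble-approximated log-likelihood.

```python
import itertools
import numpy as np

def profile_sigma_alpha(loglik, sigmas, Lxs, Lys, alphas):
    """prof[i, k] = l_p(sigma_h = sigmas[i], alpha = alphas[k])."""
    prof = np.full((len(sigmas), len(alphas)), -np.inf)
    for (i, s), (k, a) in itertools.product(enumerate(sigmas), enumerate(alphas)):
        for Lx, Ly in itertools.product(Lxs, Lys):
            prof[i, k] = max(prof[i, k], loglik(s, Lx, Ly, a))
    return prof
```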
System noise: zonal correlation length
$$\ell_p(L_x, \alpha) = \max_{\sigma_h, L_y} \ell(\sigma_h, L_x, L_y, \alpha)$$
System noise: meridional correlation length
$$\ell_p(L_y, \alpha) = \max_{\sigma_h, L_x} \ell(\sigma_h, L_x, L_y, \alpha)$$
Observation noise: magnitude
$$\ell_p(\alpha) = \max_{\sigma_h, L_x, L_y} \ell(\sigma_h, L_x, L_y, \alpha)$$
Estimates with MLE
[Figure: filtered and smoothed estimates (year vs. longitude)]
$$\theta_{MLE} = (2\,\mathrm{m},\ 20\,\mathrm{deg},\ 5\,\mathrm{deg},\ 20)$$
magnitude = (5.95 cm)$^2$, correlation lengths = (2.38, 2.52) deg
Summary for the first half
• Maximum likelihood estimation can be carried out even for a non-Gaussian state distribution, using the ensemble approximation
• Applicable to ensemble-based filters such as EnKF and PF
• Estimated parameters: $\theta = (\sigma_h, L_x, L_y, \alpha)$, with $\theta_{MLE} = (2\,\mathrm{m}, 20^\circ, 5^\circ, 20)$
  Ueno et al., Q. J. R. Met. Soc. (2010)
• … Tractable for just four parameters?

Motivation for the second half
• The output of DA (i.e. the "analysis") varies with the prescribed parameter θ, where
  θ = (B, Q_{1:T}, R_{1:T})
  B: covariance matrix of the initial state (i.e. V_{0|0})
  Q_t: covariance matrix of the system noise
  R_t: covariance matrix of the observation noise
• My interest is how to construct an optimal θ for a fixed dynamical model
• Only four parameters so far …
• We now allow more degrees of freedom in R_{1:T}
  • up to (dim y_t)²/2 elements
Likelihood of Rt
Current assumption: $\theta = (R_1, R_2, \dots, R_T) = R_{1:T}$; B and $Q_{1:T}$ are fixed.

Log-likelihood:
$$\ell(R_{1:T}) = \sum_{t=1}^{T} \ell_t(R_{1:t})$$
where
$$\ell_t(R_{1:t}) = \ell_t(R_{1:t-1}, R_t) = -\frac{\dim y_t}{2} \log 2\pi - \frac{1}{2} \log |R_t| + \log \sum_{n=1}^{N} \exp\!\left( -\frac{1}{2}\left(y_t - h_t(x_{t|t-1}^{(n)})\right)^T R_t^{-1} \left(y_t - h_t(x_{t|t-1}^{(n)})\right) \right) - \log N$$
The predicted ensemble $x_{t|t-1}^{(n)}$ depends on $R_{1:t-1}$.
Estimation design


$$\ell_t(R_{1:t}) = \ell_t(R_{1:t-1}, R_t) = -\frac{\dim y_t}{2} \log 2\pi - \frac{1}{2} \log |R_t| + \log \sum_{n=1}^{N} \exp\!\left( -\frac{1}{2}\left(y_t - h_t(x_{t|t-1}^{(n)})\right)^T R_t^{-1} \left(y_t - h_t(x_{t|t-1}^{(n)})\right) \right) - \log N$$
• Use ℓ_t(R_{1:t}) for estimating R_t only.
• Of course, R_{1:t-1} are also parameters of ℓ_t(R_{1:t}).
• But they are assumed to have already been estimated with the earlier log-likelihoods ℓ_1(R_1), …, ℓ_{t-1}(R_{1:t-1}), and are kept fixed at the current time step t.
• R_t is estimated at each time step t.
Bad news:
• The estimated R_t may vary significantly between time steps.
• A time-constant R cannot be estimated within the present framework.
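The closed-form optimality condition is not written out on these slides; as a sketch, setting $\partial \ell_t / \partial R_t = 0$ for the log-likelihood above gives $R_t = \sum_n w_n d_n d_n^T$ with likelihood weights $w_n$ and innovations $d_n = y_t - h_t(x_{t|t-1}^{(n)})$, which suggests the following fixed-point iteration (illustrative, for an unconstrained full matrix; structured cases such as diagonal or $a_t I$ would constrain the update accordingly).

```python
import numpy as np

def estimate_Rt(d, R0, n_iter=5):
    """d: (N, dim_y) innovations y_t - h_t(x_{t|t-1}^{(n)}); R0: initial guess."""
    R = R0.copy()
    for _ in range(n_iter):
        q = -0.5 * np.einsum('ni,ni->n', d, np.linalg.solve(R, d.T).T)
        w = np.exp(q - q.max())
        w /= w.sum()                              # likelihood weights of the members
        R = np.einsum('n,ni,nj->ij', w, d, d)     # fixed-point update R = sum_n w_n d_n d_n^T
        # (in practice, regularization is needed if N < dim_y to keep R nonsingular)
    return R
```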
Experiment
• Assumed structure of R_t:
  case 1: $R_t = 20\,\Sigma$ (control)
  case 2: $R_t = a_t\,\Sigma$
  case 3: $R_t = \mathrm{diag}(r_1, \dots, r_m)$
  case 4: $R_t = a_t I$
(Σ is the fixed base covariance matrix from the first half.)
Data and Model
[Figure: time-longitude plot (year vs. longitude); the color shows SSH anomalies.]
Estimate of Rt (Temporal mean)
R  20
t
R a 
t
t

R  diag r ,
t
1
,r
m

R a I
t
t
var
cov
•Case at: similar output for 20.
•Case diagonal: large variance near equator, small variance for offequator
•Case atI: uniform variance with intermediate value
27
Estimate of Rt (Spatial mean)
R  20 R  a 
t
t
t

R  diag r ,
t
1
,r
m

R a I
t
t
var
1992- year -2002
• Case at: small variance for first half, large for second half
• Case diagonal: large variance around 1998
• Case atI: similar for the diagonal case
28
Filtered estimates
R  20
t
R a 
t
t

R  diag r ,
t
1
,r
m

R a I
t
t
•Case at: false positive anomalies in the east
•Case atI: negative anomalies in the east, but the equatorial Kelvin
waves unclear
•Case diagonal: negative anomalies and equatorial Kelvin reproduced
29
Iteration times
R a 
t
t

R  diag r ,
t
1
,r
m

R a I
t
t
• Only 2-4 times
• Small number of parameters requires large iteration numbers
30
Summary of the second half
• An on-line, iterative algorithm for estimating the observation error covariance matrix R_t
• The optimality condition for R_t yields a closed-form condition on R_t
• Applied to a coupled atmosphere-ocean model
• Only 4-5 iterations are necessary
• A diagonal matrix with independent elements produces more likely estimates than scalar multiples of fixed matrices (Σ or I)