Appendix-B: Modelling of MSIs - Proceedings of the Royal Society A

Electronic Supplementary Material
A Sequential Importance Sampling Filter with a New Proposal
Distribution for State and Parameter Estimation of Nonlinear
Dynamical Systems
Shuva. J Ghosh1, C S Manohar2, and D Roy3
Structures Lab, Department of Civil Engineering, Indian Institute of Science
Bangalore 560 012, India
1 Research student
2 Professor, Author for correspondence, Email: [email protected]
3 Associate Professor; Email: [email protected]
Appendix-A: Modeling of Gaussian Multiple Stochastic Integrals
This Appendix demonstrates, using an example, the method of computing covariance
matrices of Gaussian MSI-s, which appear in the Ito-Taylor expansion of a
multidimensional SDE. An n-dimensional Ito SDE driven by an m-dimensional Wiener
process $B(t) = [B_1(t), B_2(t), B_3(t), \ldots, B_m(t)]$ is considered in the following form:
\[ du(t) = \alpha(u(t), u_d, t)\,dt + \sigma(u(t), t)\,dB(t); \quad u(t) \in \mathbb{R}^n, \quad u(0) = u_0 \tag{A-1} \]
where $\alpha : \mathbb{R}^n \times \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^n$ is the drift coefficient, $u_d \in \mathbb{R}^d$ is the
deterministic forcing function, $\sigma : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^{n \times m}$ is the diffusion coefficient matrix,
$dB(t) \in \mathbb{R}^m$ is the increment vector of standard Brownian motion processes and $u_0 \in \mathbb{R}^n$ is
the initial condition vector modeled as a vector of random variables. The equation may be
represented component-wise as:
\[ du^{(i)}(t) = \alpha^{(i)}(u(t), u_d, t)\,dt + \sum_{j=1}^{m} \sigma^{(i,j)}(u(t), t)\,dB^{(j)}(t), \quad u^{(i)}(0) = u_0^{(i)}; \quad i = 1, \ldots, n \tag{A-2} \]
We note that $\sigma^{(i,j)}$ is the (i,j)-th component of the $(n \times m)$ matrix
$\sigma = [\sigma_1, \sigma_2, \ldots, \sigma_m]$ with $\sigma_j$ as the j-th column vector.
Let $\phi(t, u(t))$ be a $C^2$ function, $\phi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^v$. Hence $\phi$ is another Ito process. Then,
using Ito's formula for $t_0 \le t \le s$ (Kloeden and Platen 1992), and writing $X(t)$ for the
solution process, we have:
\[ \phi(s, X(s)) = \phi(t, X(t)) + \sum_{r=1}^{m} \int_t^s \Lambda_r \phi(s_1, X(s_1))\,dB_r(s_1) + \int_t^s L\phi(s_1, X(s_1))\,ds_1 \tag{A-3} \]
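The operators are given by
\[ \Lambda_r = \sum_{j=1}^{n} \sigma_{rj}(t, X)\,\frac{\partial}{\partial X_j}, \qquad
L = \frac{\partial}{\partial t} + \sum_{j=1}^{n} \alpha_j(t, X)\,\frac{\partial}{\partial X_j}
+ \frac{1}{2} \sum_{r=1}^{m} \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{rj}(t, X)\,\sigma_{ri}(t, X)\,\frac{\partial^2}{\partial X_i\,\partial X_j} \tag{A-4 a, b} \]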
Repeated applications of Ito's formula to the functions within the integrals yield the
Ito-Taylor expansion, which can be generated to a desired order of accuracy. One of the
distinguishing features of the stochastic (Ito) Taylor expansion is that it involves multiple
stochastic integrals (MSI-s), which are zero-mean correlated random variables. Some
examples of MSI-s have been provided in section 2 (equations 2.6 and 2.7). These
integrals involve increments of scalar Wiener processes.
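As a point of reference, truncating the Ito-Taylor expansion after the leading dt and dB
terms gives the Euler-Maruyama scheme (strong order 0.5; Kloeden and Platen 1992), in which
the only stochastic integrals are the Gaussian increments of the Wiener process. The Python
sketch below illustrates that lowest-order case only and is not the discretization used in
the paper. The names `euler_maruyama`, `drift` and `diffusion` are hypothetical, and the
forcing $u_d$ is assumed to be absorbed into the drift callable.

```python
import numpy as np

def euler_maruyama(drift, diffusion, u0, t0, t1, n_steps, rng):
    """Lowest-order Ito-Taylor (Euler-Maruyama) path of du = drift dt + diffusion dB.

    drift(u, t)     -> array of shape (n,)    (hypothetical stand-in for the drift in A-1)
    diffusion(u, t) -> array of shape (n, m)  (hypothetical stand-in for the diffusion matrix)
    """
    u = np.array(u0, dtype=float)
    dt = (t1 - t0) / n_steps
    path = [u.copy()]
    for k in range(n_steps):
        t = t0 + k * dt
        sigma = np.atleast_2d(diffusion(u, t))
        # Increment of the m-dimensional Wiener process over one step: N(0, dt * I).
        dB = rng.standard_normal(sigma.shape[1]) * np.sqrt(dt)
        u = u + drift(u, t) * dt + sigma @ dB
        path.append(u.copy())
    return np.array(path)

# Usage: a scalar linear SDE with illustrative (assumed) coefficients.
rng = np.random.default_rng(0)
path = euler_maruyama(lambda u, t: -u,
                      lambda u, t: np.array([[0.3]]),
                      [1.0], 0.0, 1.0, 1000, rng)
```

Higher-order schemes add terms involving the MSI-s discussed in this Appendix, which is why
their joint statistics are needed.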
In what follows, we illustrate the typical procedure for computing the covariance matrix
of a set of Gaussian MSI-s, using Ito's formula to determine the elements of this
covariance matrix. The covariance structure of the typical set of MSI-s $I_r$, $I_{r0}$, $I_{r00}$ and
$I_s$, $r \neq s$, is derived in detail over the interval $(t_k, t_{k+1}]$. Consider the following
scalar SDE-s:
\[ du = dB_r(t), \quad dv = u\,dt \quad \text{and} \quad dw = v\,dt \tag{A-5} \]
subject to zero initial conditions. It follows that $w_{k+1} = I_{r00}$. From Ito's formula, we get
the following scalar SDE for $w^2(t)$:
\[ d(w^2) = 2vw\,dt \tag{A-6} \]
Thus we have $E[w_{k+1}^2] = E[I_{r00}^2] = 2\int_{t_k}^{t_{k+1}} E(vw)\,dt$. Similarly, for $vw$, $uw$, $uv$ and $v^2$, we
have:
\[ d(vw) = (uw + v^2)\,dt, \quad \text{or} \quad E[vw]_{k+1} = E[I_{r0} I_{r00}] = \int_{t_k}^{t_{k+1}} E(uw + v^2)\,dt, \tag{A-7} \]
\[ d(uw) = uv\,dt + w\,dB_r, \quad \text{or} \quad E[uw]_{k+1} = E[I_r I_{r00}] = \int_{t_k}^{t_{k+1}} E(uv)\,dt, \tag{A-8} \]
\[ d(uv) = u^2\,dt + v\,dB_r, \quad \text{or} \quad E[uv]_{k+1} = E[I_r I_{r0}] = \int_{t_k}^{t_{k+1}} E(u^2)\,dt = \frac{\Delta^2}{2}, \tag{A-9} \]
\[ d(v^2) = 2uv\,dt, \quad \text{or} \quad E[v^2]_{k+1} = E[I_{r0}^2] = 2\int_{t_k}^{t_{k+1}} E(uv)\,dt = \frac{\Delta^3}{3}, \tag{A-10} \]
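where $\Delta = t_{k+1} - t_k$. Now, working backwards, we can obtain the other terms of the
covariance matrix: A-8 gives $E[I_r I_{r00}] = \Delta^3/6$, A-7 then gives $E[I_{r0} I_{r00}] = \Delta^4/8$, and
A-6 gives $E[I_{r00}^2] = \Delta^5/20$. To summarize, we have the following covariance matrix:
\[
\begin{Bmatrix} I_r \\ I_{r0} \\ I_{r00} \\ I_s \end{Bmatrix}
\sim N\left(
\begin{Bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{Bmatrix},
\begin{bmatrix}
\Delta & \Delta^2/2 & \Delta^3/6 & 0 \\
\Delta^2/2 & \Delta^3/3 & \Delta^4/8 & 0 \\
\Delta^3/6 & \Delta^4/8 & \Delta^5/20 & 0 \\
0 & 0 & 0 & \Delta
\end{bmatrix}
\right).
\tag{A-11}
\]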
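The covariance matrix in A-11 can be used directly to draw the Gaussian MSI-s, and it can
be checked by brute force against the scalar SDE-s in A-5. The Python sketch below is one
way to do both, under stated assumptions: the function names are hypothetical, samples are
drawn through a Cholesky factor (one of several valid choices), and the brute-force check
uses a plain Euler discretization on a fine sub-grid, so it agrees with A-11 only up to
discretization and Monte Carlo error.

```python
import numpy as np

def msi_covariance(delta):
    """Covariance of [I_r, I_r0, I_r00, I_s] over one step of length delta (A-11)."""
    d = float(delta)
    return np.array([
        [d,        d**2 / 2, d**3 / 6,  0.0],
        [d**2 / 2, d**3 / 3, d**4 / 8,  0.0],
        [d**3 / 6, d**4 / 8, d**5 / 20, 0.0],
        [0.0,      0.0,      0.0,       d],
    ])

def sample_gaussian_msis(delta, n_samples, rng):
    """Draw zero-mean Gaussian MSI vectors with the covariance in A-11."""
    chol = np.linalg.cholesky(msi_covariance(delta))
    return rng.standard_normal((n_samples, 4)) @ chol.T

def simulate_msis(delta, n_paths, n_sub, rng):
    """Brute-force check: integrate du = dB_r, dv = u dt, dw = v dt (A-5)
    from zero initial conditions on n_sub Euler sub-steps."""
    h = delta / n_sub
    u = np.zeros(n_paths)
    v = np.zeros(n_paths)
    w = np.zeros(n_paths)
    for _ in range(n_sub):
        dB = rng.standard_normal(n_paths) * np.sqrt(h)
        w += v * h   # uses v at the start of the sub-step
        v += u * h   # uses u at the start of the sub-step
        u += dB
    return np.stack([u, v, w], axis=1)  # columns: I_r, I_r0, I_r00

# Usage with an illustrative step size.
rng = np.random.default_rng(0)
cov = msi_covariance(1.0)
samples = sample_gaussian_msis(1.0, 200_000, rng)
paths = simulate_msis(1.0, 40_000, 200, rng)
```

Comparing `np.cov(samples, rowvar=False)` with `cov`, and `np.cov(paths, rowvar=False)`
with the leading 3 x 3 block of `cov`, gives a quick numerical check of A-6 to A-10.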
Appendix-B: A Pseudo Code for the Proposed Method
A pseudo-code for the implementation of the SIS filter for state estimation via the new
proposal density is given below. The definitions and dimensions of various variables are
available in sections 2, 3 and 4.
B.1 Governing Equations and Approximations
Start with the governing SDE, given by equation 2.3:
\[ dx(t) = a(x(t), u_d, t)\,dt + b(x(t), t)\,dB(t); \quad x(0) = x_0 \tag{B-1.1} \]
Discretize the SDE B-1.1 by the Ito-Taylor expansion to arrive at the following process
equation in discrete time:
\[ x_{k+1} = a_k(x_k, u_d^k) + b_k(x_k, u_d^k)\,w_k + c_k(x_k, u_d^k)\,\zeta_k; \quad k = 0, 1, 2, \ldots \tag{B-1.2} \]
Measurements (system response sampled at a set of discrete time instants) are assumed to
be modeled by the following equation:
\[ y_k = h_k(x_k) + q_k(x_k)\,m_k; \quad k = 1, 2, 3, \ldots \tag{B-1.3} \]
Following the approximations provided by equations 4.2a and 4.2b, the measurement
equation may be written as:
\[ y_k = h_k(x_{k-1}) + B_k(x_{k-1})\,m_k + B_{1k}(x_{k-1})\,\xi_k; \quad k = 1, 2, 3, \ldots \tag{B-1.4} \]
B.2 Computational Implementation of the Filter


1. Set k = 0; draw samples $\{x_{i,0}\}_{i=1}^{N}$ from $p(x_0)$ and assign initial weights
   $\{W(x_{i,0:0})\}_{i=1}^{N}$.
2. For k = 1, 2, ...
A. Sampling and Weight Calculation
Calculate $Q$, $R$ following the procedure outlined in appendix A.
For i = 1, 2, ..., N:
a. Importance Sampling to Generate Samples
   $C_4 = b_{k-1}(x_{i,k-1}, u_d^{k-1})\,Q\,b_{k-1}^T(x_{i,k-1}, u_d^{k-1})$;
   For j = 1, 2, ..., $N_1$:
   i.   Generate samples of the non-Gaussian MSI-s
        $\theta_{i,k}^{j} = [\zeta_{i,k-1}^{jT}, \xi_{i,k}^{jT}]^T$ using the appropriate formulae.
   ii.  $M_{i,2}^{j} = h_k(x_{i,k-1}) + B_{1k}(x_{i,k-1})\,\xi_{i,k}^{j}$
        $M_{i,3}^{j} = [\{a_{k-1}(x_{i,k-1}, u_d^{k-1}) + c_{k-1}(x_{i,k-1}, u_d^{k-1})\,\zeta_{i,k-1}^{j}\}^T, \{h_k(x_{i,k-1}) + B_{1k}(x_{i,k-1})\,\xi_{i,k}^{j}\}^T]^T$
        $M_{i,4}^{j} = a_{k-1}(x_{i,k-1}, u_d^{k-1}) + c_{k-1}(x_{i,k-1}, u_d^{k-1})\,\zeta_{i,k-1}^{j}$
   iii. $C_{i,2}^{j} = B_k(x_{i,k-1}, \theta_{i,k}^{j})\,R\,B_k^T(x_{i,k-1}, \theta_{i,k}^{j})$
   iv.  For the specific problem, obtain $\eta$ (the vector of Gaussian random variables
        consisting of all the elements of the Gaussian vectors $w_{k-1}$ and $m_k$) and
        $\Gamma(x_{k-1}, \theta_k)$ (the corresponding coefficient matrix obtainable from the joint
        representation of $x_k$ and $y_k$ given $x_{k-1}$, $\theta_k$). Now find:
        $C_{i,3}^{j} = \Gamma(x_{i,k-1}, \theta_{i,k}^{j})\,E[\eta\eta^T]\,\Gamma^T(x_{i,k-1}, \theta_{i,k}^{j})$.
        Pick out the cross covariance terms $\tilde{C}_{i,3}^{j}$ of the matrix $C_{i,3}^{j}$.
   v.   $M_{i,I}^{j} = M_{i,4}^{j} + \tilde{C}_{i,3}^{j}\,(C_{i,2}^{j})^{-1}\,(y_k - M_{i,2}^{j})$;
        $C_{i,I}^{j} = C_4 - \tilde{C}_{i,3}^{j}\,(C_{i,2}^{j})^{-1}\,\tilde{C}_{i,3}^{jT}$.
   Using a histogram of the generated samples $\theta_{i,k} = \{\theta_{i,k}^{j}\}_{j=1}^{N_1}$, obtain the
   weights $p_i^j$ associated with the samples. The optimal ispdf is given by
   $\pi(x_k) = \sum_{j=1}^{N_1} p_i^j\,N(M_{i,I}^{j}, C_{i,I}^{j})$, from which the sample $x_k^{(i)}$ is drawn.
b. Calculation of Weights
   The calculation of the weights needs the computation of the following densities.
   i.   Evaluate the Gaussian mixture density $\pi(x_k)$ at the sampled value $x_k^{(i)}$.
   ii.  When the discretization of the process equation leads to non-Gaussian MSI-s,
        $p(x_k | x_{i,k-1})$ may be evaluated as a mixture of Gaussian densities as
        $p(x_k | x_{i,k-1}) = \sum_{s=1}^{N_1} q_i^s\,p(x_k | x_{i,k-1}, \zeta_{i,k-1}^{s})$, where
        $p(x_k | x_{i,k-1}, \zeta_{i,k-1}^{s}) \sim N(M_{i,4}^{s}, C_4)$ and $q_i^s$ is obtained from the
        histogram of the generated samples of $\zeta_{i,k-1} = \{\zeta_{i,k-1}^{j}\}_{j=1}^{N_1}$. This Gaussian
        mixture density is evaluated at the sampled value $x_k^{(i)}$.
   iii. $p(y_k | x_{i,k}) = N(h_k(x_{i,k}), C_5(x_{i,k}))$ with $C_5(x_{i,k}) = q_k(x_{i,k})\,R\,q_k(x_{i,k})^T$.
   Calculate the weight as:
   \[ W(x_{i,0:k}) = W(x_{i,0:k-1})\,\frac{p(y_k | x_{i,k})\,p(x_{i,k} | x_{i,k-1})}{\pi(x_{i,k})}. \]
B. Resampling
The effective sample size $N_{eff} = 1 / \sum_{i=1}^{N} W^2(x_{i,k})$ is calculated and resampling is done using
any standard resampling algorithm if it goes below the threshold sample size $N_{thres}$.
Following resampling, $\{x_{0:k}^{i}\}_{i=1}^{N}$ is the set of final samples retained. Set $k = k + 1$.
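The generic building blocks of step 2 (drawing from a Gaussian mixture proposal, evaluating
mixture densities, updating the weights and resampling) can be sketched in Python as
follows. This is a minimal illustration rather than the authors' implementation: all
function names are hypothetical, the weights are updated in the log domain for numerical
stability (a choice not stated in the pseudo-code), the weights entering the effective
sample size are assumed to be normalized, and systematic resampling is used as one example
of a "standard resampling algorithm".

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at x."""
    x, mean = np.atleast_1d(x), np.atleast_1d(mean)
    chol = np.linalg.cholesky(np.atleast_2d(cov))
    z = np.linalg.solve(chol, x - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (z @ z + log_det + x.size * np.log(2.0 * np.pi))

def mixture_logpdf(x, probs, means, covs):
    """Log of sum_j probs[j] * N(x; means[j], covs[j]) (steps b.i and b.ii)."""
    terms = [np.log(p) + mvn_logpdf(x, m, c) for p, m, c in zip(probs, means, covs)]
    return np.logaddexp.reduce(terms)

def draw_from_mixture(probs, means, covs, rng):
    """Draw one sample from the Gaussian mixture proposal (end of step a)."""
    j = rng.choice(len(probs), p=probs)
    return rng.multivariate_normal(means[j], covs[j])

def update_log_weight(log_w_prev, log_lik, log_trans, log_prop):
    """log W_k = log W_{k-1} + log p(y|x) + log p(x|x_prev) - log pi(x)."""
    return log_w_prev + log_lik + log_trans - log_prop

def normalise(log_w):
    """Convert log-weights to normalized weights that sum to one."""
    w = np.exp(log_w - np.max(log_w))
    return w / np.sum(w)

def effective_sample_size(w):
    """N_eff = 1 / sum_i W_i^2 for normalized weights w."""
    return 1.0 / np.sum(np.square(w))

def systematic_resample(w, rng):
    """Indices of retained particles under systematic resampling."""
    n = len(w)
    positions = (rng.random() + np.arange(n)) / n
    cum = np.cumsum(w)
    cum[-1] = 1.0  # guard against round-off in the last entry
    return np.searchsorted(cum, positions, side="right")

# Usage with illustrative log-weights and an assumed threshold.
rng = np.random.default_rng(0)
w = normalise(np.array([-1.0, -2.0, -0.5, -3.0]))
n_thres = 3.0
if effective_sample_size(w) < n_thres:
    idx = systematic_resample(w, rng)  # particles x[idx] are retained
    w = np.full(len(w), 1.0 / len(w))  # weights are reset after resampling
```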
Appendix-C: Simplifications of the Proposed Density in the
Presence of Gaussian Random Variables Only
As noted in comment (c) (section 4), in cases where the discretization of neither the
process SDE nor the measurement equation leads to non-Gaussian MSI-s, the general
filtering strategy proposed in section 4 can be considerably simplified. This situation
occurs commonly in structural system identification, where noises are typically
considered to be additive. The order of Ito-Taylor discretization employed to obtain a
faithful representation of the time-continuous system is often such that non-Gaussian
MSI-s do not arise. Also, if the order to which the nonlinear observation function is
approximated is such that non-Gaussian MSI-s do not occur, then the random variables
occurring in the problem are entirely Gaussian.
In this case, the discretized version of equation 2.3 may be cast in the form:
\[ x_{k+1} = a_k(x_k, u_d^k) + b_k(x_k, u_d^k)\,w_k \tag{C-1} \]
where $a_k : \mathbb{R}^s \times \mathbb{R}^f \to \mathbb{R}^s$, $b_k : \mathbb{R}^s \times \mathbb{R}^f \to \mathbb{R}^{s \times n_2}$, $w_k \in \mathbb{R}^{n_2}$. The vector $w_k$ contains
lower order MSI-s that are strictly Gaussian (with $w_k \sim N[M_1, C_1]$, $M_1 = 0$). Also, the
measurement equation 2.9 can be approximated as:
\[ y_k = h_k(x_{k-1}) + B_k(x_{k-1})\,m_k \tag{C-2} \]
Here $h_k : \mathbb{R}^s \times \mathbb{R} \to \mathbb{R}^m$, the vector $m_k \in \mathbb{R}^{n_5}$ represents the Gaussian terms that result from
the approximation and $B_k : \mathbb{R}^s \times \mathbb{R} \to \mathbb{R}^{m \times n_5}$. It follows from equation C-2 that
$p(y_k | x_{k-1})$ is Gaussian and is given by:
\[ p(y_k | x_{k-1}) = N(M_2, C_2) \tag{C-3} \]
where $M_2 = h_k(x_{k-1})$, $C_2 = B_k(x_{k-1})\,R\,B_k^T(x_{k-1})$ and $R$ is the covariance
matrix of the resultant Gaussian noise vector $m_k$ in equation C-2. Considering equations
C-1 and C-2, it follows that $x_k$ and $y_k$ are jointly Gaussian given $x_{k-1}$, i.e.,
\[ p(y_k, x_k | x_{k-1}) = N(M_3, C_3) \tag{C-4} \]
with $M_3 = \{a_{k-1}(x_{k-1}, u_d^{k-1}), h_k(x_{k-1})\}^T$ and $C_3 = \Gamma(x_{k-1})\,E[\eta\eta^T]\,\Gamma^T(x_{k-1})$, where, as in
Appendix B, $\eta$ is the vector of Gaussian random variables consisting of all the elements
of $w_{k-1}$ and $m_k$, and $\Gamma$ is the corresponding coefficient matrix. From the
discrete map C-1, it follows that:
\[ p(x_k | x_{k-1}) = N(M_4, C_4) \tag{C-5} \]
where $M_4 = a_{k-1}(x_{k-1}, u_d^{k-1})$, $C_4 = b_{k-1}(x_{k-1}, u_d^{k-1})\,Q\,b_{k-1}^T(x_{k-1}, u_d^{k-1})$, and $Q$ is the
covariance matrix of $w_{k-1}$. Based on the theory of vector Gaussian random variables, it
may be shown that:
\[ p(x_k | y_k, x_{k-1}) = N(M_I, C_I) \tag{C-6} \]
with $M_I = M_4 + \tilde{C}_3 C_2^{-1}(y_k - M_2)$; $C_I = C_4 - \tilde{C}_3 C_2^{-1} \tilde{C}_3^T$. Here $\tilde{C}_3$ denotes the cross terms
of the covariance matrix $C_3$. This is the required optimal ispdf, which turns out to be
Gaussian instead of a weighted Gaussian mixture density. In this case, the drawing of
samples and calculation of the corresponding weights become simpler. Accordingly, the
computational overhead is significantly reduced due to the avoidance of the additional
Monte Carlo steps.
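The conditioning step in C-6 is the standard formula for the conditional distribution of
jointly Gaussian vectors. A minimal Python sketch is given below; the function name
`gaussian_ispdf` and its argument names are hypothetical, and the inputs are assumed to
have already been assembled from C-3 to C-5 for a given particle.

```python
import numpy as np

def gaussian_ispdf(m4, c4, m2, c2, c3_cross, y):
    """Mean M_I and covariance C_I of p(x_k | y_k, x_{k-1}) as in C-6.

    m4, c4   : mean and covariance of x_k given x_{k-1}            (C-5)
    m2, c2   : mean and covariance of y_k given x_{k-1}            (C-3)
    c3_cross : cross-covariance of x_k and y_k (cross terms of C_3)
    y        : the measurement y_k
    """
    m4, m2, y = (np.atleast_1d(np.asarray(a, dtype=float)) for a in (m4, m2, y))
    c4, c2, c3_cross = (np.atleast_2d(np.asarray(a, dtype=float)) for a in (c4, c2, c3_cross))
    # gain = c3_cross @ inv(c2), computed by a linear solve (c2 is symmetric).
    gain = np.linalg.solve(c2, c3_cross.T).T
    mean = m4 + gain @ (y - m2)
    cov = c4 - gain @ c3_cross.T
    return mean, 0.5 * (cov + cov.T)  # symmetrize against round-off

# Usage: scalar check with x ~ N(0, 1) and y = x + e, e ~ N(0, 1), observed y = 2.
mean, cov = gaussian_ispdf(0.0, 1.0, 0.0, 2.0, 1.0, 2.0)
rng = np.random.default_rng(0)
x_sample = rng.multivariate_normal(mean, cov)  # one draw from the Gaussian ispdf
```

In the scalar check the posterior mean is 1 and the posterior variance is 0.5, as expected
for two equally informative Gaussian sources.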
Appendix-D: The Special Case of Linear Measurement Equations
and only Additive Gaussian Noises
When the process and measurement equations conform to the format given in equation
3.9, the proposed optimal ispdf may be shown to reduce to the ideal Gaussian ispdf valid
for the system given by equation 3.10. Here the proposed method leads to the already
existing closed form solution as the nonlinearity in the measurement equation goes to
zero.
We apply the proposed method to arrive at an optimal ispdf for the system given by
equation 3.9. Comparing it with the measurement equation 2.9, it follows that the
observation function for this case is given by $h_k(t, Z(t)) = HZ(t)$. Since $H$ is constant, we
have the following identity via equation 4.2a:
\[ h_k(t_k, Z(t_k)) = HZ(t_k) = Hf(Z_{k-1}) + H\nu_k \tag{D-1} \]
From equation D-1, it can be shown (in the same manner as in section 4 or appendix C)
that the importance function turns out to be a Gaussian density whose mean and
covariance are specified by equation 3.10.
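The reduction can be checked numerically. Equations 3.9 and 3.10 are not reproduced in this
supplement, so the Python sketch below rests on assumptions: the system is taken to be
$Z_k = f(Z_{k-1}) + \nu_k$, $y_k = HZ_k + e_k$ with $\nu_k \sim N(0, Q)$ and $e_k \sim N(0, R)$, and the
closed-form Gaussian importance density is taken in its usual information form. All function
names and numerical values are hypothetical. Under these assumptions, Gaussian conditioning
as in Appendix C and the closed form give the same mean and covariance.

```python
import numpy as np

def ispdf_by_conditioning(f_prev, H, Q, R, y):
    """Condition the joint Gaussian of (Z_k, y_k) given Z_{k-1} on y_k (as in C-6)."""
    c2 = H @ Q @ H.T + R           # covariance of y_k given Z_{k-1}
    cross = Q @ H.T                # cross-covariance of Z_k and y_k
    gain = np.linalg.solve(c2, cross.T).T
    mean = f_prev + gain @ (y - H @ f_prev)
    cov = Q - gain @ cross.T
    return mean, cov

def ispdf_closed_form(f_prev, H, Q, R, y):
    """Assumed closed form: cov = (Q^-1 + H^T R^-1 H)^-1,
    mean = cov (Q^-1 f(Z_{k-1}) + H^T R^-1 y_k)."""
    q_inv = np.linalg.inv(Q)
    r_inv = np.linalg.inv(R)
    cov = np.linalg.inv(q_inv + H.T @ r_inv @ H)
    mean = cov @ (q_inv @ f_prev + H.T @ r_inv @ y)
    return mean, cov

# Usage: two states, one measurement, illustrative values only.
f_prev = np.array([0.5, -1.0])               # f(Z_{k-1}) for one particle
H = np.array([[1.0, 0.0]])
Q = np.array([[0.2, 0.05], [0.05, 0.1]])
R = np.array([[0.3]])
y = np.array([0.9])
mean_a, cov_a = ispdf_by_conditioning(f_prev, H, Q, R, y)
mean_b, cov_b = ispdf_closed_form(f_prev, H, Q, R, y)
```

The agreement of the two forms is an instance of the matrix inversion lemma.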
Reference
Kloeden, P.E. & Platen, E. 1992 Numerical solution of stochastic differential equations,
Springer, Berlin.