Convergence of Sequential Monte Carlo Methods

Dan Crisan, Arnaud Doucet
Problem Statement

X: signal process, Y: observation process

X satisfies $X_0 \sim \pi_0(dx_0)$ and evolves according to the following equation,
$$\Pr(X_t \in A_t \mid Y_{0:t-1} = y_{0:t-1},\, X_{0:t-1} = x_{0:t-1}) = \int_{A_t} k_t(y_{0:t}, x_{0:t-1}, dx_t), \qquad A_t \in \mathcal{B}(\mathbb{R}^{n_x})$$

Y satisfies
$$\Pr(Y_t \in B_t \mid Y_{0:t-1} = y_{0:t-1},\, X_{0:t} = x_{0:t}) = \int_{B_t} g_t(y_{0:t}, x_{0:t})\, dy_t, \qquad B_t \in \mathcal{B}(\mathbb{R}^{n_y})$$
Bayes’ recursion

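Notation: $\pi_t$ is the posterior distribution of $X_{0:t}$ given $y_{0:t}$, $\pi_{t|t-1}$ is the predicted distribution, and $(\mu, f) = \int f\, d\mu$.

Prediction
$$\pi_{t|t-1}(A_{0:t}) = \int_{A_{0:t-1}} k_t(y_{0:t}, x_{0:t-1}, A_t)\, \pi_{t-1}(dx_{0:t-1})$$
$$(\pi_{t|t-1}, f_t) = (\pi_{t-1}, k_t f_t)$$

Updating
$$\pi_t(A_{0:t}) = C_t^{-1} \int_{A_{0:t}} g_t(y_{0:t}, x_{0:t})\, \pi_{t|t-1}(dx_{0:t})$$
$$(\pi_t, f_t) = (\pi_{t|t-1}, f_t g_t)\, (\pi_{t|t-1}, g_t)^{-1}$$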
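To make the two steps concrete, here is a minimal sketch of one prediction/update cycle on a two-state space. It is an illustration only: it tracks the marginal of the current state rather than the path $x_{0:t}$, and the transition matrix `K`, the prior `pi0` and the likelihood values `g` are made-up numbers, not taken from the slides.

```python
import numpy as np

def predict(pi_prev, K):
    """Prediction: (pi_pred, f) = (pi_prev, K f), i.e. pi_pred = pi_prev K."""
    return pi_prev @ K

def update(pi_pred, g):
    """Updating: reweight by the likelihood g and divide by C = (pi_pred, g)."""
    unnorm = pi_pred * g
    return unnorm / unnorm.sum()

# Hypothetical two-state model (all numbers are assumptions for illustration).
K = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # K[i, j] = Pr(X_t = j | X_{t-1} = i)
pi0 = np.array([0.5, 0.5])      # previous posterior
g = np.array([0.7, 0.1])        # likelihood of the new observation in each state

pi_pred = predict(pi0, K)       # predicted distribution
pi_post = update(pi_pred, g)    # posterior after the observation
print(pi_pred, pi_post)
```

On a finite state space the integrals become matrix-vector products, which is why the recursion can be carried out exactly here; the particle method is needed when no such closed form exists.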
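A Sequential Monte Carlo Method

Empirical measure
$$\pi_t^N(dx_{0:t}) = \frac{1}{N} \sum_{i=1}^N \delta_{x_{0:t}^{(i)}}(dx_{0:t})$$
Transition kernel $\kappa_t(y_{0:t}, x_{0:t-1}, \pi_{t-1}, dx_{0:t})$
Importance distribution $\tilde{\pi}_t = \pi_{t-1}\kappa_t$

The predicted distribution $\pi_{t|t-1}$: abs. continuous with respect to $\tilde{\pi}_t$

$h_t$: strictly positive Radon-Nikodym derivative
$$\frac{d\pi_{t|t-1}}{d\tilde{\pi}_t} \propto h_t$$

Then the posterior $\pi_t$ is also abs. continuous w.r.t. $\tilde{\pi}_t$ and
$$\frac{d\pi_t}{d\tilde{\pi}_t} \propto g_t h_t$$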
Algorithm

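Step 1: Sequential importance sampling

sample: $\tilde{x}_{0:t}^{(i)} \sim \kappa_t(y_{0:t}, x_{0:t-1}^{(i)}, \pi_{t-1}^N, d\tilde{x}_{0:t})$
evaluate normalized importance weights
$$w_t^{(i)} \propto g_t(y_{0:t}, \tilde{x}_{0:t}^{(i)})\, h_t(y_{0:t}, \pi_{t-1}^N, \tilde{x}_{0:t}^{(i)}), \qquad \sum_{i=1}^N w_t^{(i)} = 1$$
and let
$$\tilde{\pi}_t^N(dx_{0:t}) = \frac{1}{N} \sum_{i=1}^N \delta_{\tilde{x}_{0:t}^{(i)}}(dx_{0:t})$$
$$\bar{\pi}_t^N(dx_{0:t}) = \sum_{i=1}^N w_t^{(i)}\, \delta_{\tilde{x}_{0:t}^{(i)}}(dx_{0:t})$$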
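A minimal sketch of this step, under assumptions that are not on the slides: a scalar AR(1) signal with Gaussian observation noise (the parameters `a`, `sigma_x`, `sigma_y` are hypothetical), and the prior dynamics used as the sampling kernel, in which case $h_t$ is constant and the weights are proportional to $g_t$ alone. Only the current state is stored, not the whole path.

```python
import numpy as np

def sis_step(x_prev, y, rng, a=0.9, sigma_x=1.0, sigma_y=0.5):
    """One importance-sampling step with the prior dynamics as proposal."""
    # Sample new states from the (hypothetical) AR(1) transition.
    x_new = a * x_prev + sigma_x * rng.standard_normal(x_prev.shape)
    # Log-likelihood log g_t for y = x + N(0, sigma_y^2), up to a constant.
    log_w = -0.5 * ((y - x_new) / sigma_y) ** 2
    # Subtract the maximum before exponentiating to avoid underflow.
    w = np.exp(log_w - log_w.max())
    return x_new, w / w.sum()

rng = np.random.default_rng(0)
x_prev = rng.standard_normal(1000)          # particles at time t-1
x_new, w = sis_step(x_prev, y=0.3, rng=rng)
estimate = np.sum(w * x_new)                # weighted estimate of E[X_t | y]
print(estimate)
```

Working with log-weights and normalising at the end is a common design choice: the unnormalised weights can be extremely small, while only their ratios matter.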

Step 2: Selection step

multiply/discard particles $\{\tilde{x}_{0:t}^{(i)}\}_{i=1}^N$ with high/low importance weights $w_t^{(i)}$ to obtain N particles $\{\hat{x}_{0:t}^{(i)}\}_{i=1}^N$
let the associated empirical measure be
$$\hat{\pi}_t^N(dx_{0:t}) = \frac{1}{N} \sum_{i=1}^N \delta_{\hat{x}_{0:t}^{(i)}}(dx_{0:t})$$
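One common selection scheme is multinomial resampling, sketched below. The slides do not prescribe a particular scheme, so this is only one possible instance; the particle values and weights are made up for illustration.

```python
import numpy as np

def multinomial_resample(particles, weights, rng):
    """Draw offspring counts from a multinomial and replicate particles."""
    n = len(weights)
    # counts[i] = number of copies of particle i; the counts sum to n.
    counts = rng.multinomial(n, weights)
    idx = np.repeat(np.arange(n), counts)
    return particles[idx], counts

rng = np.random.default_rng(1)
particles = np.array([-1.0, 0.0, 2.0, 5.0])
weights = np.array([0.1, 0.0, 0.6, 0.3])    # normalized importance weights
resampled, counts = multinomial_resample(particles, weights, rng)
print(counts, resampled)
```

After resampling all particles carry equal weight 1/N, so the selected population defines an unweighted empirical measure.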
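Step 3: MCMC step

sample $x_{0:t}^{(i)} \sim K_t(\{\hat{x}_{0:t}^{(j)}\}_{j=1}^N, dx_{0:t})$, where $\hat{x}_{0:t}^{(j)}$ are the particles kept by Step 2 and $K_t$ is a Markov kernel of invariant distribution $\pi_t(dx_{0:t})$,
and let
$$\pi_t^N(dx_{0:t}) = \frac{1}{N} \sum_{i=1}^N \delta_{x_{0:t}^{(i)}}(dx_{0:t})$$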
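As a rough illustration of a kernel with a prescribed invariant distribution, the sketch below applies one random-walk Metropolis-Hastings move to each particle independently. This is a simplification of the step above: the kernel on the slides acts on paths and may depend on the whole population, and the standard normal target used here is only a stand-in for $\pi_t$. The names `mh_move`, `log_std_normal` and `step` are hypothetical.

```python
import numpy as np

def mh_move(particles, log_target, rng, step=0.5):
    """One random-walk Metropolis-Hastings move per particle."""
    proposal = particles + step * rng.standard_normal(particles.shape)
    # Symmetric proposal, so the acceptance ratio is target(prop)/target(old).
    log_alpha = log_target(proposal) - log_target(particles)
    accept = np.log(rng.random(particles.shape)) < log_alpha
    return np.where(accept, proposal, particles), accept.mean()

def log_std_normal(x):
    """Log-density of N(0, 1) up to a constant (stand-in target)."""
    return -0.5 * x ** 2

rng = np.random.default_rng(2)
particles = rng.standard_normal(20000)      # already distributed as the target
moved, acc_rate = mh_move(particles, log_std_normal, rng)
print(acc_rate, moved.mean(), moved.var())
```

Because the kernel leaves the target invariant, particles that start from the target remain distributed according to it, while duplicated particles produced by the selection step are spread apart.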
Convergence Study
denote $\|f\| = \sup_x |f(x)|$
- Prove convergence to 0 of the average mean square error
$$E\left[\left((\pi_t^N, f_t) - (\pi_t, f_t)\right)^2\right]$$
under quite general conditions
- Then prove (almost sure) convergence of $\pi_t^N$ toward $\pi_t$ under more restrictive conditions


Bounds for mean square errors

Assumptions

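1.-A Importance distribution and weights
The predicted distribution $\pi_{t|t-1}$ is assumed abs. continuous with respect to $\tilde{\pi}_t$ for all $\pi_{t-1} \in \mathcal{P}((\mathbb{R}^{n_x})^t)$, and
$$g_t(y_{0:t}, x_{0:t})\, h_t(y_{0:t}, \pi_{t-1}, x_{0:t})$$
is a bounded function in argument $x_{0:t} \in (\mathbb{R}^{n_x})^{t+1}$, so that
$$(\pi_t, f_t) = \frac{(\tilde{\pi}_t, f_t g_t h_t)}{(\tilde{\pi}_t, g_t h_t)}$$
For two measures $\mu$, $\nu$ define
$$\kappa_t^{\mu} = \kappa_t(y_{0:t}, x_{0:t-1}, \mu, d\tilde{x}_{0:t}), \qquad \kappa_t^{\nu} = \kappa_t(y_{0:t}, x_{0:t-1}, \nu, d\tilde{x}_{0:t}),$$
$$h_t^{\mu}(\cdot) = h_t(y_{0:t}, \mu, \cdot), \qquad h_t^{\nu}(\cdot) = h_t(y_{0:t}, \nu, \cdot)$$
There exists a constant $d_t$ s.t. for all $f_t \in B((\mathbb{R}^{n_x})^{t+1})$ there exists $f_{t-1} \in B((\mathbb{R}^{n_x})^{t})$ with $\|f_{t-1}\| \le \|f_t\|$ s.t.
$$\|\kappa_t^{\mu} f_t - \kappa_t^{\nu} f_t\| \le d_t\, |(\mu, f_{t-1}) - (\nu, f_{t-1})|$$
There exists $f_h$ s.t.
$$\|g_t h_t^{\mu} - g_t h_t^{\nu}\| \le |(\mu, f_h) - (\nu, f_h)|$$
and a constant $e_t$ s.t.
$$|h_t^{\mu}(x_{0:t}) - h_t^{\nu}(x_{0:t})| \le e_t \min\left(h_t^{\mu}(x_{0:t}),\, h_t^{\nu}(x_{0:t})\right)$$
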

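2.-A Resampling/Selection scheme
With $N_t^{(i)}$ the number of copies of particle i produced by the selection step, for any vector $(q^{(1)}, \dots, q^{(N)})$
$$E\left[\left(\sum_{i=1}^N \left(N_t^{(i)} - N w_t^{(i)}\right) q^{(i)}\right)^2\right] \le C_t\, N \max_i \left(q^{(i)}\right)^2$$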

The first assumption ensures that
- the importance function is chosen so that the corresponding importance weights are bounded above;
- the sampling kernel and importance weights depend "continuously" on the measure variable.
The second assumption ensures that
- the selection scheme does not introduce too strong a "discrepancy".
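As a worked example of the second assumption (this derivation is not on the slides and assumes multinomial selection, i.e. given the weights the offspring counts $N_t^{(1)}, \dots, N_t^{(N)}$ follow a multinomial distribution with N trials and probabilities $w_t^{(1)}, \dots, w_t^{(N)}$), the bound appears to hold with $C_t = 1$. The multinomial covariance is $\mathrm{Cov}(N_t^{(i)}, N_t^{(j)} \mid w_t) = N\,(w_t^{(i)} \mathbf{1}_{i=j} - w_t^{(i)} w_t^{(j)})$, so that

```latex
\begin{align*}
E\Big[\Big(\sum_{i=1}^N \big(N_t^{(i)} - N w_t^{(i)}\big)\, q^{(i)}\Big)^2 \,\Big|\, w_t\Big]
  &= \sum_{i,j=1}^N q^{(i)} q^{(j)}\, \mathrm{Cov}\big(N_t^{(i)}, N_t^{(j)} \mid w_t\big) \\
  &= N \Big( \sum_{i=1}^N w_t^{(i)} \big(q^{(i)}\big)^2
       - \Big(\sum_{i=1}^N w_t^{(i)} q^{(i)}\Big)^2 \Big) \\
  &\le N \sum_{i=1}^N w_t^{(i)} \big(q^{(i)}\big)^2
   \;\le\; N \max_i \big(q^{(i)}\big)^2 .
\end{align*}
```

Taking the expectation over the weights preserves the inequality, which gives the unconditional bound.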

Lemma 1

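Let us assume that for any $f_{t-1} \in B((\mathbb{R}^{n_x})^{t})$
$$E\left[\left((\pi_{t-1}^N, f_{t-1}) - (\pi_{t-1}, f_{t-1})\right)^2\right] \le c_{t-1} \frac{\|f_{t-1}\|^2}{N}$$
then after step 1, for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$
$$E\left[\left((\tilde{\pi}_t^N, f_t) - (\tilde{\pi}_t, f_t)\right)^2\right] \le \tilde{c}_t \frac{\|f_t\|^2}{N}$$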
Lemma 2

Let us assume that for any $f_{t-1} \in B((\mathbb{R}^{n_x})^{t})$
$$E\left[\left((\pi_{t-1}^N, f_{t-1}) - (\pi_{t-1}, f_{t-1})\right)^2\right] \le c_{t-1} \frac{\|f_{t-1}\|^2}{N}$$
then for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$, with $\bar{\pi}_t^N$ the weighted empirical measure of step 1,
$$E\left[\left((\bar{\pi}_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le \bar{c}_t \frac{\|f_t\|^2}{N}$$

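Lemma 3

Let us assume that for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$, with $\bar{\pi}_t^N$ the weighted empirical measure of step 1,
$$E\left[\left((\bar{\pi}_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le \bar{c}_t \frac{\|f_t\|^2}{N}$$
then after step 2, for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$, with $\hat{\pi}_t^N$ the empirical measure of the selected particles,
$$E\left[\left((\hat{\pi}_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le \hat{c}_t \frac{\|f_t\|^2}{N}$$

Lemma 4

Let us assume that for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$
$$E\left[\left((\hat{\pi}_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le \hat{c}_t \frac{\|f_t\|^2}{N}$$
then after step 3, for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$
$$E\left[\left((\pi_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le c_t \frac{\|f_t\|^2}{N}$$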
Theorem 1

For all $t \ge 0$, there exists $c_t$ independent of N s.t. for any $f_t \in B((\mathbb{R}^{n_x})^{t+1})$
$$E\left[\left((\pi_t^N, f_t) - (\pi_t, f_t)\right)^2\right] \le c_t \frac{\|f_t\|^2}{N}$$
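The 1/N rate can be checked numerically in a toy case where the exact answer is known. The sketch below is not from the slides and covers a single weighting step only: the prior is N(0, 1), the observation is y = x + N(0, 1) noise with a hypothetical value y = 1, so the posterior is N(y/2, 1/2), and the test function is the bounded indicator of {x > 0}. If the bound holds, N times the mean square error should stay roughly constant as N grows.

```python
import math
import numpy as np

def snis_estimate(n, y, rng):
    """Self-normalized importance sampling estimate of P(X > 0 | y)."""
    x = rng.standard_normal(n)               # particles drawn from the prior
    log_w = -0.5 * (y - x) ** 2              # Gaussian log-likelihood (unit variance)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return np.sum(w * (x > 0))               # f = indicator of {x > 0}, sup-norm 1

y = 1.0                                      # hypothetical observation
# Posterior is N(1/2, 1/2), so P(X > 0 | y) = Phi(1/sqrt(2)) = (1 + erf(1/2)) / 2.
exact = 0.5 * (1.0 + math.erf(0.5))

rng = np.random.default_rng(3)
scaled_mse = {}
for n in (100, 400, 1600):
    errs = [snis_estimate(n, y, rng) - exact for _ in range(2000)]
    scaled_mse[n] = n * float(np.mean(np.square(errs)))
print(scaled_mse)                            # values should be of similar size
```

This only probes the weighting step at a single time; it is a sanity check of the rate, not a verification of the theorem for general t.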
