Deformable Object Tracking: A Variational Optimization Framework
CMPUT 615
Nilanjan Ray
Tracking Deformable Objects
• Desirable properties of deformable models:
– Adapt to deformations (sometimes drastic deformations, depending on the application)
– Ability to learn the object and the background:
• Ability to separate foreground from background
• Ability to recognize the object from one image frame to the next in an image sequence
Some Existing Deformable Models
• Deformable models:
– Highly deformable
• Examples: snakes (active contours), B-spline snakes, …
• Good deformation ability, but poor recognition (learning) ability
– Not-so-deformable
• Examples: active shape and appearance models, G-snake, …
• Good recognition (learning) ability, but poor deformation ability
So, how about good deformation and good recognition capabilities?
Technical Background: Level Set Function
• A level set function represents a contour or a front geometrically
• Consider a single-valued function φ(x, y) over the image domain; the intersection of the x-y plane and φ represents the contour:

$$\phi(x, y) = \begin{cases} +\sqrt{(x - X(x, y))^2 + (y - Y(x, y))^2}, & \text{if } (x, y) \text{ is inside the object} \\ -\sqrt{(x - X(x, y))^2 + (y - Y(x, y))^2}, & \text{otherwise,} \end{cases}$$

where (X(x, y), Y(x, y)) is the point on the curve that is closest to the point (x, y); i.e., φ is a signed distance function.
• Matlab demo (lev_demo.m)
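The signed distance construction above can also be sketched numerically. A minimal NumPy sketch (brute-force nearest-boundary search; the function name and the square test object are illustrative, while the sign convention follows the slide):

```python
import numpy as np

def signed_distance(mask):
    # phi(x, y) = +distance to the closest contour point inside the object,
    #             -distance outside (the slide's sign convention)
    H, W = mask.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Boundary pixels: object pixels with at least one background 4-neighbor
    pad = np.pad(mask, 1, constant_values=False)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    by, bx = np.nonzero(mask & ~interior)
    # Brute-force distance from every pixel to its closest boundary pixel
    d = np.sqrt((ys[..., None] - by) ** 2 +
                (xs[..., None] - bx) ** 2).min(axis=-1)
    return np.where(mask, d, -d)

mask = np.zeros((32, 32), dtype=bool)
mask[10:22, 10:22] = True              # a square "object"
phi = signed_distance(mask)            # positive inside, negative outside
```

The zero level set of the returned φ coincides with the object boundary, matching the contour representation used in the rest of the slides.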
Technical Background: Non-Parametric Density Estimation
Normalized image intensity histogram:

$$H(i) = \frac{1}{C} \int \exp\!\left(-\frac{(I(x, y) - i)^2}{2\sigma_i^2}\right) dx\, dy$$

I(x, y) is the image intensity at (x, y)
σ_i is the standard deviation of the Gaussian kernel
C is a normalization factor that forces H(i) to integrate to unity
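As a concrete illustration, the kernel histogram discretizes to a sum over pixels for each intensity bin. A NumPy sketch (the function name, bin grid, and kernel width are illustrative; C is chosen so H sums to one):

```python
import numpy as np

def kde_histogram(image, bins, sigma):
    # H(i) = (1/C) * sum_{x,y} exp(-(I(x,y) - i)^2 / (2 sigma^2)),
    # with the constant C chosen so that H sums to unity over the bins
    I = image.ravel()[None, :]                   # shape (1, n_pixels)
    i = bins[:, None]                            # shape (n_bins, 1)
    H = np.exp(-(I - i) ** 2 / (2.0 * sigma ** 2)).sum(axis=1)
    return H / H.sum()

rng = np.random.default_rng(0)
img = rng.normal(0.5, 0.1, size=(32, 32))        # synthetic "image"
bins = np.linspace(0.0, 1.0, 64)
H = kde_histogram(img, bins, sigma=0.05)
```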
Technical Background: Similarity and Dissimilarity Measures for PDFs
Kullback-Leibler (KL) divergence (a dissimilarity measure):

$$KL(P, Q) = \int Q(z) \log\frac{Q(z)}{P(z)}\, dz$$

Bhattacharyya coefficient (a similarity measure):

$$BC(P, Q) = \int \sqrt{Q(z)\, P(z)}\, dz$$

P(z) and Q(z) are the two PDFs being compared
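Both measures are straightforward on discrete PDFs. A sketch (the argument order matches the slide, with Q in front of the log; the small `eps` guard against log(0) is my addition):

```python
import numpy as np

def kl_divergence(P, Q, eps=1e-12):
    # KL(P, Q) = sum_z Q(z) * log(Q(z) / P(z)); eps guards against log(0)
    return float(np.sum(Q * np.log((Q + eps) / (P + eps))))

def bhattacharyya(P, Q):
    # BC(P, Q) = sum_z sqrt(Q(z) * P(z)); equals 1 when the PDFs coincide
    return float(np.sum(np.sqrt(Q * P)))

P = np.array([0.2, 0.3, 0.5])
Q = np.array([0.25, 0.25, 0.5])
```

KL vanishes only when the two densities agree, which is exactly when BC attains its maximum of one.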
Proposed Method: Tracking a Deformable Object
• Deformable object model (due to Leventon [1]):
– From the first frame, learn the joint PDF of the level set function and the image intensity (image feature)
• Tracking:
– From the second frame onward, search for a similar joint PDF
[1] M. Leventon, Statistical Models for Medical Image Analysis, Ph.D. Thesis, MIT, 2000.
Deformable Object Model
• Joint probability density estimation with Gaussian kernels:

$$Q(l, i) = \frac{1}{C} \int \exp\!\left(-\frac{(\phi(x, y) - l)^2}{2\sigma_l^2}\right) \exp\!\left(-\frac{(J(x, y) - i)^2}{2\sigma_i^2}\right) dx\, dy$$

l is the level set function value
i is the image intensity
J(x, y) is the image intensity at point (x, y) on the first image frame
φ(x, y) is the value of the level set function at (x, y) on the first image frame
C is a normalization factor
We learn Q on the first video frame, given the object contour (represented by the level set function)
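The joint kernel estimate discretizes to two Gaussian kernel matrices and a matrix product over pixels. A NumPy sketch (the function name, bin grids, and synthetic inputs are illustrative):

```python
import numpy as np

def joint_density(phi, J, l_bins, i_bins, sigma_l, sigma_i):
    # Q(l, i) = (1/C) * sum_{x,y} exp(-(phi(x,y) - l)^2 / (2 sigma_l^2))
    #                           * exp(-(J(x,y) - i)^2 / (2 sigma_i^2))
    p = phi.ravel()[None, :]                     # (1, n_pixels)
    j = J.ravel()[None, :]
    Kl = np.exp(-(l_bins[:, None] - p) ** 2 / (2 * sigma_l ** 2))  # (n_l, n_pix)
    Ki = np.exp(-(i_bins[:, None] - j) ** 2 / (2 * sigma_i ** 2))  # (n_i, n_pix)
    Q = Kl @ Ki.T                                # (n_l, n_i): sum over pixels
    return Q / Q.sum()                           # C forces Q to sum to unity

rng = np.random.default_rng(1)
phi = rng.normal(size=(16, 16))                  # level set on the first frame
J = rng.random((16, 16))                         # first-frame intensities
l_bins = np.linspace(-3.0, 3.0, 24)
i_bins = np.linspace(0.0, 1.0, 16)
Q = joint_density(phi, J, l_bins, i_bins, sigma_l=0.5, sigma_i=0.1)
```

The same routine computes P on later frames, with the current level set and the new frame's intensities in place of φ and J.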
Proposed Object Tracking
• On the second (or a subsequent) frame, compute the density:

$$P(l, i) = \frac{1}{C} \int \exp\!\left(-\frac{(\phi(x, y) - l)^2}{2\sigma_l^2}\right) \exp\!\left(-\frac{(I(x, y) - i)^2}{2\sigma_i^2}\right) dx\, dy$$

• Match the densities P and Q by the KL divergence:

$$KL = \iint Q(l, i) \log\frac{Q(l, i)}{P(l, i)}\, dl\, di$$

• Minimize the KL divergence by varying the level set function φ(x, y); note that here only P is a function of φ(x, y)
I(x, y) is the image intensity at (x, y) on the second/subsequent frame
φ(x, y) is the level set function at (x, y) on the second/subsequent frame
Minimizing KL-divergence
• To minimize the KL divergence we use the calculus of variations
• Applying the calculus of variations, the update rule (gradient descent) for the level set function becomes:

$$\phi_{t+1}(x, y) = \phi_t(x, y) - (\Delta t) \iint \frac{Q(l, i)}{P(l, i)} \, \frac{\phi_t(x, y) - l}{\sigma_l^2} \exp\!\left(-\frac{(\phi_t(x, y) - l)^2}{2\sigma_l^2}\right) \exp\!\left(-\frac{(I(x, y) - i)^2}{2\sigma_i^2}\right) dl\, di$$

t: iteration number
Δt: time step size
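A direct discretization of this update replaces the double integral with a sum over the (l, i) bins. A NumPy sketch (the minus sign implements descent as written above; the function name, bin grids, and the small guard added to P are mine):

```python
import numpy as np

def explicit_update(phi, I, Q, P, l_bins, i_bins, sigma_l, sigma_i, dt):
    # One explicit gradient-descent step:
    # phi <- phi - dt * sum_{l,i} (Q/P) * (phi - l)/sigma_l^2
    #        * exp(-(phi - l)^2/(2 s_l^2)) * exp(-(I - i)^2/(2 s_i^2))
    ratio = Q / (P + 1e-12)                      # (n_l, n_i)
    dphi = phi[..., None] - l_bins               # (H, W, n_l)
    Kl = (dphi / sigma_l ** 2) * np.exp(-dphi ** 2 / (2 * sigma_l ** 2))
    dI = I[..., None] - i_bins                   # (H, W, n_i)
    Ki = np.exp(-dI ** 2 / (2 * sigma_i ** 2))
    grad = np.einsum('xyl,li,xyi->xy', Kl, ratio, Ki)
    return phi - dt * grad

rng = np.random.default_rng(2)
phi = rng.normal(size=(8, 8))                    # current level set
I = rng.random((8, 8))                           # current frame
l_bins = np.linspace(-3.0, 3.0, 16)
i_bins = np.linspace(0.0, 1.0, 12)
Q = rng.random((16, 12)); Q /= Q.sum()           # learned model density
P = rng.random((16, 12)); P /= P.sum()           # current-frame density
phi_new = explicit_update(phi, I, Q, P, l_bins, i_bins, 0.5, 0.1, dt=0.05)
```

In a full tracker, P would be recomputed from the updated φ before the next iteration.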
Minimizing KL-divergence: Implementation
• There is a compact way of expressing the update rule as a convolution:

$$\phi_{t+1}(x, y) = \phi_t(x, y) - (\Delta t) \left(\frac{Q}{P} * g_1\right)\!(\phi_t(x, y), I(x, y))$$

where g_1 is the convolution kernel

$$g_1(l, i) = \frac{l}{\sigma_l^2} \exp\!\left(-\frac{l^2}{2\sigma_l^2}\right) \exp\!\left(-\frac{i^2}{2\sigma_i^2}\right)$$

and Q/P is the function defined simply as

$$\left(\frac{Q}{P}\right)\!(l, i) = \frac{Q(l, i)}{P(l, i)}$$
Minimizing KL-divergence: A Stable Implementation
• The previous implementation is called an explicit scheme; it is unstable for large time steps, and with a small time step convergence is extremely slow
• One remedy is a semi-implicit numerical scheme:

$$\phi_{t+1}(x, y) = \frac{\phi_t(x, y) + (\Delta t) \left(\dfrac{lQ}{P} * g\right)\!(\phi_t(x, y), I(x, y))}{1 + (\Delta t) \left(\dfrac{Q}{P} * g\right)\!(\phi_t(x, y), I(x, y))}$$

where g is the convolution kernel

$$g(l, i) = \exp\!\left(-\frac{l^2}{2\sigma_l^2}\right) \exp\!\left(-\frac{i^2}{2\sigma_i^2}\right)$$

and lQ/P is the function defined simply as

$$\left(\frac{lQ}{P}\right)\!(l, i) = \frac{l\, Q(l, i)}{P(l, i)}$$

In this numerical scheme Δt can be large and the solution still converges, so very quick convergence is achieved.
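The semi-implicit step discretizes the same way as the explicit one; since the kernels and the ratio Q/P are non-negative, the denominator 1 + Δt(Q/P ∗ g) is at least one, so the division cannot blow up even for large Δt. A NumPy sketch under the same illustrative naming as before:

```python
import numpy as np

def semi_implicit_update(phi, I, Q, P, l_bins, i_bins, sigma_l, sigma_i, dt):
    # phi <- (phi + dt * ((l*Q/P) conv g)) / (1 + dt * ((Q/P) conv g)),
    # with g(l, i) = exp(-l^2/(2 s_l^2)) * exp(-i^2/(2 s_i^2)) and both
    # terms evaluated at (phi(x,y), I(x,y)) on the (l, i) grid
    ratio = Q / (P + 1e-12)                                            # (n_l, n_i)
    Kl = np.exp(-(phi[..., None] - l_bins) ** 2 / (2 * sigma_l ** 2))  # (H, W, n_l)
    Ki = np.exp(-(I[..., None] - i_bins) ** 2 / (2 * sigma_i ** 2))    # (H, W, n_i)
    num = np.einsum('xyl,li,xyi->xy', Kl, l_bins[:, None] * ratio, Ki)
    den = np.einsum('xyl,li,xyi->xy', Kl, ratio, Ki)
    return (phi + dt * num) / (1.0 + dt * den)

rng = np.random.default_rng(3)
phi = rng.normal(size=(8, 8))
I = rng.random((8, 8))
l_bins = np.linspace(-3.0, 3.0, 16)
i_bins = np.linspace(0.0, 1.0, 12)
Q = rng.random((16, 12)); Q /= Q.sum()
P = rng.random((16, 12)); P /= P.sum()
phi_new = semi_implicit_update(phi, I, Q, P, l_bins, i_bins, 0.5, 0.1, dt=100.0)
```

Even with Δt = 100, as above, the step remains finite, which is what allows the fast convergence claimed on this slide.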
Results: Tracking Cardiac Motion
A few cine MRI frames and delineated boundaries on them
Show videos
Numerical Results and Comparison
[Plots: Segmentation Score and Pratt FOM vs. frame number (frames 10–90), comparing the GVF snake method and the proposed method]
Sequence with slow heart motion
[Plots: Segmentation Score and Pratt FOM vs. frame number (frames 20–180), comparing the GVF snake method and the proposed method]
Sequence with rapid heart motion
Comparison of mean performance measures

                      Pratt's FOM              Segmentation Score
                      Slow Seq.   Rapid Seq.   Slow Seq.   Rapid Seq.
GVF Snake Method      0.51        0.62         0.51        0.44
Proposed Method       0.88        0.77         0.74        0.79