Motion Segmentation over Image
Sequences Using Multiway Cuts and
Affine Transformations
Braga Natarajan
Organization
Motion Segmentation
Motion models,
affine transformation
Energy minimization, graph cuts,
multiway cuts
Combining multiway cuts and
affine transformations
Algorithms and results
Motion segmentation




Image segmentation – by color, texture,
shape, and motion
Motion – very important cue
Divide the image into regions, each exhibiting its own
coherent motion, distinct from that of its neighbors
Ideal motion
segmentation
Why motion segmentation?






Number of applications
Robotics
Video coding/compression
Video indexing/retrieval
Object tracking, surveillance
Intermediate image-processing task: its output is
passed on to higher-level computer vision problems
Motion representation


Pixel motion represented by a 2D vector
Either dense (optical flow) or sparse (features)
Models:
translation: 2 parameters
affine: 6 parameters
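The two models can be compared with a small sketch. The parameter ordering (a11, a12, a21, a22, dx, dy) and the function names are assumptions made for this illustration; the slides only give the parameter counts.

```python
def affine_displacement(x, y, params):
    """Return the 2D motion vector (u, v) at pixel (x, y).

    params = (a11, a12, a21, a22, dx, dy) is a hypothetical ordering
    chosen for this sketch. The pixel maps to A * (x, y) + d, so the
    motion vector is that new position minus the old one.
    """
    a11, a12, a21, a22, dx, dy = params
    new_x = a11 * x + a12 * y + dx
    new_y = a21 * x + a22 * y + dy
    return new_x - x, new_y - y


def translation_params(dx, dy):
    """A translation is the affine model with A fixed to the identity,
    which leaves only the 2 parameters (dx, dy) free."""
    return (1.0, 0.0, 0.0, 1.0, dx, dy)


# Translation: every pixel receives the same motion vector.
print(affine_displacement(10, 20, translation_params(3.0, -1.0)))
print(affine_displacement(50, 70, translation_params(3.0, -1.0)))

# Affine with a slight zoom: the motion vector now depends on position.
zoom = (1.1, 0.0, 0.0, 1.1, 0.0, 0.0)
print(affine_displacement(10, 20, zoom))
print(affine_displacement(50, 70, zoom))
```

With the identity matrix the vector is constant over the image; the four extra affine parameters let it vary linearly with pixel position, which is what allows one label to describe a slanted or rotating surface.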
Lucas-Kanade affine estimation

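Minimize the nonlinear residue:
$\epsilon = \sum_{\mathbf{x} \in W} \left[ J(A\mathbf{x} + \mathbf{d}) - I(\mathbf{x}) \right]^2$
where $\epsilon$ is the residue, $A$ a 2x2 matrix, $\mathbf{d}$ a 2x1 vector, $J$ the next image, $I$ the current image, and $\mathbf{x}$ a pixel location in the window $W$

Iterative linear solution (Newton’s method):
$T\mathbf{z} = \mathbf{e}$
where $T$ is the 6x6 gradient matrix, $\mathbf{z}$ the 6x1 vector of unknown parameters ($A$, $\mathbf{d}$), and $\mathbf{e}$ the 6x1 error vector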
[Lucas & Kanade, 1981; Shi & Tomasi, 1994]
Energy minimization



Motion segmentation as
energy minimization
Challenge: thousands of dimensions
(one unknown label per pixel)!
Solution: Graph cut
methods [Boykov,
Veksler, and Zabih 1999]


Graph cuts are accurate and fast
Energy = data penalties + smoothness penalties
Image Correspondence




Motion segmentation comes
under the general category
of image correspondence
Goal of image
correspondence: assign
labels to every pixel in the
image
Energy functions can be
devised once labels are
defined and listed
What do labels mean?
Figure panels: stereo correspondence, motion
Binary cut





Maximum flow-minimum cut, Ford-Fulkerson 1956
Graph constructed with one node per pixel
Source and sink terminals; binary variables 0 and 1
t-link weight: data penalty; n-link weight: smoothness penalty
Minimum s-t cut: pixels get label 0 or 1 depending on which links are cut
Figure labels: t-link, n-link, cut n-link
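The construction above can be sketched on a single row of pixels. This is a minimal illustration, not the implementation used in this work: it uses a plain breadth-first augmenting-path max-flow (Edmonds-Karp) on an adjacency matrix, and the convention that source-side pixels get label 0 is an assumption of the sketch.

```python
from collections import deque


def min_cut_labels(cost0, cost1, smooth):
    """Label a 1D row of pixels 0 or 1 with a minimum s-t cut.

    cost0[i], cost1[i] -- data penalty for giving pixel i label 0 / label 1
    smooth             -- penalty when pixels i and i+1 get different labels
    Assumed convention: pixels left on the source side get label 0.
    """
    n = len(cost0)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = cost1[i]  # t-link, cut when pixel i takes label 1
        cap[i][t] = cost0[i]  # t-link, cut when pixel i takes label 0
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = smooth  # n-link between neighbors

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no path left: the flow is maximal
        # Find the bottleneck capacity, then push that much flow.
        flow, v = float("inf"), t
        while v != s:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= flow
            cap[v][parent[v]] += flow
            v = parent[v]

    # After the last search, parent marks the nodes still reachable from s.
    return [0 if parent[i] != -1 else 1 for i in range(n)]


# Left half fits label 0, right half fits label 1.
print(min_cut_labels([1, 1, 9, 9], [9, 9, 1, 1], smooth=2))
# A lone pixel that weakly prefers label 1 is smoothed away ...
print(min_cut_labels([0, 3, 0], [5, 0, 5], smooth=2))
# ... unless the smoothness penalty is small enough.
print(min_cut_labels([0, 3, 0], [5, 0, 5], smooth=1))
```

The cost of the cut equals the energy of the resulting labeling, so the minimum cut gives the minimum-energy binary labeling. A dense matrix and Edmonds-Karp are only practical for toy sizes; real images call for a dedicated max-flow implementation.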
Multiway cuts



More than 2 labels; typical for motion and stereo – multiway cuts
Alpha-beta swap: repeated binary cuts, with a binary graph formed for
every pair of labels
Alpha expansion: repeated binary cuts, with a binary graph formed between
one particular label and the existing labels, for each label in turn
Multiway cuts
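The outer loop of alpha expansion can be sketched on a 1D chain of pixels. One substitution is made to keep the example short: each binary "keep the current label or switch to alpha" subproblem is solved exactly by dynamic programming along the chain rather than by a graph cut, which is only possible because the example is one-dimensional. The Potts smoothness term and all names are assumptions of the sketch.

```python
def potts_energy(labels, data_cost, smooth):
    """Data penalties plus a fixed penalty per pair of differing neighbors."""
    e = sum(data_cost[i][l] for i, l in enumerate(labels))
    return e + sum(smooth for a, b in zip(labels, labels[1:]) if a != b)


def expansion_move(labels, alpha, data_cost, smooth):
    """Best labeling in which every pixel keeps its label or takes alpha.

    Stand-in for the binary graph cut: exact dynamic programming on a chain.
    """
    n = len(labels)
    choices = [(labels[i], alpha) for i in range(n)]  # the binary decision
    cost = [data_cost[0][c] for c in choices[0]]
    back = []
    for i in range(1, n):
        new_cost, ptr = [], []
        for c in choices[i]:
            cands = [cost[k] + (smooth if p != c else 0)
                     for k, p in enumerate(choices[i - 1])]
            k = min(range(2), key=cands.__getitem__)
            new_cost.append(cands[k] + data_cost[i][c])
            ptr.append(k)
        cost = new_cost
        back.append(ptr)
    # Trace the cheapest path back from the last pixel to the first.
    k = min(range(2), key=cost.__getitem__)
    out = [choices[-1][k]]
    for i in range(n - 1, 0, -1):
        k = back[i - 1][k]
        out.append(choices[i - 1][k])
    return out[::-1]


def alpha_expansion(data_cost, smooth, num_labels):
    """Cycle over the labels until no expansion move lowers the energy."""
    labels = [0] * len(data_cost)
    improved = True
    while improved:
        improved = False
        for alpha in range(num_labels):
            cand = expansion_move(labels, alpha, data_cost, smooth)
            if (potts_energy(cand, data_cost, smooth)
                    < potts_energy(labels, data_cost, smooth)):
                labels, improved = cand, True
    return labels


# Six pixels, three labels: pixels 0-1 fit label 0, 2-3 label 1, 4-5 label 2.
prefer = [0, 0, 1, 1, 2, 2]
costs = [[0 if l == p else 5 for l in range(3)] for p in prefer]
print(alpha_expansion(costs, smooth=1, num_labels=3))
```

An alpha-beta swap would follow the same outer pattern, except that each move only lets pixels currently labeled alpha or beta exchange those two labels.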
Parent algorithm




Multiway cut for stereo and
motion with slanted surfaces,
Birchfield and Tomasi 1999
Combines multiway cuts and
affine transformation
Works iteratively by
progressive refinement of
displacement functions of
labels
Algorithm re-implemented;
the proposed algorithms are
extensions of this paper
Motion segmentation over image
sequences


The parent algorithm, when applied to a sequence of images,
does not produce consistent results
It is also computationally inefficient to search exhaustively over all
translational displacement functions for every frame
frame1, 5 segments
frame2, 6 segments
Changes to parent algorithm
Initialize from the result of the previous frame pair
Control the number of loop iterations
Do an affine merge at the end
Algorithm




frame t
frame t+1
parent
algorithm,
affine merge
Run the parent algorithm on the first
frame to get the correct number
of segments; the parameters are
then fixed
Set the number of loop iterations for
the affine update of displacement
functions to a constant
The initial motion segment image
for the next frame is predicted by
affine warping of the current
motion segments and re-estimation of displacement
functions
A final iterative affine merge
step merges neighboring
regions
predict
label image
initialize parent
algorithm for
next frame
frame t+1
frame t+2
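The prediction step can be sketched as a forward warp of the label image. This is an assumed, simplified reading of the slide: each pixel is moved by the affine model of its own segment and rounded to the nearest pixel, collisions are resolved by scan order, and pixels that nothing lands on are marked unassigned for the next run of the parent algorithm to fill in. The parameter ordering and names are hypothetical.

```python
def predict_labels(labels, models, unassigned=-1):
    """Predict the label image of the next frame by affine warping.

    labels[y][x] -- current segment label of pixel (x, y)
    models[l]    -- assumed 6-tuple (a11, a12, a21, a22, dx, dy) for label l
    Pixels that receive no label keep the value `unassigned`.
    """
    rows, cols = len(labels), len(labels[0])
    predicted = [[unassigned] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            a11, a12, a21, a22, dx, dy = models[labels[y][x]]
            # Move the pixel with its segment's motion, snap to the grid.
            nx = round(a11 * x + a12 * y + dx)
            ny = round(a21 * x + a22 * y + dy)
            if 0 <= nx < cols and 0 <= ny < rows:
                predicted[ny][nx] = labels[y][x]  # later pixels overwrite
    return predicted


# A static background (label 0) and an object (label 1) moving one pixel left.
models = {0: (1.0, 0.0, 0.0, 1.0, 0.0, 0.0),
          1: (1.0, 0.0, 0.0, 1.0, -1.0, 0.0)}
print(predict_labels([[0, 0, 0, 1, 1]], models))
```

The unassigned pixel left behind by the moving object is exactly the kind of region the following run of the parent algorithm has to relabel.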

Affine merging of regions: if the
affine parameters of neighboring
regions are within a threshold,
merge them; if there are still too
many segments, relax the
threshold and repeat

This step is similar to the
over-segmentation step but does
not involve energy
computation, hence it is
threshold dependent
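The merge loop can be sketched as follows. Several details are assumptions made for the example, since the slide does not specify them: the distance between two affine models is taken as the largest absolute parameter difference, the threshold is relaxed by a constant factor, and the surviving region simply keeps its own model instead of being re-estimated.

```python
def affine_distance(p, q):
    """Largest absolute difference between two 6-parameter affine models
    (an assumed similarity measure)."""
    return max(abs(a - b) for a, b in zip(p, q))


def affine_merge(models, adjacent, target, threshold, relax=2.0):
    """Merge neighboring regions with similar affine models.

    models   -- dict: region id -> 6-tuple of affine parameters
    adjacent -- iterable of (a, b) pairs of neighboring region ids
    target   -- number of segments to reduce to
    Returns a dict mapping every original region to the region it ends up in.
    """
    if threshold <= 0 or relax <= 1:
        raise ValueError("threshold must be > 0 and relax must be > 1")
    models = dict(models)
    merged_into = {r: r for r in models}
    pairs = {tuple(sorted(p)) for p in adjacent}
    while len(models) > target and pairs:
        dist, a, b = min((affine_distance(models[a], models[b]), a, b)
                         for a, b in pairs)
        if dist > threshold:
            threshold *= relax  # still too many segments: relax and repeat
            continue
        # Merge region b into region a (a keeps its model in this sketch).
        del models[b]
        for r, owner in merged_into.items():
            if owner == b:
                merged_into[r] = a
        pairs = {tuple(sorted((a if r == b else r, a if s == b else s)))
                 for r, s in pairs}
        pairs = {(r, s) for r, s in pairs if r != s}
    return merged_into


models = {1: (1, 0, 0, 1, 2.0, 0),
          2: (1, 0, 0, 1, 2.1, 0),
          3: (1, 0, 0, 1, -3.0, 0)}
print(affine_merge(models, [(1, 2), (2, 3)], target=2, threshold=0.5))
```

No energy is evaluated anywhere in the loop, which is why the outcome depends entirely on the threshold and on how it is relaxed.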




Results for frames 2, 10, 19
and 25 are shown
The number of motion segments
is maintained
The algorithm took 71.18
seconds to run on 27 frames;
the parent algorithm took 97.04
seconds
Boundaries between
segments are not crisp due
to occlusions and lack of
texture




Taxi sequence: the algorithm
works well for frames 1
to 36. Frames 5 and 18 are
shown.
Failure for frames 36 to 40 due
to the small motion of the taxi
and two components for
the right vehicle
Frame 40: failure of
affine merge shown
Right vehicle
segmentation is poor due
to occlusion by a tree
Hard constraint points for stereo



Another extension, to stereo correspondence
The cost functional is not able to preserve small objects or long,
thin objects in depth maps
Multiway cuts smooth out small regions
Normalized correlation


Normalized correlation is performed; points with an unambiguous disparity are chosen
and initialized as hard constraint points
These points initialize multiway cuts; the number of iterations of affine updating
is controlled
Figure labels: sum of squared differences inside window, scan line, left image, right image, ambiguous minimum, clear minimum
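The idea of keeping only unambiguous matches can be sketched on one scan line. The sketch uses the sum of squared differences named in the figure rather than normalized correlation, and the acceptance rule (the best cost must be below a fixed fraction of the second-best cost) is an assumption; the slides do not say how "unambiguous" is decided.

```python
def ssd(left, right, x, d, half):
    """Sum of squared differences between the window centered at x in the
    left scan line and the window centered at x - d in the right one."""
    return sum((left[x + k] - right[x - d + k]) ** 2
               for k in range(-half, half + 1))


def unambiguous_disparity(left, right, x, max_disp, half=1, ratio=0.5):
    """Return the disparity at x if its SSD minimum is clear, else None.

    Assumes both scan lines have the same length and that the window
    around x fits inside the left scan line. `ratio` is a hypothetical
    threshold: the best cost must be below ratio * second-best cost.
    """
    costs = []
    for d in range(max_disp + 1):
        if x - d - half < 0:
            break  # the window would leave the right scan line
        costs.append(ssd(left, right, x, d, half))
    if len(costs) < 2:
        return None
    best = min(range(len(costs)), key=costs.__getitem__)
    second = min(c for d, c in enumerate(costs) if d != best)
    return best if costs[best] < ratio * second else None


right = [0, 0, 10, 50, 10, 0, 0, 0, 0, 0]
left = [0, 0, 0, 0, 10, 50, 10, 0, 0, 0]  # same pattern, shifted by 2 pixels

print(unambiguous_disparity(left, right, x=5, max_disp=3))  # textured point
print(unambiguous_disparity(left, right, x=8, max_disp=3))  # flat region
```

Only points of the first kind would be passed to multiway cuts as hard constraints; points in flat regions are left for the energy minimization to decide.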
Occlusion detection



Errors in regions between
motion layers arise from the
movement of the foreground
over the background
Selective occlusion
detection is done using the
estimated affine parameters
Assumption: multiway cuts
label occluded areas with
the label of the foreground


Compute residues of region a and region b based on
affine parameters of both region 1 and region 2 and
pick the worst.
The occluded region has the worst residue because it
has no matching region in the next frame.
Conclusions





Studied, analyzed and implemented multiway
cuts and affine transformation techniques
All implementation in C++ from scratch, using
Kolmogorov and Blepo
Two extensions to the parent algorithm –
motion segmentation over image sequences
and hard constraint points for stereo
Simple occlusion detection presented
Results are reasonable
Future work



Spatiotemporal multiway cuts for segmenting
objects in the video volume
Redesigning cost functionals to improve
segmentation results
Integrating occlusion detection with multiway
cuts to get cleaner borders
Thank You