Mean-Shift based Object Tracking Algorithm using SURF Features

Recent Advances in Circuits, Communications and Signal Processing
SOURAV GARG
Innovation Lab
Tata Consultancy Services
Noida, Uttar Pradesh, India
[email protected]
SWAGAT KUMAR
Innovation Lab
Tata Consultancy Services
Noida, Uttar Pradesh, India
[email protected]
Abstract: Mean-shift tracking is primarily used to carry out a localized search on an image frame using colour histograms. Applying mean-shift tracking directly to SURF features is difficult because a sufficient number of key points is rarely available for a given object. This paper proposes a method called re-projection to overcome this limitation, so that the mean-shift algorithm can be used directly with SURF descriptors to track an object in a video recorded from a non-stationary camera. Since the SURF features are computed only for the object being tracked, the computational requirement is small enough to allow real-time tracking of the object. The efficacy of the approach is demonstrated through various simulation results.
Key–Words: SURF, Mean-shift, Object Tracking, re-projection
1 Introduction

Object tracking in a video sequence is an important problem in computer vision with applications in areas like video surveillance, vehicle navigation, perceptual user interfaces and augmented reality [1]. It also forms an integral part of vision-based robot tracking technologies [2].

Mean-shift tracking is a local search algorithm based on colour histogram matching [3]. The method is simple and easy to implement, which makes it very popular among colour-based tracking methods [4]. However, colour-based tracking methods are sensitive to variations in illumination and require a background whose colours do not match those of the object [5]. This has prompted researchers to look for more distinctive features like SIFT [6] and SURF [7], which have been shown to be robust to photometric and geometric distortions. These robust local point features are being increasingly used for visual object tracking applications [8] [9]. Since SURF is computationally more efficient than SIFT, we focus on object tracking methods that make use of SURF features.

Many works reported in the literature use SURF features for visual object tracking. These works may be broadly classified into two groups: one uses SURF features to improve the robustness of colour-based object tracking algorithms, as in [10] [4], and the other uses SURF features directly for object tracking, as in [11] [12] [13]. The latter approaches are becoming more popular with the appearance of algorithms that can achieve real-time (near-frame-rate) performance [14] even with these computationally heavy features.

In this paper, we use the mean-shift algorithm directly with SURF features to track objects in a video sequence. This method belongs to the category of interest point based tracking methods [11], which use a method of object recognition based on SURF correspondence. Unlike approaches that use optical flow or a Kalman filter for estimation or prediction [8] [13], this approach does not require a motion model of the object or of its features. However, such motion models may become indispensable if one needs to track partially or fully occluded objects.

Applying the mean-shift algorithm to tracking requires a histogram of the object template, which is searched for in subsequent frames. The template histogram is formed by grouping the SURF features of the object template into a fixed number of clusters. This is similar to the object recognition method used in a bag-of-words approach [15] [16]. This histogram is used by the mean-shift algorithm to localize the object in the next frame. The approach is not straightforward, and one has to address various issues such as the availability of very few descriptors for the object model, depletion of matching features over subsequent frames, presence of outliers and scaling of the tracking window. These problems have apparently discouraged researchers from applying the mean-shift algorithm directly to SURF descriptors.

Most of these problems arise because the histogram, created with the limited number of key points available for a given object, may not represent the true pdf of the object model. We overcome this problem by using a method called re-projection, in which the histogram of the object template is updated on-line for every frame. Re-projection aims to enrich the source histogram by making a homographic projection of the matching points from the target window onto the source window (object model) at the end of each mean-shift convergence. This increases the number of key points in the source window and thus improves the pdf of the object model. This, in turn, overcomes several of the other problems mentioned above.

The proposed method has several advantages. First, the tracking can be carried out in real time, as the SURF features needed for tracking are computed only over the region containing the target. Second, tracking can be carried out in a video recorded from a non-stationary camera where the background is not static. This is because our approach does not make use of any foreground/background classification method as used by many other authors [14] [10]. However, we do not consider partial or full occlusion [13] of the target in this paper; this forms the future scope of the work.

The main contribution of this paper is a method that enables the mean-shift algorithm to track an object using SURF features directly, without using any additional information about the object. According to our literature survey, such a work has not been reported so far, and hence we consider this to be a novel contribution to this field.

The rest of this paper is organized as follows. The problem definition is provided in the next section. The mean-shift algorithm for implementing object tracking is provided in Section 3. The simulation and experimental results are provided in Section 4, followed by the conclusion in Section 5.

2 Problem Definition

Consider a set of frames $I_i$, $i = 0, 1, 2, \ldots, N$ of a video sequence, where an object identified by the user in the first frame is to be tracked over all the frames. The user identifies the object by selecting a rectangular region on the first frame. Let this rectangular region be denoted by $W_0$, corresponding to the first image $I_0$. Let $V(I, W) = \{(x_1, v_1), (x_2, v_2), \ldots, (x_n, v_n)\}$ be the set of SURF key points of an image $I$ within the window $W$, where $x_i$ is the 2-dimensional key point location of the SURF descriptor $v_i$. We use $X$ to denote the set of key point locations in the source window (the object model) and $Y$ to denote the corresponding set in the target window.

The tracking window $W$ is represented by $W = (c, w, h)$, where $c = (c_x, c_y)$ is the centre of the window with width $w$ and height $h$. Given $I_0$, $W_0$ and $V_0(I_0, W_0)$, the task is to compute the tracking window $W_i(c_i, w_i, h_i)$ for all image frames $i = 1, 2, \ldots, N$.

3 The Method

In this paper, we use the mean-shift algorithm [3] directly on SURF features to track the object in subsequent frames. This is different from other mean-shift based approaches such as [4] [9], where the mean-shift algorithm is used with colour histograms and SURF features are used only to improve its performance through point correspondences. Mean-shift tracking requires an object histogram model, which is used to search for the object in the next frame by histogram matching.

The pseudo-code for the tracking algorithm is provided in Table 1. The tracking method consists of the following four steps:

1. Creating the object histogram using SURF descriptors. This is done only for the first frame, as described in lines 2-6 of the pseudo-code.
2. Searching for the object in the new frame through histogram matching and localizing the target window using the mean-shift algorithm. The mean-shift iteration is carried out as shown in lines 10-17 of the pseudo-code.
3. Scaling the target window to reflect the correct size of the tracked object. The computation of the scaling coefficient $\alpha$ and the positioning of the target window are done in lines 18 and 19 respectively.
4. Re-projecting the matching key points of the target window (obtained after mean-shift convergence) onto the source window using homography. This is done in lines 20-23 of the pseudo-code. The re-projected point locations are represented by $X'$. The histogram is updated only if the locations of the projected points are close to the original points $X$.

We use the following notation in the remainder of this section. The source window ($W_s$) refers to the window on the first frame which contains the object to be tracked. The target window ($W_t$) refers to the window on the destination frame where the target is to be searched for or is found.
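The window $W = (c, w, h)$ and the key point set $V(I, W)$ of Section 2 map naturally onto a small data structure. The sketch below is only illustrative: the names `Window`, `KeyPoint` and `keypoints_in_window` are hypothetical and do not come from the paper, and descriptors are plain Python lists rather than the output of a real SURF detector.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Window:
    """Tracking window W = (c, w, h): centre (cx, cy), width and height."""
    cx: float
    cy: float
    w: float
    h: float

    def contains(self, x: float, y: float) -> bool:
        # A point is inside if it lies within half the width/height of the centre.
        return abs(x - self.cx) <= self.w / 2 and abs(y - self.cy) <= self.h / 2


@dataclass
class KeyPoint:
    """One element (x_i, v_i) of V(I, W): a 2-D location and its descriptor."""
    location: Tuple[float, float]
    descriptor: List[float]


def keypoints_in_window(keypoints: List[KeyPoint], window: Window) -> List[KeyPoint]:
    """Return V(I, W): the key points of the image that fall inside the window."""
    return [kp for kp in keypoints if window.contains(*kp.location)]
```

Restricting detection to the window in this way is what keeps the per-frame cost low, since only key points inside the current tracking window are ever considered.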
3.1 Creating Histogram with SURF features
Object Model/Source Window: The SURF descriptors are computed for the source window $W_s = W_0$. The 64-dimensional SURF feature vectors are then grouped into $M$ clusters using the k-means algorithm. The cluster centres become the bins of the object (source) histogram $H_s$. Each SURF descriptor is assigned to the centre nearest to it, and the count of the corresponding bin is incremented. This is similar to the histogram creation process in the bag-of-words approach used for object recognition [15] [16]. The clustering of SURF descriptors is done only once, for the source window. Hence clustering adds no computational burden during on-line tracking.
Target Window: For a target window on the destination image frame, all the SURF descriptors lying within this rectangular region are computed. The target histogram $H_t$ is then created using the clusters of the source window as bins: a descriptor is assigned to the source cluster whose centre is at the minimum Euclidean distance from it. Note that no clustering is done for the target window.
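The bin assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the cluster centres are assumed to have been obtained beforehand (for example by k-means on the source descriptors), and short toy vectors stand in for 64-dimensional SURF descriptors.

```python
import math


def nearest_centre(descriptor, centres):
    """Index of the cluster centre with minimum Euclidean distance to descriptor."""
    return min(range(len(centres)), key=lambda m: math.dist(descriptor, centres[m]))


def build_histogram(descriptors, centres):
    """Normalised histogram with one bin per source cluster centre."""
    counts = [0] * len(centres)
    for d in descriptors:
        counts[nearest_centre(d, centres)] += 1
    total = sum(counts)
    # An empty window yields an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(centres)


# Toy example with two source centres and four target descriptors.
centres = [(0.0, 0.0), (10.0, 10.0)]
target_descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (11.0, 10.0)]
H_t = build_histogram(target_descriptors, centres)
```

The same `build_histogram` call serves for both $H_s$ and $H_t$, because the target window reuses the source centres and is never clustered on its own.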
Since the mean-shift algorithm works on histogram matching, outliers (features corresponding to background objects) pollute the target histogram, as shown in Figure 1. Figure 1(a) is the source window, or object model, which is to be tracked in the destination frame. Figures 1(b) and 1(c) are two possible target windows one may come across during the mean-shift search. Target window 2 shows a case where a part of the background is selected as well. The histograms for these three windows are shown in Figure 1(d). Points having the same colour within each window are SURF key points which belong to the same cluster. There are 5 clusters, representing the 5 bins of the histogram. It can be seen that the histograms of the source window and target window 1 are similar, while that of target window 2 is significantly different. This difference between the histograms of the source window and target window 2 eventually causes the mean-shift algorithm to move the target window in a direction where the dissimilarity decreases.
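To make the effect of outliers concrete, the sketch below compares normalised histograms with the Bhattacharyya coefficient $\rho(p, q) = \sum_u \sqrt{p_u q_u}$, the similarity measure used by the mean-shift tracker of [3]. The three 5-bin histograms are made-up values chosen for illustration; they are not read off Figure 1.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalised histograms (1.0 = identical)."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


source = [0.30, 0.25, 0.20, 0.15, 0.10]    # hypothetical object model
target_1 = [0.28, 0.27, 0.20, 0.15, 0.10]  # window tightly around the object
target_2 = [0.10, 0.10, 0.15, 0.15, 0.50]  # window polluted by background outliers

rho_1 = bhattacharyya(source, target_1)
rho_2 = bhattacharyya(source, target_2)
print(f"target window 1: {rho_1:.3f}, target window 2: {rho_2:.3f}")
```

A window that includes background key points scores lower than one that fits the object, which is the signal the mean-shift search follows.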
[Figure 1: panels (a) Source Window, (b) Target Window 1, (c) Target Window 2 and (d) Histogram for three windows, plotting normalized frequency against bin index for the source histogram, target window 1 and target window 2.]

Figure 1: Effect of outliers on histogram matching. The first row shows the locations of descriptors in the three windows. (a) is the object model which is to be tracked in the next frame. (b) is a target window which is similar to the source window. (c) is a target window which contains a part of the background and hence contributes outliers. Points of the same colour belong to the same cluster. (d) shows the SURF histograms for all three windows. One can see that the histograms of the first two windows are similar to each other, unlike that of the third window.

3.2 Mean-Shift Algorithm

The mean-shift algorithm uses mean-shift iterations to find the target window which is most similar to a given object model (source window), with the similarity expressed by a metric based on the Bhattacharyya coefficient [3]. This, in turn, requires computing the histograms of the source as well as the target windows. Once the histograms are computed using SURF features, the mean-shift procedure is identical to that used for the colour histogram based method. The details of the mean-shift algorithm are omitted from this paper for brevity; readers are referred to the original paper by Comaniciu et al. [3] for further details. There is, however, a slight difference between the current implementation and the one with colour histograms, as explained below.

The centre of the new target window computed by the mean-shift algorithm is given by:

$$ z = \frac{\displaystyle\sum_{i=1}^{n} w_i \, g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right) x_i}{\displaystyle\sum_{i=1}^{n} w_i \, g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} \tag{1} $$

where $g(x) = -k'(x)$ is the negative derivative of the kernel profile $k$ and $w_i$ is the weight associated with each key point location $i$ of the source window which has a correspondence in the target window. As in [3], the argument $x$ inside the norm is the current window centre and $h$ is the kernel bandwidth. The new centre location depends on the number of correspondences $n$ between the source and the target windows. In the colour-based mean-shift algorithm, the correspondences include every pixel location within the source window. In our method, which depends on SURF descriptors, the weighted average is computed over the $n$ SURF correspondences available between the two windows. The SURF correspondences between the windows are computed using a minimum distance criterion, with RANSAC used to remove outliers.

Table 1: Pseudo-code for updating the tracker window using the mean-shift algorithm

Require: A sequence of frames {I_i}, i = 0, 1, 2, ..., N
 1: for i = 0 to N do
 2:   if i = 0 then {first frame}
 3:     Select the object using a rectangular window: W_0
 4:     Extract the SURF features of the object: V_0(I_0, W_0)
 5:     Create M clusters using k-means
 6:     Create the object histogram by treating clusters as bins. Denote this object histogram by H_s
 7:   else {for other frames}
 8:     k ← 0 {counter for mean-shift loop}
 9:     Initialize tracker window: W_i(c_i(k), w_i(k), h_i(k)) = W_{i−1}(c_{i−1}, w_{i−1}, h_{i−1})
10:     repeat
11:       k ← k + 1
12:       Compute SURF descriptors V(I_i, W_i(k)) for the target window.
13:       Compute target histogram H_t using the source cluster centres as bins.
14:       Find the set of matching descriptors between source and target windows.
15:       Compute new window centre using the matching descriptors: c_i(k) = mean-shift(H_s, H_t)
16:       Update the tracker window on frame I_i: W_i(k) = (c_i(k), α w_i(k − 1), α h_i(k − 1))
17:     until ‖c_i(k) − c_i(k − 1)‖ ≤ ε
18:     Compute scaling coefficient: α(W_i)
19:     Draw the target window on the image.
20:     Compute re-projected points X_i′ using homography (RANSAC)
21:     if X_i ∼ X_i′ then
22:       Update V_0(I_0, W_0) and H_s
23:     end if
24:   end if
25: end for
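Equation (1) reduces to a weighted average over the matched key points. The sketch below assumes an Epanechnikov kernel profile, for which $g$ is constant inside the kernel support and zero outside; the function name and the argument layout are illustrative assumptions rather than details given in the paper.

```python
def mean_shift_centre(points, weights, centre, bandwidth):
    """Evaluate equation (1): the new window centre z.

    points    -- key point locations x_i, one (x, y) tuple per correspondence
    weights   -- weights w_i, one per correspondence
    centre    -- current window centre x
    bandwidth -- kernel bandwidth h
    """
    def g(r2):
        # Epanechnikov profile k(r) = 1 - r for r <= 1, so g = -k' = 1 there.
        return 1.0 if r2 <= 1.0 else 0.0

    num_x = num_y = den = 0.0
    for (px, py), w in zip(points, weights):
        r2 = ((centre[0] - px) ** 2 + (centre[1] - py) ** 2) / bandwidth ** 2
        k = w * g(r2)
        num_x += k * px
        num_y += k * py
        den += k
    if den == 0.0:
        return centre  # no usable correspondence: keep the window where it is
    return (num_x / den, num_y / den)


# Two matched points; the second carries three times the weight of the first.
z = mean_shift_centre([(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], (1.0, 0.0), 10.0)
```

In [3] the weights are derived from the ratio of model to candidate histogram bins; the sketch takes them as given, since it only illustrates the averaging step.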
3.3 Scaling

Since only SURF correspondences between the source and the target windows are used, it is difficult to find a bounding box for the object being tracked in the destination image. The solution is to scale the original window according to how much the matching points have scaled up or down in the target window [4] [3]. We use the method described in [4] for scaling the target window. The scaling factor is given by

$$ \alpha = \frac{1}{\binom{n}{2}} \sum_{k=1}^{\binom{n}{2}} s_k \tag{2} $$

where

$$ s_k = \frac{\| y_i - y_j \|}{\| x_i - x_j \|}, \quad (i, j) \to k, \quad k = 1, 2, \ldots, \binom{n}{2} \tag{3} $$

The locations of the matching key points on the source window are represented by $x_j$ and those on the target window by $y_j$, where $j = 1, 2, \ldots, n$ and $n$ is the number of matching descriptors. There are $\binom{n}{2}$ unique pairs of interest points in each window, and $s_k$ is the scaling value for the pair $(i, j) \to k$.

3.4 Improving the source histogram using Re-Projection

The mean-shift tracking algorithm requires a histogram of the object model. Since the number of SURF key points obtained from the source window $W_s$ is limited, the histogram created with this set of key points may not truly represent the object model. In the bag-of-words approach for object recognition [15], a large set of images containing the same object is used for computing the object histogram. It is not possible to have a large number of samples in our case, as the tracking is carried out on-line. To overcome this issue, we propose a method called re-projection, in which the matching key points of the final tracking window (obtained after mean-shift convergence) are projected back onto the source window. Since homography (using RANSAC) is used to avoid wrong correspondences, the projected points lie close to the original key points on the source window, as shown in Figures 2(a) and 2(b). In the worst case, with very poor correspondence (see Figure 2(c)), the projected points may not lie close to the original points, as shown in Figure 2(d). Such points are discarded and not included in the updated histogram.

Hence re-projection improves the source histogram by appending the projected key points and their descriptors to the original set. The improved source histogram leads to faster mean-shift convergence, thereby improving the real-time performance of the algorithm. The number of key points in the source window increases over time. The dominant key points will have more descriptors in their vicinity than others. To keep a check on the total number of descriptors in the source window, we allow at most two re-projections at a given location in the source window. Since there are more descriptors at a given location, the chances of obtaining matching correspondences become higher, which in turn leads to better tracking. The improvement obtained due to re-projection is discussed further in the simulation section, where many of these conjectures are validated.
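Sections 3.3 and 3.4 can be illustrated with a short sketch. It assumes that a 3x3 homography `H` mapping target-frame points to the source window has already been estimated (for example with a RANSAC-based routine) and that $\alpha$ is taken as the mean of the pairwise ratios $s_k$. The function names, the acceptance threshold `max_dist` and its default value are illustrative assumptions, not values given in the paper.

```python
import math
from itertools import combinations


def scaling_factor(source_pts, target_pts):
    """Mean ratio s_k of pairwise distances in the target and source windows."""
    ratios = []
    for i, j in combinations(range(len(source_pts)), 2):
        d_src = math.dist(source_pts[i], source_pts[j])
        if d_src > 0.0:  # skip coincident source points
            ratios.append(math.dist(target_pts[i], target_pts[j]) / d_src)
    return sum(ratios) / len(ratios) if ratios else 1.0


def apply_homography(H, pt):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    s = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / s, v / s)


def reproject(H, source_pts, target_pts, max_dist=3.0):
    """Project matched target points onto the source window.

    Returns the indices of matches whose projection X' lies within max_dist
    pixels of the original source point X; the other matches are discarded.
    """
    accepted = []
    for idx, (x_src, y_tgt) in enumerate(zip(source_pts, target_pts)):
        if math.dist(apply_homography(H, y_tgt), x_src) <= max_dist:
            accepted.append(idx)
    return accepted


# Toy example: the target is the source scaled by 2 and shifted by (10, 5).
src = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
tgt = [(10.0, 5.0), (18.0, 5.0), (10.0, 11.0)]
H = [[0.5, 0.0, -5.0], [0.0, 0.5, -2.5], [0.0, 0.0, 1.0]]  # inverse of that map
alpha = scaling_factor(src, tgt)
kept = reproject(H, src, tgt)
```

The cap of two re-projections per source location is not modelled here; a full implementation would also count how many descriptors have already been appended near each accepted point.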
[Figure 2: four panels (a) to (d) showing original and re-projected key point locations.]

Figure 2: Understanding re-projection: (a) and (b) refer to the best case with good correspondence, where the re-projected points (red colour) overlap with the original key point locations (cyan colour); (c) and (d) refer to the worst case with poor correspondence, where the re-projected points lie far away from the original point locations.

4 Simulation Results

To test our algorithm, we record a video from a non-stationary camera mounted on a mobile robot platform following an object. Note that in this video the foreground as well as the background is dynamic, and hence methods based on background subtraction cannot be used [5].

Since our tracking algorithm is based on SURF features, it is necessary to have a significant number of key points within the object model. To develop the theory of this paper, we choose a black and white checkered pattern as the target object. However, we also show that the tracking works well with other natural objects, with similar outcomes.

The tracking performance is expressed in terms of the percentage overlap between the window obtained from the ground truth and the converged target window obtained from our algorithm, given by

$$ \%\,\mathrm{Overlap} = \frac{|A \cap B|}{|A \cup B|} \times 100 \tag{4} $$

where $A$ and $B$ represent the sets of pixels in the ground-truth window and the converged target window respectively. A higher value of this quantity represents better tracking performance.

Some snapshots of the tracker window along with the ground truth are shown in Figure 4. The red window represents the ground truth while the blue window is obtained from our algorithm. The first row in this figure shows cases where the mean-shift tracker achieves proper scaling and hence tracks the object correctly. The second row shows some of the poor cases where the algorithm tracks the object with improper scaling.

[Figure 3: plot of percentage overlap against frame index, without and with re-projection.]

Figure 3: Effect of re-projection on tracking performance. Re-projection leads to better tracking, with a higher percentage overlap between the ground truth and the target window obtained from our algorithm.
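The percentage overlap between a ground-truth window and a tracker window is straightforward to compute for axis-aligned rectangles. The sketch below is an illustration under two assumptions: windows are given as `(cx, cy, w, h)` tuples, and pixel sets are approximated by continuous rectangle areas. The function name is hypothetical.

```python
def percentage_overlap(a, b):
    """Percentage overlap |A ∩ B| / |A ∪ B| × 100 of two (cx, cy, w, h) windows."""
    ax0, ax1 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay0, ay1 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx0, bx1 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by0, by1 = b[1] - b[3] / 2, b[1] + b[3] / 2

    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return 100.0 * inter / union if union > 0 else 0.0


ground_truth = (5.0, 5.0, 10.0, 10.0)
tracked = (10.0, 5.0, 10.0, 10.0)  # same size, shifted right by half a width
overlap = percentage_overlap(ground_truth, tracked)
```

Because the union sits in the denominator, a tracker window that is too large is penalised as much as one that is too small, so the measure is sensitive to scaling errors as well as to drift.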
The mean-shift tracker is initialized by selecting a rectangular region containing the object to be tracked in the first frame. The centre of the tracking window for all subsequent frames is computed using the mean-shift algorithm as described in Table 1. The mean-shift algorithm is considered to have converged if the Euclidean distance between two consecutive centres is less than 3 pixels. The mean-shift iteration loop is stopped if it does not converge within 50 iterations.
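The stopping rule can be written as a small driver loop. In the sketch below, `step` is a hypothetical callable standing in for one mean-shift update of the window centre; only the 3-pixel threshold and the 50-iteration cap come from the text.

```python
import math


def run_mean_shift(step, centre, eps=3.0, max_iter=50):
    """Iterate `step` until consecutive centres are less than eps pixels apart.

    Returns (final_centre, iterations_used, converged).
    """
    for k in range(1, max_iter + 1):
        new_centre = step(centre)
        moved = math.dist(new_centre, centre)
        centre = new_centre
        if moved < eps:
            return centre, k, True
    return centre, max_iter, False


# Toy update rule: move halfway towards a fixed target at (100, 50).
def halfway(c, target=(100.0, 50.0)):
    return ((c[0] + target[0]) / 2, (c[1] + target[1]) / 2)


final, iters, ok = run_mean_shift(halfway, (0.0, 0.0))
```

Returning the iteration count alongside the centre makes it easy to log convergence speed per frame, which is the quantity compared with and without re-projection.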
The performance of the tracking algorithm in terms of percentage overlap is shown in Figure 3. As one can see, the re-projection method leads to better tracking performance compared to the case when re-projection is not used.

Figure 4: Results for mean-shift tracking based on SURF features. The red window is the ground truth and the blue window is obtained using our algorithm. The first row shows the best cases and the second row shows the worst cases.
The effect of re-projection on our tracking algorithm can be better understood by analyzing Figures 5, 6 and 7. As explained in Section 3.4, re-projection increases the number of SURF descriptors in the source window, which increases the chances of getting better correspondences. This effect is shown in Figure 5, where one can see that re-projection leads to a larger number of matching points between the source and the final target window. With an improved histogram available, it becomes easier to find the target object using the mean-shift algorithm. Figure 6 shows the similarity between the source window and the final tracking window obtained from mean-shift convergence in terms of the Bhattacharyya coefficient. One can conclude that the final target windows obtained with re-projection are more similar to the original object model than those obtained when re-projection is not used.

One should also note that the source histogram stabilizes over subsequent frames, as shown in Figure 7. This means that, as time progresses, the histogram obtained with re-projection comes closer to the actual object description. The improvement in the histogram leads to faster convergence of the mean-shift iterations, as shown in Figure 8. As one can see, the mean-shift algorithm takes approximately 2 to 3 steps to converge to the final tracking window. On the other hand, the number of steps needed for mean-shift convergence increases over time if re-projection is not used.

To corroborate the above findings, the performance of our tracking algorithm is tested on two different example videos, as shown in Figures 9 and 10. In the first video, a human torso is tracked based on the SURF features obtained from the clothes worn by the subject. The second video shows a difficult case where we track a box containing a few letters. In this case, the descriptors available in the object model are few and are concentrated over a very small region within the box. In both cases, we are able to track the object satisfactorily. Since it was time consuming to draw the ground truth manually for all the frames, we only show the tracking window obtained from our algorithm in these two figures. The tracking videos will be made available on request.

[Figure 5: plot of the number of matching points against frame index, without and with re-projection.]

Figure 5: Effect of re-projection on the number of matching points for each image. With re-projection, the average number of matching points between the source and target windows remains more or less constant for all frames. This number decreases monotonically if re-projection is not used. The trajectory for the re-projection case is shown in green colour.

[Figure 6: plot of the Bhattacharyya coefficient against frame index, without and with re-projection.]

Figure 6: Effect of re-projection on the Bhattacharyya coefficient. Re-projection leads to a higher value of the Bhattacharyya coefficient (shown by the dashed line) between the source window and the converged target window. Hence the final window is more similar to the object model compared to the case when re-projection is not used (solid line). The y-axis values are averaged over the frame count.

[Figure 7: plot of the normalized frequency of bins (cluster centres) against frame index, for Bin-1 to Bin-5.]

Figure 7: Effect of re-projection on the pdf. The bin frequency (number of points in each bin) tends to stabilize over the frames. This means that, as more points are added to the source window through re-projection, the resulting histogram better represents the true pdf model of the object. The source histogram has 5 bins, which represent the clusters created with k-means.

[Figure 8: plot of the iterations for mean-shift convergence against frame index, without and with re-projection.]

Figure 8: Effect of re-projection on mean-shift convergence. The algorithm with re-projection (dashed line) needs fewer iterations for mean-shift convergence than the one without re-projection (solid line). The y-axis values shown are averaged over the number of frames.
Figure 9: Mean-shift tracking results for Example 2, where a human torso is being tracked based on the SURF features obtained from the clothes worn by the subject. The blue window is the tracking window obtained from our algorithm.

Figure 10: Mean-shift tracking results for Example 3, where a box with very few SURF descriptors is used for tracking. Scaling is affected by the small number of descriptors available. The blue window is the tracking window obtained from our algorithm.

5 Conclusion

The mean-shift algorithm is a popular method for tracking objects based on colour histograms. Its application to SURF descriptors is limited by the lack of sufficient key points for computing a reliable histogram of the object model. This paper proposes a mean-shift based object tracking algorithm that uses SURF descriptors for creating histograms. The problem of having only a small number of key points is resolved by an approach called re-projection, which is shown to provide a significant improvement in tracking performance.
References:
[1] A. Yilmaz, O. Javed, and M. Shah. Object tracking: A survey. ACM Computing Surveys (CSUR), 38(4), December 2006.
[2] Zhigang Bing, Yongxia Wang, Jinsheng Hou, Hailong Lu, and Hongda Chen. Research of tracking robot based on SURF features. In International Conference on Natural Computation (ICNC), pages 3523–3527, Yantai, Shandong, 2010. IEEE.
[3] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using mean shift. In Proc. of Int. Conf. on Computer Vision and Pattern Recognition (CVPR), pages 142–149, vol. 2, Hilton Head Island, SC, 2000. IEEE.
[4] Jian Zhang, Jun Fang, and Jin Lu. Mean-shift algorithm integrating with SURF for tracking. In Natural Computation (ICNC), pages 960–963, Shanghai, 2011. IEEE.
[5] M. Gupta, L. Behera, and V. K. Subramanian. A novel approach of human motion tracking with mobile robotic platform. In UKSIM Int. Conf. on Computer Modeling and Simulation, pages 218–223, Cambridge, UK, 2011. IEEE.
[6] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, January 2004.
[7] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool. Speeded-up robust features (SURF). Computer Vision and Image Understanding, Elsevier, 110:346–359, December 2008.
[8] Yuichi Motai, Sumit Kumar Jha, and Daniel Kruse. Human tracking from a mobile agent: Optical flow and Kalman filter arbitration. Signal Processing: Image Communication, 27(1):83–95, January 2012.
[9] Huiyu Zhou, Yuan Yuan, and Chunmei Shi. Object tracking using SIFT features and mean shift. Computer Vision and Image Understanding, 113(3):345–352, March 2009.
[10] S. Haner and I. Y. Gu. Combining foreground / background feature points and anisotropic mean shift for enhanced visual object tracking. In International Conference on Pattern Recognition (ICPR), pages 3488–3491, Istanbul, 2010. IEEE.
[11] Werner Kloihofer and Martin Kampel. Interest point based tracking. In International Conference on Pattern Recognition (ICPR), pages 3549–3552. ACM, 2010.
[12] Duy-Nguyen Ta, Wei-Chao Chen, Natasha Gelfand, and Kari Pulli. SURFTrac: Efficient tracking and continuous object recognition using local feature descriptors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2937–2944, Miami, FL, 2009. IEEE.
[13] Wei He, T. Yamashita, Lu Hongtao, and Shihong Lao. SURF tracking. In International Conference on Computer Vision, pages 1586–1592, Kyoto, 2009. IEEE.
[14] Steve Gu, Ying Zheng, and Carlo Tomasi. Efficient visual object tracking with online nearest neighbor classifier. In 10th Asian Conference on Computer Vision (ACCV), pages 271–282, Queenstown, New Zealand, 2010. Springer Berlin Heidelberg.
[15] A. Ahmadi, M. R. Daliri, A. Nodehi, and A. Qorbani. Objects recognition using the histogram based on descriptors of SIFT and SURF. Journal of Basic and Applied Scientific Research, 2(9):8612–8616, 2012.
[16] Bart Thomee, Erwin M. Bakker, and Michael S. Lew. TOP-SURF: a visual words toolkit. In Proc. of International Conference on Multimedia, pages 1473–1476, New York, 2010. ACM.
ISBN: 978-1-61804-164-7