Irregular Moving Object Detecting and Tracking
Based on Color and Shape in Real-time System
Tran Thi Trang, Cheolkeun Ha
School of Mechanical Engineering,
University of Ulsan, Ulsan, South Korea
Email: [email protected], [email protected]
Abstract—This paper describes an efficient approach to detecting and tracking an irregularly moving object in a real-time system, based on color and shape information of the target object in a realistic environment. First, image data are acquired from a real-time camera system at a stable frame rate. Each frame is then processed by the proposed method to detect and track the target object immediately in consecutive frames. Finally, a control signal derived from the target position is used to drive a pan-tilt-zoom (PTZ) camera so that it automatically follows the target object. Our experimental results were obtained with a Sony EVI D70 PTZ camera under a variety of environments in real time, and they verify that the algorithm can be implemented effectively and accurately at high frame rates, up to 29.97 fps.
Index Terms—Object detection, object tracking, CamShift algorithm, pan-tilt-zoom camera

I. INTRODUCTION

Real-time object detection and tracking is an important problem whose aim is to develop robot visual skills for interacting with a dynamic, realistic environment. The main challenges are perspective and viewpoint changes, background clutter, image noise, scale, scene illumination, and camera parameters. In the last few years the problem has received a large amount of attention, in an attempt to achieve high accuracy at high frame rates. Color, gradient, intensity, and depth have proven to be effective features for object detection and recognition [1], and contour- and shape-based approaches have also been proposed. To simplify the algorithm and reduce time consumption, as a realistic environment demands, the two basic object features, color and contour information, should be exploited for the job at hand. These two characteristics, however, have usually been used separately: to improve robustness in weak systems, to reduce the processing cost of consecutive images, to improve performance in complex environments, and so on. Elaborate contour-based methods have been proposed: linking edges, partitioning and connecting them to form a contour, and then finding the sequence chains resembling the model outlines [2]; learning detection from segmented images and then applying it to a larger set of unsegmented images [3]; or using a bandwidth of a contour for deformable objects [4]. They typically take at least a few seconds to scan and detect, and are therefore far too expensive for a real-time system. Compared to object geometry, color is in most cases a clearer identifying feature, less sensitive to noise and largely more robust to changes in view direction and resolution. Hence, many color-based approaches have also been developed. A simple and efficient algorithm, Backprojection, was introduced by Swain and Ballard [5], in which the pixels of the image are assigned confidence values and the peaks in the confidence space are considered target objects. However, because the whole image is processed, background regions with the same color as the target also receive high confidence values even though they are not the target. This problem is solved in [6], where higher weights are assigned to pixels near the region center and lower weights to background pixels. These algorithms are conceptually simple, yet computationally too complex when dealing with irregularly moving target objects in a challenging environment. Our goal is to design an efficient, highly accurate detection and tracking system in which the above problems are mitigated; it must also run fast enough that the target object can be detected and tracked in real time while consuming as few system resources as possible.

Clearly, the most important requirements are a highly precise decision on the target object and its localization, so we focus on detection and tracking based on both color and contour. Nonetheless, the challenges of the environment and the time consumption must also be taken into account. Many of the above approaches are not appropriate for a real-time system, because their complexity reduces the processing rate. To overcome this drawback, we divide the process into two main stages: a detection stage and a tracking stage. The detection stage processes the whole image; the tracking stage processes only the region of interest. First, whole-image processing is used to detect the object in the first consecutive frames. Once the system becomes stable, the object position and size obtained in the detection stage determine the region of interest that is processed in the tracking stage to keep following the target. After a certain time, the process returns to the detection stage. The region-of-interest processing tracks the target while reducing processing time, and the whole-image processing ensures the target is tracked accurately even when the tracking stage loses the object. Figure 1 summarizes our detection and tracking approach in a real-time system.

978-1-4673-2088-7/13/$31.00 ©2013 IEEE
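As a rough sketch of this two-stage strategy, the following Python loop alternates between a whole-image detection stage and a region-of-interest tracking stage. The `detect` and `track` callables and the two thresholds are hypothetical placeholders for illustration, not the paper's actual implementation or parameter values:

```python
def run_pipeline(frames, detect, track, stable_after=5, revisit_every=30):
    """Alternate whole-image detection with region-of-interest tracking.

    detect(frame) -> ROI or None        (whole-image processing)
    track(frame, roi) -> ROI or None    (interest-region processing)
    """
    state, roi, stable_hits, results = "detect", None, 0, []
    for i, frame in enumerate(frames):
        if state == "detect":
            roi = detect(frame)
            stable_hits = stable_hits + 1 if roi is not None else 0
            if stable_hits >= stable_after:   # detection has become stable
                state = "track"
        else:
            roi = track(frame, roi)
            # Return to whole-image detection when the tracker loses the
            # target, or periodically, so the target is never lost for long.
            if roi is None or (i + 1) % revisit_every == 0:
                state, stable_hits = "detect", 0
        results.append(roi)
    return results
```

The periodic fall-back to the detection stage corresponds to the "after a certain time, the process returns" behavior described above.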

Figure 1. Object detecting and tracking algorithm in real-time system

This paper is organized into five sections as follows. Section II presents color- and shape-based object detection. Section III describes object tracking. Experimental results, conclusions, and future research are given in Sections IV and V, respectively.

II. COLOR AND SHAPE BASED OBJECT DETECTION

A. Color space selection
Theo Gevers and Arnold W. M. Smeulders presented a comparison of different color models for color-based object recognition in [7]. The choice of color model depends on its robustness against varying illumination and changes in object surface orientation. In our case, the consecutive processed images are acquired from an online PTZ camera in a realistic environment, so the images may be contaminated by noise, illumination variation, changes of view, and so on. Therefore, we need a color model that is robust to changes in viewing direction and in the intensity of the illumination, and it should be concise and discriminatory. The HSI (Hue, Saturation, Intensity) color space is the most appropriate because it has two principal advantages. First, the H and S components are related to the way humans perceive color, so colors in this model can be clearly defined by human perception. Second, the I component is the brightness of the color, so it is decoupled from the color information. In many applications we can use only the H and I components [8], or even only the H component of the object color, for detection, recognition, and so on. The conversion from RGB (Red, Green, Blue) components to HSI components is as follows [8]:

H = cos^−1( [(R − G) + (R − B)] / [2·sqrt((R − G)^2 + (R − B)(G − B))] ),  R ≠ G or R ≠ B

if B > G then H = 2π − H

S = 1 − 3·min(R, G, B) / (R + G + B)

I = (R + G + B) / 3

B. Circle detection
Circle detection is an important problem in image processing and computer vision that has gained much attention in recent years. In circle detection we seek triplets (a, b, R) that describe a circle completely: the x coordinate of the center, the y coordinate of the center, and the radius, respectively. Approaches based on the Random Hough Circle Transform (RHCT) [10] and its derivatives [11] are commonly used. The RHCT approach is simple but usually time-consuming and very sensitive to noise, because the accuracy of circle detection is proportional to the number of accumulator cells, and the more accumulator cells are used, the more memory is required. In addition, our experience shows that it is usually not effective on noise-contaminated images; spurious circles may be detected in such cases. Furthermore, in a real-time detection and tracking system, in which the object contour may change with the view direction, this algorithm cannot be implemented effectively.

A more elaborate method uses a moving window [12], enlarged just enough to contain all the pixels of the circular object. The neighbors of the window center are the candidate target centers, and the radius is consequently close to half the window side. The accuracy depends on the chosen number of circle candidates; however, increasing this number makes the computation significantly more complex, and the time constraint of a real-time system is then not satisfied.

To overcome this problem, we propose a simple and intuitive method based on connected edge areas, in which the circle centers and radii are found as follows.

Assume the image has p × q pixels, (xi, yi) is the coordinate of a point pt(i), and the number of connected edge curves in the detected map is M. Let Sm, Rm, nm be the area, the circumscribed circle radius (if one exists), and the number of points of the m-th edge boundary, respectively.

(1) Compute the image gradients using a Gaussian template.
(2) Detect the edge map using Canny edge detection.
(3) For each connected edge boundary (m = 1, 2, ..., M), compute the boundary area and the circumscribed circle center and radius.
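A minimal sketch of the per-boundary computation in step (3): the boundary area via the shoelace formula and the circumscribed circle through three boundary points, followed by the area test Sm ≈ πRm². This is an illustrative reimplementation, not the paper's code; the three boundary points are chosen deterministically (spread around the boundary) for clarity, whereas the paper picks them at random.

```python
import math

def shoelace_area(pts):
    """Area enclosed by a closed boundary of points (the shoelace formula)."""
    s = 0.0
    for i in range(len(pts)):
        x0, y0 = pts[i - 1]          # pts[-1] closes the polygon
        x1, y1 = pts[i]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def circumcircle(p1, p2, p3):
    """Center and radius of the circle through three points, via the same
    2x2 determinants used in the text; None if the points are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 4.0 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    if abs(d) < 1e-12:
        return None                  # collinear: no circumscribed circle
    a1 = x2 * x2 + y2 * y2 - (x1 * x1 + y1 * y1)
    a2 = x3 * x3 + y3 * y3 - (x1 * x1 + y1 * y1)
    x0 = (a1 * 2.0 * (y3 - y1) - a2 * 2.0 * (y2 - y1)) / d   # det(A) / d
    y0 = (2.0 * (x2 - x1) * a2 - 2.0 * (x3 - x1) * a1) / d   # det(B) / d
    return (x0, y0), math.hypot(x0 - x1, y0 - y1)

def is_circular(pts, tol=0.15):
    """Accept a boundary as a circle when its area is close to pi * R^2."""
    n = len(pts)
    fit = circumcircle(pts[0], pts[n // 3], pts[2 * n // 3])
    if fit is None:
        return False
    _, r = fit
    return abs(shoelace_area(pts) - math.pi * r * r) <= tol * math.pi * r * r
```

The tolerance `tol` plays the role of the ratio threshold discussed in the text, absorbing the small shape changes caused by noise and view changes.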
For all edge points pt(i) (i ∈ [0, nm − 1]) in a boundary, the boundary area is computed with the shoelace formula:

Sm = (1/2) | Σ_{i=1..nm} (x_{i−1}·yi − xi·y_{i−1}) |    (1)

Pick three edge points Cm = {pt(1), pt(2), pt(3)} at random in a boundary and compute their circumscribed circle center (xm0, ym0) and radius Rm as follows [13]:

A = [ x2^2 + y2^2 − (x1^2 + y1^2)    2(y2 − y1)
      x3^2 + y3^2 − (x1^2 + y1^2)    2(y3 − y1) ]

B = [ 2(x2 − x1)    x2^2 + y2^2 − (x1^2 + y1^2)
      2(x3 − x1)    x3^2 + y3^2 − (x1^2 + y1^2) ]

xm0 = det(A) / ( 4[(x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)] )    (2)

ym0 = det(B) / ( 4[(x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)] )    (3)

where det(A) and det(B) are the determinants of the matrices A and B, respectively.

Rm = sqrt( (xm0 − xd)^2 + (ym0 − yd)^2 )    (4)

where (xd, yd) is the coordinate of any of the three selected points.

If the m-th edge curve is a circle, the following equation must be satisfied:

Sm = π·Rm^2    (5)

Because the shape of the object may change slightly due to changes of view position, noise, illumination changes, and so on, we can choose a threshold on the ratio between Sm and Rm^2 in Eq. (5) in order to detect the object accurately.

III. OBJECT TRACKING

A. Object tracking algorithm
After a circle is detected, its area and position are known, and the CamShift algorithm is then used to keep tracking the object. Following [9], object tracking in the images proceeds as follows.

Step 1: Set the tracking window according to the information obtained in the last frame.

Step 2: Calculate the color probability distribution inside the tracking window using the histogram back-projection algorithm [5]. Let h(c) be the histogram function, "loc" the function returning its value argument to a pixel (x, y), and D^r a disk of radius r:

D^r(x, y) = 1 if x^2 + y^2 < r, 0 otherwise    (6)

For each histogram bin j (with Mj and Ij the model- and image-histogram values of bin j):

Rj := min( (Mj / Ij)·255, 255 )    (7)

For each pixel (x, y), with c_{x,y} its color:

b_{x,y} := R_{h(c_{x,y})}    (8)

b := D^r * b    (9)

(xt, yt) := loc( max_{x,y} b_{x,y} )    (10)

Step 3: Find the center of mass within the tracking window [14]. For all pixels inside the tracking window, compute the zeroth moment

M00 = Σ_x Σ_y I(x, y)    (11)

and the first moments for x and y:

M10 = Σ_x Σ_y x·I(x, y)    (12)

M01 = Σ_x Σ_y y·I(x, y)    (13)

The location is set to

xc = M10 / M00 ;  yc = M01 / M00    (14)

Step 4: Re-center the tracking window at the center of mass, repeating until converged.

Step 5: Record (xc, yc) and M00 for the tracking window in the next frame, and set the window size to

window width: s = 2·sqrt(M00 / 256)    (15)

window length: l = 1.2·s    (16)

Return to Step 1.

B. Pan-tilt-zoom camera control
In recent years, pan-tilt-zoom camera control in tracking systems has gained much attention, mainly in an attempt to improve the ability to keep the object in the center of the camera view in real time. In [15], the pan and tilt of a tracking camera are controlled without an explicit formulation, based only on the estimated distribution over the state space. More complicated approaches have also been proposed, such as biomimetic control for a running system based on the physiological neural path of eye movement control [16], or PID control of a single camera [17]. In our detection and tracking system, we use a single PTZ camera, whose screen size is 720 × 480 pixels, to automatically follow the moving target object. Tracking is performed by using the difference information between the center of the camera screen and the target object position.
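Steps 3-5 of the tracking algorithm above (moments, center of mass, and window resizing) can be sketched as follows. The back-projection map of Step 2 is assumed to be already computed, and the (x, y, width, height) window convention is an assumption made for this sketch:

```python
import numpy as np

def camshift_step(prob, win):
    """One iteration of Steps 3-5 on a back-projection map `prob`
    (2-D array of color probabilities scaled to [0, 255]).
    `win` is (x, y, w, h); returns the re-centered, resized window."""
    x, y, w, h = win
    roi = prob[y:y + h, x:x + w].astype(float)
    m00 = roi.sum()                            # zeroth moment, Eq. (11)
    if m00 == 0:
        return win                             # nothing to track here
    ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
    xc = x + (xs * roi).sum() / m00            # center of mass,
    yc = y + (ys * roi).sum() / m00            # Eqs. (12)-(14)
    s = 2.0 * (m00 / 256.0) ** 0.5             # window width, Eq. (15)
    l = 1.2 * s                                # window length, Eq. (16)
    return (int(round(xc - s / 2.0)), int(round(yc - l / 2.0)),
            int(round(s)), int(round(l)))
```

In use, this step is iterated until the window center converges (Step 4), and the final window is carried over to the next frame (Step 5).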
Suppose that the camera center is located at position (360, 240). Our purpose is to control the pan and tilt of the camera so as to keep the target in the center of the camera view; that is, the camera is moved so that the object position converges to (360, 240). Obviously, it is worth using the current target position for this purpose. Hence, by calculating the difference between the target position and the center of the camera screen, the pan and tilt angle increments can be determined. From our experimental statistics with the Sony EVI D70 PTZ camera, the relation between one pixel of offset and one degree of camera motion can be found.

Let α_p(i), α_t(i), ε_p(i), ε_t(i) denote the current pan and tilt angle positions of the camera and the pan and tilt angle increments needed to track the object at the i-th frame, respectively. The signs of α_p(i) and α_t(i) are defined as follows:

- If the camera is at the "home" position, then α_p(i) = α_t(i) = 0.
- If the camera is to the left of the "home" position, then α_p(i) < 0; to the right, α_p(i) > 0.
- If the camera is lower than the "home" position, then α_t(i) < 0; higher, α_t(i) > 0.

The following equations must be satisfied:

α_p(i) = α_p(i − 1) + ε_p(i)    (17)

α_t(i) = α_t(i − 1) + ε_t(i)    (18)

Let (xi, yi) be the current target position, and let μ_pi and μ_ti be the horizontal and vertical differences between the target position and the center of the camera screen at the i-th frame. Let f_p and f_t be the functions mapping μ_pi and μ_ti to the angle increments that control the pan and tilt of the camera:

ε_p(i) = f_p(μ_pi)    (19)

ε_t(i) = f_t(μ_ti)    (20)

Thus, the current pan and tilt angle positions of the camera at the i-th frame are obtained as

α_p(i) = α_p(i − 1) + f_p(μ_pi)    (21)

α_t(i) = α_t(i − 1) + f_t(μ_ti)    (22)

IV. EXPERIMENTAL RESULTS

To evaluate our detecting and tracking system, we assess its accuracy off-line and on-line. Off-line, we used a database of over 100 images of size 480 × 720 pixels, videotaped in a realistic environment, and over 100 noise-added images. These images rarely contain perfect circles, so the detection method approximates the circles, adapting to imperfect ones. Moreover, to increase the difficulty of the task, target objects of different sizes are arranged randomly among other similarly shaped or similarly colored objects. In the on-line case, a target object moves irregularly in a realistic environment, and the proposed detection and tracking approach together with the proposed camera control algorithm keep the object in the center of the camera screen.

A. Offline case
The target objects here are red circular shapes of any size, arranged randomly among circles, squares, rectangles, triangles, and irregularly shaped objects of different colors and sizes (Fig. 2). Along with detecting the target object in a pool of various objects, this experiment aims to estimate the target object's parameters, namely the center coordinate (xc, yc) and the radius r. Figure 2a shows red circle detection: the original image is on the left and the detected image, in which the detected circles are marked by a yellow overlay, is on the right; the ability to detect objects of different sizes is also illustrated. Moreover, the proposed algorithm behaves well under noise. We videotaped a sequence of red circular shape images and added 0%, 10%, and 20% uniform noise. Fig. 2b shows the image with 20% added noise on the left and the resulting image on the right. In these cases, the target red circles are detected precisely and their sizes are also estimated.

B. Online case
Figure 3 presents automatic circle detection and tracking at 29.97 fps; the detected object is marked by a red overlay. In Fig. 3a, the target is a static red circular ball: the object is first detected, and then the tracking stage is initiated, which controls the camera to keep the object at the center of the camera screen. At t = 1 s the target is on the right side of the camera screen, and at t = 3 s the system has successfully brought it to the center of the screen. In Fig. 3b, during the period from t = 0 s to t = 7 s the red circular object is stationary and the detecting and tracking system achieves a good result; from t = 3 s the object is kept in the screen center. Then, at t = 8 s, the target object starts moving irregularly, and the camera automatically follows it. However, when the object moves at higher velocity, the hardware is unable to keep up with the tracking task. To tackle this hardware limitation, switching back from the tracking stage to the detection stage is proposed: the tracking algorithm is suspended and the detection algorithm is executed. Consequently, the target object is detected and tracked successfully, as shown in the four other pictures at 8 s, 10 s, 32 s, and 35 s. Moreover, the camera can also follow the object despite changes in object size, as shown in Fig. 3.

V.
CONCLUSION AND FUTURE WORK
This paper has presented an approach to object detection and tracking in a real-time system that efficiently and automatically combines a detection stage and a tracking stage. We also demonstrated its robustness in detecting objects of different sizes, with each target size estimated as well. The proposed method operates quite successfully in our robot system in a realistic environment with a target object moving irregularly at different velocities, and we are working to improve our algorithm toward the faster rates that future systems may require.
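As an illustrative sketch of the control law of Section III.B (Eqs. (17)-(22)), the pixel error between the target and the screen center can be mapped to incremental pan and tilt angles. The constant per-pixel gains below are hypothetical stand-ins for the experimentally determined mappings f_p and f_t, and the sign conventions are assumptions consistent with the definitions in Section III.B:

```python
SCREEN_CENTER = (360, 240)   # center of the 720x480 camera screen
GAIN_PAN = 0.06              # degrees per pixel; assumed values standing
GAIN_TILT = 0.06             # in for the experimentally found f_p, f_t

def ptz_update(alpha_p, alpha_t, target):
    """Eqs. (17)-(22): add the angle increments eps_p, eps_t derived
    from the pixel error between the target and the screen center."""
    mu_p = target[0] - SCREEN_CENTER[0]   # horizontal pixel error
    mu_t = SCREEN_CENTER[1] - target[1]   # vertical error (image y grows down)
    eps_p = GAIN_PAN * mu_p               # eps_p(i) = f_p(mu_pi)
    eps_t = GAIN_TILT * mu_t              # eps_t(i) = f_t(mu_ti)
    return alpha_p + eps_p, alpha_t + eps_t
```

Calling `ptz_update` once per frame accumulates the increments into the absolute pan and tilt angles, as in Eqs. (21)-(22).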
REFERENCES
[1] Chen Guodong, Zeyang Xia, Rongchuan Sun, Zhenhua Wang, Zhiwu Ren, and Lining Sun, "A learning algorithm for model based object detection", 8th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI) 2011, pp. 101-106.
[2] V. Ferrari, T. Tuytelaars, L. Van Gool, "Object detection by contour segment networks", Lecture Notes in Computer Science, 2006.
[3] J. Shotton, A. Blake, R. Cipolla, "Contour-based learning for object detection", in Proc. ICCV, vol. 1, pp. 503-510, 2005.
[4] Xiang Bai, Quannan Li, Longin Jan Latecki, Wenyu Liu, Zhuowen Tu, "Shape band: a deformable object detection approach", Computer Vision and Pattern Recognition (CVPR 2009), 2009.
[5] M. J. Swain, D. H. Ballard, "Color indexing", Int. J. Computer Vision, vol. 7(1), pp. 11-32, 1991.
[6] Baojie Fan, Linlin Zhu, Yingkui Du, Yandong Tang, "A novel color based object detection and localization algorithm", CISP 2010, vol. 3, pp. 1101-1105.
[7] Theo Gevers, Arnold W. M. Smeulders, "Color-based object recognition", Pattern Recognition, vol. 32, pp. 453-464, 1999.
[8] Hong-Kui Liu, Jun Zhou, "Moving object detecting and tracking method based on color image", Proceedings of the 7th WCICA, pp. 3608-3612, 2008.
[9] Gary R. Bradski, "Computer vision face tracking for use in a perceptual user interface", Intel Technology Journal, 2(2), pp. 13-27, 1998.
[10] Dimitrios Ioannou, Walter Huda, and Andrew F. Laine, "Circle recognition through a 2D Hough transform and radius histogramming", Image and Vision Computing, vol. 17, issue 1, pp. 15-26, 1999.
[11] Li-qin Jia, C. Z. Peng, H. M. Liu, Z. H. Wang, "A fast randomized circle detection algorithm", 4th International Congress on Image and Signal Processing, 2011, pp. 835-838.
[12] Zhang Yunchu, Wang Hongming, Liang Zize, Tan Min, Ye Wenbo, and Lian Bo, "Existence probability map based circle detection method", Computer Engineering and Applications, 2006, no. 29.
[13] E. Cuevas, F. Wario, D. Zaldivar, M. Pérez-Cisneros, "Circle detection on images using learning automata", IET Comput. Vis., 2012, vol. 6, iss. 2, pp. 121-132.
[14] Carsten Steger, "On the calculation of arbitrary moments of polygons", Technical Report FGBV-96-05, 1996.
[15] Matthias Zobel, Joachim Denzler, Heinrich Niemann, "Entropy based camera control for visual object tracking", IEEE ICIP 2002, vol. 3, pp. 901-904.
[16] Shaorong Xie, Jun Lou, Zhengbang Gong, Wei Ding, Hairong Zou, Xiangguo Fu, "Biomimetic control of pan-tilt-zoom camera for visual tracking based on an autonomous helicopter", IROS 2007, IEEE/RSJ, pp. 2138-2143.
[17] Murad Al Haj, Andrew D. Bagdanov, Jordi Gonzàlez, and F. Xavier Roca, "Reactive object tracking with a single PTZ camera", 20th International Conference on Pattern Recognition (ICPR 2010), pp. 1690-1693, 2010.
Figure 2. The red circular shape detection. (a) Circular shape detection in a natural image. (b) Circular shape detection in an image with 20% uniform noise added.
Figure 3. Circle object detecting and tracking in real-time system. (a) Static object detecting and tracking (frames at t = 1 s and t = 5 s). (b) Irregular moving object detecting and tracking (frames at t = 1 s, 3 s, 8 s, 10 s, 32 s, and 35 s).