Person Re-Identification Over Camera Gridsby Multi-Task

International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
Person Re-Identification Over Camera Gridsby
Multi-Task Distance Metric Learning
Prof. Seema B. Idhate1, Atif Sayyed2, Abhishek Sawant3,Pratik Borse4

Abstract—To re-identify a individual in a camera network is
a problematic and brave problem to solve. The methodthese
days share a common Mahalanobis distance metric in which the
data is attained from diverse camera’s in camera networks
there are dissimilar settings and recorded images are
exaggerated by the camera viewing angles and background
clutters. Though using a common metric to direct person
re-identification tasks on diverse pairs of camera’s check the
alterations in the camera settings, on the other hand it is very
time consuming to label videos manually. In this paper we
improve again the person re-identification in a camera network
as a multi-task distance metric learning problem. The already
debated method designs multiple Mahalanobis distance metrics
are connected but diverse and learned by adding joint
imposition of a standard to create easy fitting. We further
present aexclusive model for person re-identification in a
camera grid.
Index Terms—Re-identification, camera networks, distance
metric.
I. INTRODUCTION
.IN MODERN years, camera networks have become
progressivelydominant in public spaces such as
airportsunderground stations, shopping complexes, and road
junctions . It is regularlynecessary to identify a person of
interest from a bulk of number of candidates perceived over
the entire camera network, and then track the same person in
diverse camera views located at different physical sites at
different times. This task is normally known in the literature
as person re-identification. Person re-identification is
imperative for continuous object tracking and human
behavior analysis over large-scale camera networks, and has
drawn noteworthy attention in recent years. However, person
re-identification remains a very thought-provoking task due
to the complexity of conditions imposed by modern camera
networks. First, the camera grid usually has byzantine spatial
and temporal topology. Therefore, given a query image of a
person, the contender set could be very huge, presenting high
levels of vagueness to person re-identification. Second, there
is erraticism in lighting, camera angles and views,
background clutter, and blocking over the camera network. In
particular,the intricacysurges in proportion to the scale ofthe
Manuscript received Feb, 2013.
Prof.
Seema
B
Idhate,Department
of
Electronics
and
Telecommunication, JSPMs ICOER under UoP , Pune, India,
Abhishek
Sawant,Department
Of
Electronics
and
Telecommunication,JSPMs ICOER under UoP, Pune, India..
AtifSayyed, Department of Electronics and Telecomunication, JSPMs
ICOER under UoP, Pune,India
Pratik Borse, Department of Electronics and Telecomunication,JSPM,s
ICOER under UoP, Pune, India.
camera grid. Finally, the human body is expressed,and a
person’s appearance can change almost continuously.
Numerousmethods have been anticipated to address
thesechallenges. Most state-of-the-art procedurescast person
re-identification as a distance metric learning or learning-to
rank problem. After image depiction, the learned distance or
likeness measure between the query image and each
candidate image is totaled in order to govern which pair is the
correct match. Such methods can subtly model the transition
in feature space by seeing first or second order feature
relations in aadministered manner, which makes these
approaches suitable for this purpose. One disadvantage is that
such methods lead to over-fitting in high dimension feature
space. Furthermore, the cameras in the gridregularly have
very different settings, and the captured images are
extremelyexaggerated by erraticism in brightness conditions,
camera viewing angles, and background clutter. The
presentapproaches assign a metric to all image pairs in
thecamera grid and supervise the different settings, and as
aimportance the learned metric is not generalizable.In this
paper, we make multiple metrics for differentcamera pairs in
a camera grid to cope with the multipletransition models in
the network. In the setting of a cameragrid, we articulate the
person re-identification task as amulti-task distance metric
learning model, and we suggest anovel model called
Multi-task
Maximally
Collapsing
MetricLearning
(MtMCML). In dissimilarity to previous work on
metriclearning for person re-identification, in which
aexclusive metricis assigned to all image pairs, our planned
method designsone Mahalanobis distance metric for every
camera pair inthe camera grid. In general, it would be
untaught tolearn every metric discretely from the labeled
image set. Inpractice, however, it is very time-consuming to
physically label each person in images taken from
surveillance videos. For example, in most prevailing person
re-identification datasets, each person is captured in two
camera views, with one image of each person per camera
view. In this case, straight learning a unique Mahalanobis
distance metric for each camera pair is vulnerable to
over-fitting, since the data are inadequately labeled. In our
work we accept that these calculated metrics are dissimilar
but connected, and learn them in a multi-task agenda.
Additionally,
following
the
principles
in
and
extendingMCML to the multi-task setting, MtMCML is used
to acquiremultiple Mahalanobis distance metrics by joint
regularization.We also prove that the objective function of
MtMCML isequally convex and has a Lipschitz continuous
gradient, hence sporadic optimization and first-order
Nesterov methods canbe developed to competently obtain the
ideal solution. Toour information, this is the first attempt to
articulate person
re-identification over camera grids as a multi-task metric
690
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
learning problem.We perform extensive experimental
authentication on two large,publicly-available person
re-identification datasets collectedfrom camera networks: the
GRID dataset and the VIPeRdataset. The results demonstrate
that byarticulating personre-identification over a camera
network as a multi-task distancemetric learning problem we
can get significant developmentsin matching accuracy; and
our MtMCML modelsoutdo existing state-of-the-art person
re-identificationapproaches.
II. RELATED WORK.
Person re-identification in camera grids has beendiscovered
in the current literature. An individual moving in front of a
camera gridproduces two different types of information, his
or her spatio-temporal cues for entrance or exit and his or her
pictorial appearance. Therefore, most of the work on person
re-identification over camera grids is branded into two
groups:
spatio-temporal
topology
reasoning
and
appearance-based approaches . The first one emphases on
modeling spatiotemporal correlations between different
supervised sites in order to resolve uncertaintiesascending
from alike visual features happening in different objects, or to
trim the candidate set .In , these incidental correlations are
observed as background constraints that can be combined
into appearance-based methods.The appearance-based
methodschallenge personre-identification by taking into
account individual appearancesonly, as used here. Present
work on appearance-basedapproaches mainly emphases on
image depiction anddistance or resemblance function
learning. The main goalof image depiction is to follow visual
features fortelling appearance that are discriminative but
unresponsiveto the camera settings labelled above. The most
sharedapproach used for imagedepiction is to extractmany
features such as color, texture, and gradient, and/orexploit a
policy for uniting them that guarantees highre-identification
correctness. For example, maximum studies usecolor as a
discriminative feature .In order to ease color variability
between differentcameras, exploit the
brightness
transferfunction. Though, it is essential to manually label
theperson image pairs and segment the forefront to learna
brightness transfer function. In order to be more strongwhen
faced with great numbers of poses and angles, localfeatures
are sometimes subjugated. considersuccessive frames and
hire a spatiotemporal segmentation algorithm to
makenoticeable regions that are strong to changes in the
appearance of clothing. In, after finding a person’s outline
using a segmentation step, noticeable local color features are
extracted based on perceptualideologies of symmetry and
asymmetry. To complement this,a richer set of features is
also accessible in the works, partial least squares (PLS) is
used to decrease the dimensionality of features. In
anunsupervised approach is planned to learn the importance
of
features. To fuse many features, After image depiction, in
order to know thecharacter of the query image, distance or
resemblanceeventsare calculated between the query image
and each contenderimage. They present a discriminative
model bymanipulating the Adaboost algorithm to select the
maximumrelatedfeatures. Lately, a new method has been to
treat personre-identification as a learning-to-rank problem or
aMahalanobis distance metric learning problem articulate
person re-identification as a learning-to-rankproblem where
the true match is given maximum ranking,
rather than the incorrect match. Furthermore, proposing
PRDC by articulating this problem in a probabilisticmanner.
PRDC is a second-order feature quantification modelthat
takes correlations between diverse features into
account,althoughRankSVM is a first-order model.
However,the optimization problem of PRDC
is not
convex,and therefore a worldwideideal solution is not
guaranteed.Similarly, Mignon et al. introduce a
pairwise-constrainedmetric-learning procedure, which learns
a projection into
low-dimensional space to deal with the high-dimensional
input space. In , by spreading LMNN [33], Dikmen et al.
introduce large-margin next-door neighbor, with a refusal
option to directly acquire the Mahalanobis distance. To
decreasethe computational cost for huge datasets, Kostinger
et al.suggest a simple yet operative method to learn the
distancemetric founded on arithmetical inference.There is
also some work that emphases on how best to applythe
learned metric by seeing test data as well.
Unlikepresentmethods that consider only thoroughlylabeled
pairwise data, Loy et al introduce a diverseranking outline for
person re-identification by manipulatingthe unlabeled gallery
images. Li et al address the fact thatdissimilar visual metrics
are learned for diversecontender sets.They challenge person
re-identification using a transfer learningoutline, and transfer
the learned metric from training datato diverse candidates in
the test set. Our planned model canbe effortlesslyhired in the
outlines used in asthe learned metric of source field.Thus,
animportant merit of distance metric learning isthat it can
perfect the changeover in feature space between twocamera
views, and Mahalanobis distance metric methods canperfect
the changeover by seeing the covariance of features.This
makes these methods much more actual for
personre-identification. Though, such methods may cause
overfittingto the exercised data in high-dimension feature
space,and meanwhile the cameras and their images are focus
to noteworthyerraticism, learning aexclusive metric for
training datadoes not work well for person re-identification.
In difference,our proposed MtMCML method projectsmany
metrics fordifferent camera pairs in the camera grid, accepts
theyare diverse but connected, and learns these metrics in a
multitasklearning outline. MtMCML projects multiple
metricsinversely. Dissimilar metrics are intendedfor different
contender images from the test data, and theyare online
educated in a transfer learning outline, where aexclusive
metric gained from training data is observed as the
model from source field.Indifference to present multi-task
metric learningprocedures , MtMCML is preciselyintended
for person re-identification over camera grids. Theoverall
multi-task metric learning algorithmsare not easily valid to
person re-identification overcamera grids. There is a
necessity fortriplets for every task, which is unreasonable for
prevailingperson re-identification datasets in the projected
multi-tasksetting. There are two motives why there are no
triplet basedrestrictions available for some tasks: the
technique used toproject multiple tasks in this outline and the
characteristicof most prevailing person re-identification
datasets. In ourplannedoutline, each metric is intended for
each cameraduo, and each task is intended for learning each
metric.Most surviving person re-identification datasets, e.g.
GRIDand VIPeR, are caught by numerous cameras. Each
personis caught in only two camera views, and each person
onlyhas two images (one per camera). The number of
pairwiseresemblance constraints in these datasets is not big.
691
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
In suchcircumstances, some pairs of cameras may yield no
optimistic imagepairs for a particular training set. The
nonappearance of a positivepair shows that the triplet is
unobtainable for these tasks.Thus triplet based multi-task
metric learning approaches are notdirectly useful in our
outline. In summary, the chief contribution of this work is
double:
(1) We are the first to plan multiple distance metricsfor
person re-identification in the setting of camera gridsand
articulate person re-identification as a multi-task
distancemetric learning problem with sturdy applicability to
far-reachingcamera networks.
(2) Regarding the characteristics of prevailing
personre-identification datasets, we plan a new multi-task
metric
learningprocedure that exploits image pairs. More precisely,
weoffer a novel Multi-task Maximally Collapsing Metric
Learning model, which has decent theoretical properties: the
unbiased function of MtMCML is similarly convex and has
aLipschitz-continuous gradient, and the ideal solution
ofMtMCML can be gained with interchanging optimization
andfirst-order Nesterovapproaches.
III. METHODOLOGY
A) Algorithm:
1)
2)
3)
4)
5)
6)
7)
8)
Start
Arrange multiple cameras in network.
Image Acquisition from each camera
Image Preprocessing
a) Noise reduction
b) Contrast Adjustment
Moving Object Detection in every frame.
Human detection
Feature Extraction on each detected human
Find
multitask
distance
metrics
for
re-identification between different detected
humans
Image Acquisition Toolbox enables you to acquire images
and video from cameras and frame grabbers directly into
MATLAB® and Simulink®. You can detect hardware
automatically and configure hardware properties. Advanced
workflows let you trigger acquisition while processing
in-the-loop, perform background acquisition, and
synchronize sampling across several multimodal devices.
With support for multiple hardware vendors and industry
standards, you can use imaging devices ranging from
inexpensive Web cameras to high-end scientific and
industrial devices that meet low-light, high-speed, and other
challenging requirements. Image is acquired using the
command imread, which enables the user to read an image
which is in the computer or in the matlab software.
Image Pre-processing:
This method usually increases the global contrast of many
images, especially when the usable data of the image is
represented by close contrast values. Through this
adjustment, the intensities can be better distributed on the
histogram. This allows for areas of lower local contrast to
gain a higher contrast. Histogram equalization accomplishes
this by effectively spreading out the most frequent intensity
values.
Noise Reduction:
Noise reduction is the procedure of eliminating noise from a
an image. All recording devices both analog or digital, have
characters which make them vulnerable to noise. Noise can
be random or white noise with no coherence and instrument.
There are innumerable types of noises such as “salt and
pepper” noise, “Gaussian noise” etc. In Gaussian noise each
pixel in the image will be altered from its unique value by a
small amount
Human Detection:
Multi-task distance metric, mentioning to the combined
training of numerous problems, can frequently lead to better
performance by exploiting the shared info through all the
problems. On the other hand, metric learning, an significant
research topic, is yet often studied in the old-style single task
setting. Directing this problem, in this paper, we suggest a
fresh multi-task metric learning outline. Based on the
supposition that the discriminative info across all the errands
can be reserved in a low-dimensional common subspace, our
proposed framework can be willingly used to cover many
current metric learning tactics for the multi-task scenario.
B) Block Diagram:
Camera Networks:
The multi camera surveillance database 1 was captured
from an existing surveillance network, to enable the
evaluation of person re-identification models in a real life
multi camera surveillance environment. The data base
consists of about 10 people moving through the building
environment recorded by two surveillance cameras. Each
camera captures data at 25 frames per second. The cameras
have been placed to provide maximum coverage of the space
and observation of the entrances to the building The camera
networks and the subset varies from person to person
Image Acquisition:
Feature Extraction:
When the input data to an procedure is too large to be handled
and it is suspected to be disreputably redundant (e.g. the same
measurement in both feet and meters) then the input data will
be changed into a abridgeddepiction set of features (also
named features vector). Changing the input data into the set
of features is called feature extraction. If the features
extracted are sensibly chosen it is anticipated that the features
set will extract the related information from the input data in
order to achieve the anticipated task using this
condenseddepiction instead of the full size input.
692
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
Finding Multi-Task Distance Metric:
The human beings are identified on the basis of their human
parts such as legs, torso and face. The size of these parts are
assumed based on the approximation and depending on that
the parts are sized up.
pairs. Based on the multi-taskmetric learning problem
articulated in the previous Section,transfer metric learning is
subjugated for these strange tasks.We classify these odd tasks
into the target task. It ispresumed that we have learnt m
Mahalanobis distance metrics. The target task TM is
transferred from all the source tasks as
Person Re-Identification:
The person which has past more number of times from one
room to the another room is automatically identified by the
distance metric method. This identified person is identified
on the basis of the distance metric learning. The re-identified
person is labelled accordingly based on certain parameters.
The main goal of this paper is to model person
re-identification over camera network as a multi-task
distance metric learning.
1
𝑇𝑀 =
𝑚
𝑚
𝑀𝑖
1
VI. RESULTS
The output of our project is basically a Graphical User
Interface of Matlab Software. The result is background
image, re-identified image and mean difference image.
C) Person Re-Identification.
In this Unit, we concisely present metric learning for
individual re-identification. The chief goal of this paper is to
model person re-identification over camera grid as
amulti-task distance metric learning. Certain metric learning
approaches can also be carefully chosen as optional attempts
for multitask additions. MCML has three decent properties:
pairwise restraints, its curved and differentiable loss function,
and the constraints on the distance metric forced by correct
kNN classification.Here we are using distance metrics to
re-identify the person which was using multi-task distance
metric and as well as RGB mean. Here we are using body part
separation codes to identify the different lengths of the
human body. Assumption is made depending on the parts of
the body, for eg: face, torso and legs. The pixels of the legs,
torso and face are assumed and then recorded in the software.
The objects are labeled as per their name or alphabets. In
re-identification the objects when inserted again, irrespective
of their order, are labeled correctly as before.
Fig1. Human Detection
IV. SYSTEM OVERVIEW
We now provide the system summary of person
re-identification over camera grids. Here we have used two
cameras to identify the persons which are coming in front of
it. The technique used to identify the persons is multi-task
distance metric learning and by finding the RGB mean.
Body part separation is also a technique which we have been
re-incorporated. In this technique approximation of body
parts is done and depending on that a person is re-identified.
Appropriate labeling is done to avoid confusion and the
person is identified.
V. PROBLEM FORMULATION
We now present the problem formulation of transfer distance
metric learning. Our technique yields each metric for each
pair of camerasaccording to the training set, and no material
regarding thetest set is obligatory through training. In this
way, there occurs theoption that certain pairs of images rise
throughout testing eventhough they are not in the training set.
These pairs of camerasare here called strange pairs, and
errands including these pairs ofcameras are called odd tasks.
In this case, it is significantto select models appropriate for
calculating the distances betweenimage pairs from these odd
Fig 2 . Detection And Labeling
693
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
P=900
L1 Norm
PRDC
Rank SVM
MtMCML
r=1
4.40
9.68
10.24
14.08
Table 1.
r=5
34.64
24.56
22.00
11.68
r = 10
16.24
32.96
33.28
45.84
topr ranks in contradiction of the p gallery images. A rank 1
matchingrate is therefore the precise matching/recognition
rate. To brandcomplete contrasts between MtMCML and
currentmodels, we set the same p as in , i.e. p = 900
VII . CONCLUSION
Fig 3.Mean Ratio Difference Image.
Fig 4. Inclusion Of ANew Person Is Labeled With ANew
Variable.
In this paper, we articulate person re-identification in a
camera grid as a multi-task distance metric learningproblem.
In
specific,
we
suggest
a
fresh
Multi-task
MaximallyCollapsing Metric Learning (MtMCML) model
forperson re-identification in a camera grid. In order tohandle
with complex conditions in a distinctive camera networksuch
as differences in lighting, camera viewing angles,and
background
clutter,
MtMCML
projects
multiple
diverseMahalanobis distance metrics for different camera
pairs, andaddresses the fact that these Mahalanobis distance
metrics arediverse but linked. These Mahalanobis distance
metrics areequally learned by adding graph-based
regularization to easeover-fitting. Our experiments
authenticate that the performanceof MtMCML is
considerably better than other current state-ofthe-art person
re-identification approaches. It is worth directingout that
although
our
planned
MtMCML
model
is
articulatedprecisely for person re-identification over a
cameragrid. We will consider how to update the learned
numerous metrics in the forthcoming years.
ACKNOWLEDGEMENT
The authors wish to thank Prof. P Badadapure for his
innovative input in improvising the subject and paper. This
work was supported in part by a grant from JSPM’s ICOER,
Pune, Maharashtra, India
REFERENCES
[1]S. A. Globerson and S. Roweis, “Metric learning by collapsing classes,”
inProc. Adv. Neural Inform. Process.Syst., 2005.
[2] C. C. Loy, T. Xiang, and S. Gong, “Multi-camera activity correlation
analysis,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),Jun.
2009, pp. 1988–1995.
[3] Y. Wang, S. Velipasalar, and M. Casares, “Cooperative object tracking
and composite event detection with wireless embedded smart
cameras,”IEEE Trans. Image Process., vol. 19, no. 10, pp. 2614–2633, Oct.
2010.
Fig 5. Mean Ratio Difference Image Of The Above Image.
The above results shows that the person is first detected and
then the labeling is shown. It also shows their mean ratio
difference image.
[4] C. Ding, B. Song, A. Morye, J. A. Farrell, and A. K. Roy-Chowdhury,
“Collaborative sensing in a distributed PTZ camera network,” IEEETrans.
Image Process., vol. 21, no. 7, pp. 3282–3295, Jul. 2012.
[5] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with
an ensemble of localized features,” in Proc. Eur. Conf. Comput. Vis.,
2008, pp. 262–275.
[6] M. Farenzena, L. Bazzani, A. Perina, M. Cristani, and V. Murino,
“Person re-identification by symmetry-driven accumulation of local
features,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
Jun. 2010, pp. 2360–2367.
[7] W.-S. Zheng, S. Gong, and T. Xiang, “Reidentification by relative
distance comparison,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35,
694
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 4 Issue 3, March 2015
no. 3, pp. 653–668, Mar. 2013.
[8] B. Prosser, W.-S. Zheng, S. Gong, and T. Xiang, “Person
re-identificationby support vector ranking,” in Proc. Brit. Mach. Vis. Conf.,
2010.
695
ISSN: 2278 – 1323
All Rights Reserved © 2015 IJARCET