Recent Researches in Electrical Engineering Human Detection Based on Covariance Features and SVM YAHIA SAID, MOHAMED ATRI Laboratory of Electronics and Microelectronics (EμE) -LAB 99 ES 30 Faculty of Sciences of Monastir University of Monastir, 5000 TUNISIA [email protected] Abstract: - Detecting pedestrians is a challenging problem owing to the motion of the subjects, the camera and the background and to variations in pose, appearance, clothing, illumination and background clutter. The Region Covariance Matrix (RCM) descriptors show experimentally significantly out-performs existing feature sets for pedestrian detection. In this paper, we present an efficient features extraction scheme: the Integral CovReg, inspired from Region Covariance Matrix (RCM) descriptors, combined with SVM classifier for pedestrian detection. Key-Words: - Pedestrian detection; Image descriptor; Integral CovReg; SVM. normalization in overlapping descriptor blocks are all important for good results. In [2], Tuzel et al. introduce covariance features as fast region descriptors for performing pedestrian detection task and it has achieved promising results. Human appearance is modeled by a dense set of features variances extracted inside a detection window, their correlations with each other and their spatial layout. It can fuse different types of features (image intensities, derivatives, etc.), while producing a compact representation. The performance of the covariance descriptor is found to be superior to other methods, as rotation and illumination changes are absorbed by the covariance matrix [3],[4]. In this paper, we propose a single frame human detection system which follows the path of Dalal et al.[1] and Tuzel et al.[2] This detection system is based on Integral Covariance Region descriptor combined with Support Vector Machines [5] for the recognition stage. SVM is a classification method, which has proved to be very efficient in such case of high dimensional data. It has been developed to detect a human centered in a 128 × 64 single image. The implementation of our system was based on MS VC++ and OpenCv [6] which is an Open source computer vision library, originally developed by Intel, specialized in the real time image processing. This paper is organized as follows: section 2 describes related works in human detection systems. In section 3, we detail the single frame detector and we give more precision about the Integral RegCov 1 Introduction Human detection is one of the most challenging problems in computer vision due to large variations caused by body articulation, illumination conditions, occlusions and appearance. At the same time detecting and tracking people has a wide range of applications including automotive safety, video surveillance, assisted living and robotics. Consequently, there is an increasing interest on this research field in the last years and a vast literature. The pattern recognition field applied to the human detection is faced with nature even of the human. Various variabilities enter in account: variability on the scale, of posture and appearance. Indeed, the human can be positioned with various sites in the image, under various aspects and point of view. The first need is a robust feature set that allows the human form to be discriminated cleanly, even in cluttered backgrounds under difficult illumination. Dalal and al. [1] presented a human detection algorithm with excellent detection results. Their algorithm uses a dense grid of Histograms of Oriented Gradients (HOG) to represent a detection window. The Histogram of Oriented Gradient (HOG) [1] descriptors provided better performance than other existing feature sets. Each stage of the HOG computation has much influence on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast ISBN: 978-960-474-392-6 311 Recent Researches in Electrical Engineering descriptor and its parameters. Results are discussed in paragraph 4 and, finally, section 5 ends the paper with some final remarks. results show that their approach method was effective. Region covariance matrix (RCM) [2] is among the state-of-the-art features for human detection. RCM is a matrix of covariance of some image statistics computed inside the region of an image. It shows good discriminative power and elegant robustness. For example, Wu et al. [3] and Hussein et al. [4] show that the covariance matrix has the best discriminative power for human detection task compared with other descriptors like HOG [1] and the Edgelet features [8]. 2 Related Works Many different approaches for human detection have been proposed in the literature. Lowe [7] uses a description with orientation of the gradient around interest points. The research of the interest points is carried out of multi-resolution, which makes it possible to define a variable size for the vicinity interest points. The size depends on the scale to which the interest point is detected. The image is then characterized by a whole of local histograms. By regarding these data as "bags of words", we lose then the data space component, which reduces the method relevance. Wu et al. [8] introduced edgelet features, which are a new type of silhouette oriented features. Part detectors, based on these features, are learned by a boosting method. Responses of part detectors are combined to form a joint likelihood model that includes cases of multiple, possibly inter-occluded humns. The most successful and popular descriptor is histograms of oriented gradients (HOG) proposed by Dalal et al. [1]. The method is founded on a description of the image using histograms of gradient orientation. These histograms are calculated on the whole image, but are defined on a local area. The image description is a vector containing all the calculated histograms. This vector brings, on the one hand, local information for each area and makes it possible, on the other hand, to preserve a space fitting, the vectors being calculated systematically by the same process. A complete system for pedestrian detection based on HOG descriptors was introduced by Suard and al. [9]. Their system deals with stereo infrared images. The system first locates positions of interest within the larger view field where humans could possibly be located. Then normal Support Vector Machine classifiers operate on the HOG descriptors taken from these smaller positions of interest to formulate a decision regarding the presence of a pedestrian. This approach gives good results for window classification, and a preliminary test applied on a video sequence proves that this approach is very promising. Zhang et al. [10] optimized the HOG features to achieve an accurate human detection system. They don’t normalize the input detection windows but resize the cell and block by same ratio. Simulation ISBN: 978-960-474-392-6 3 Overview of the Proposed Method 3.1 Integral CovReg Features The proposed detection algorithm operates under a sliding window philosophy, where a window is a rectangular region of size 128×64 pixels that scans the entire image with a horizontal and vertical stride of 8 pixels. In a similar way to the HOG descriptor [1], each detection window is divided into Cells of size 8×8 pixels and each group of 2×2 cells is integrated into a Block in a sliding fashion. For every Block, a covariance matrix of image features is calculated. We give, below, followed steps to calculate the Integral CovReg descriptor: Features Extraction: On the detection window we perform features extraction, sequentially for each pixel, in a row-wise fashion. For each pixel we extract: (1) where Gx, Gxx, ... are first- and second-order derivatives of the image intensity, and the last term is the edge orientation. With this definition, the detection window is mapped to a d = 6-dimensional feature image. The covariance descriptor of a Block is a 6×6 matrix. Integral Representation: Covariance matrices can be computed in a very fast way by adopting integral representations under the form of tensors [2]. We have to compute the firstorder integral tensor P which is a d-dimensional vector encoding the sum of each feature of all pixels in the Block: 312 Recent Researches in Electrical Engineering The basic kernels are: • Linear: (2) and then Q, the d×d second-order integral tensor defined by (6) • Polynomial: (3) (7) Covariances Computation: The covariance matrix of Block B is • Radial Basis Function (RBF): (4) (8) where S = 16×16 is the number of pixels inside the Block B, PB is the d-dimensional vector of tensors, and QB is the d×d dimensional matrix of the secondorder tensors. The covariance descriptor of a Block is a symmetric positive definite 6×6 matrix and due to symmetry only the upper triangular or lower triangular part is stored, which has only 21 different values. An image of 128×64 pixels contains 7×15 blocks with multiple overlapping. We assemble the vectorized covariance matrices obtained for all blocks in a single 1-D vector of 2205 components. This is our final Integral RegCov descriptor. • Sigmoid: (9) In our experiments, we use Linear kernel to classify the feature data of train set. 4 Experiments and Results We evaluated our system with two different static human date sets: • The MIT pedestrian database [11]: contains 709 standing human images taken in city streets. The set only contains images featuring the front or back of human figures and contains little variety in human pose. 3.2 SVM Classifier Vladimir Vapnik proposed the Support Vector Machine (SVM) in 1995 [5]. Since this time, the SVM have been largely used for the pattern recognition, the regression and the density estimation. Given a training data set ( , where are the training examples Integral RegCov feature vector and the class label, the SVM require the solution of the following optimization problem : “Fig. 1,” shows some sample images from the MIT pedestrian data set. • The INRIA database [12]: proposes examples of human in varied postures, with different appearances and under very diverse scenes and luminosity conditions. “Fig. 2,” shows some sample images from the INRIA static person data set. Subject to (5) We selected 2416 images from INRIA data set as positive training examples, and 4120 images as negative training examples. For the test, we use 1178 images as positive examples, and 452 images as negatives examples from INRIA data set and 709 images from MIT data set. The method consists in mapping in a high dimensional space owing to the function . Then SVM looks for an optimal decision function of the . form The kernel function is . ISBN: 978-960-474-392-6 313 Recent Researches in Electrical Engineering Integral RegCov+SVM are performing better than the Integral HOG+SVM. “Fig. 4,” shows sample detections of the proposed Integral RegCov detector. TABLE I. TEST RESULT TABLE MIT data set Model INRIA positive test samples INRIA negative test samples 1.1), (1.2),…, (2.1), (2.2),… depending on your 100% 85.9% 99.1% NormalSections. HOG [1] various Optimized HOG [10] Fig. 1. Some sample images from the MIT pedestrian data set Integral HOG [13] Our method (709/709) (1012/1178) (448/452) 100% 99.07% 98.89% (709/709) (1167/1178) (447/452) 100% 94.48% 97.56% (709/709) (1113/1178) (441/452) 100% 98.38% 98.45% (709/709) (1159/1178) (445/452) Fig. 2. Some sample images from the INRIA static person data set This system is executed in Visual Studio 2010 with OpenCV library and synchronized on a 2.0GHz Dual-core Laptop. The test result on test set, compared to [1], [10] and [13], is shown in table 1. From the table 1, we can say that our results are comparable in terms of accuracy with Normal HOG [1] and Optimized HOG [10] features, and outperforms the Integral HOG [13] features. To quantify the performance of the detectors we plot the Receiver Operating Characteristic (ROC) curve. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. We tested the Integral HOG and Integral RegCov detectors with the INRIA person database. The true positive rate (Sensitivity) is plotted in function of the false positive rate (100-Specificity) for different cut-off points of a parameter. The ROC curves (Fig. 3) show the performances achieved by the two methods. The features with ISBN: 978-960-474-392-6 Fig. 3 ROC curve of various detectors 4 Conclusion In this paper, we present a human detection system using Integral Region Covariance features combined with SVM classifier. To evaluate the proposed method, we have used two different static human date sets: MIT and INRIA pedestrian databases. Experimental results have 314 Recent Researches in Electrical Engineering demonstrated the effectiveness and efficiency of the proposed algorithm. Future works include the use the proposed scheme for detecting other objects (cars, animals...) and/or tracking of the detected objects. [6] http://sourceforge.net/projets/opencvlibrary/ [7] David Lowe. “Distinctive image features from scale invariant keypoints”. International Journal of ComputerVision, 60(2) :91–110, 2004. [8] B. Wu and R. Nevatia. “Detection of multiple, partially occluded humans in a single image by bayesian combination of edgelet part detectors”. IEEE International Conference on Computer Vision, pages 90-97, 2005. [9] F. Suard, A. Rakotomamonjy, and A. Bensrhair. “Pedestrian Detection using Infrared images and Histograms of Oriented Gradients”.In Intelligent Vehicles Symposium 2006, June 13-15, 2006, Tokyo, Japan [10] G.Zhang, F.Gao, C.Liu, W.Liu, H.Yuan. “A Pedestrian Detection Method Based on SVM Classifier and Optimized Histograms of Oriented Gradients Feature”. In 2010 Sixth International Conference on Natural Computation (ICNC 2010) [11] http://cbcl.mit.edu/cbcl/software datasets/PedestrianData.html [12] http://pascal.inrialpes.fr/data/human/ [13] Yahia Said, Mohamed Atri and Rached Tourki. “Human Detection Based on Integral Histograms of Oriented Gradients and SVM”. International IEEE Conference (CCCA’11), Hammamet, Tunisia, March 0305, 2011. References: [1] N. Dalal and B. Triggs. “Histograms of oriented gradients for human detection”. IEEE Conference on Computer Vision and Pattern Recognition, pages 886-893, 2005 [2] Oncel Tuzel, Fatih Porikli, Peter Meer, Pedestrian detection via classification on riemannian manifolds, IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (10) (2008) 1713–1727 [3] B. Wu and R. Nevatia. Optimizing discrimination efficiency tradeoff in integrating heterogeneous local features for object detection. In CVPR, 2008. [4] Mohamed Hussein, Fatih Porikli and Larry Davis.“A Comprehensive Evaluation Framework and a Comparative Study for Human Detectors”. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 10, NO. 3, SEPTEMBER 2009. [5] V.N.Vapnik. "The Nature of Statistical Learning Theory". Springer-Verlag, 1995. Fig. 4. Examples of detection results using Integral RegCov. Detected humans are enclosed in rectangles. ISBN: 978-960-474-392-6 315
© Copyright 2026 Paperzz