Abstract
In this paper we present a new 3D descriptor for human classification and a human detection method based on it. The descriptor classifies an object represented by a point cloud as human or non-human. It is derived from the well-known Histogram of Oriented Gradients (HOG) by using surface normals instead of gradients. The object point cloud is subdivided into blocks, and the distribution of surface normal orientations within the blocks is expressed as a histogram, so that the descriptor models how normal orientations are spatially distributed over the different parts of the object. In addition, we have set up a multi-Kinect acquisition system that provides Complete Point Clouds (CPCs), i.e. point clouds covering a 360° view. Such CPCs enable suitable processing, particularly in the case of occlusions, and they allow the frontal orientation of a human to be determined. Based on the proposed 3D descriptor, we have developed a human detection method that is applied to CPCs. First, we evaluated the 3D descriptor on a set of CPC candidates using a Support Vector Machine (SVM) classifier, trained on an original CPC database that we built. The results are very promising: the descriptor discriminates human from non-human candidates and provides the frontal direction of humans with high precision. We also demonstrated that using CPCs significantly improves the classification results compared with Single Point Clouds (i.e. point clouds acquired with only one Kinect). Second, we compared our detection method with two others, namely the HOG detector on RGB images and a 3D HOG-based detection method applied to RGB-depth data. The results obtained in different situations show that the proposed human detection method outperforms both of these methods.
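The block-wise histogram idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the axis-aligned grid of blocks, the azimuth/elevation parameterisation of the normals, the bin counts, and the per-block L1 normalisation are all assumptions chosen for illustration, and the surface normals are assumed to be already estimated.

```python
import numpy as np

def normal_histogram_descriptor(points, normals, grid=(2, 4, 2),
                                n_azimuth=8, n_elevation=4):
    """Concatenate per-block histograms of surface-normal orientation.

    points  : (N, 3) array of xyz coordinates of one candidate object
    normals : (N, 3) array of surface normals (need not be unit length)
    grid, n_azimuth, n_elevation : hypothetical parameters, not from the paper
    """
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)
    grid_arr = np.asarray(grid)

    # Assign each point to a block of the axis-aligned bounding box.
    lo = points.min(axis=0)
    extent = np.maximum(points.max(axis=0) - lo, 1e-9)
    cell = np.minimum(((points - lo) / extent * grid_arr).astype(int),
                      grid_arr - 1)
    block_id = np.ravel_multi_index(tuple(cell.T), tuple(grid))

    # Express each normal by its azimuth and elevation angles.
    norms = np.linalg.norm(normals, axis=1, keepdims=True)
    unit = normals / np.maximum(norms, 1e-12)
    azimuth = np.arctan2(unit[:, 1], unit[:, 0])            # [-pi, pi]
    elevation = np.arcsin(np.clip(unit[:, 2], -1.0, 1.0))   # [-pi/2, pi/2]

    # Quantise both angles into bins, clamping the upper boundary.
    a_bin = np.minimum(((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int),
                       n_azimuth - 1)
    e_bin = np.minimum(((elevation + np.pi / 2) / np.pi * n_elevation).astype(int),
                       n_elevation - 1)

    # Accumulate one 2D orientation histogram per block.
    n_blocks = int(grid_arr.prod())
    bins_per_block = n_azimuth * n_elevation
    flat = block_id * bins_per_block + a_bin * n_elevation + e_bin
    hist = np.bincount(flat, minlength=n_blocks * bins_per_block).astype(float)
    hist = hist.reshape(n_blocks, bins_per_block)

    # L1-normalise each non-empty block so point density does not dominate.
    sums = hist.sum(axis=1, keepdims=True)
    hist = np.divide(hist, sums, out=np.zeros_like(hist), where=sums > 0)
    return hist.ravel()
```

With the default (assumed) parameters the descriptor has 2 × 4 × 2 blocks of 8 × 4 bins each, i.e. 512 values; a vector of this kind is what a classifier such as an SVM would take as input.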
About this article
Cite this article
Essmaeel, K., Migniot, C., Dipanda, A. et al. A new 3D descriptor for human classification: application for human detection in a multi-kinect system. Multimed Tools Appl 78, 22479–22508 (2019). https://doi.org/10.1007/s11042-019-7568-6
DOI: https://doi.org/10.1007/s11042-019-7568-6