A new 3D descriptor for human classification: application for human detection in a multi-kinect system

Multimedia Tools and Applications

Abstract

In this paper we present a new 3D descriptor for human classification and a human detection method based on this descriptor. The proposed 3D descriptor classifies an object represented by a point cloud as human or non-human. It is derived from the well-known Histogram of Oriented Gradients (HOG) by employing surface normals instead of gradients. The object point cloud is first subdivided into blocks. These blocks model the spatial distribution of surface normal orientations across the different parts of the object, and this distribution is expressed as a histogram. In addition, we have set up a multi-Kinect acquisition system that provides Complete Point Clouds (CPCs), i.e. 360° views. Such CPCs enable suitable processing, particularly in the case of occlusions. Moreover, they allow the frontal orientation of a human to be determined. Based on the proposed 3D descriptor, we have developed a human detection method that is applied to CPCs. First, we evaluated the 3D descriptor on a set of CPC candidates using a Support Vector Machine (SVM) classifier. The learning process was conducted with an original CPC database that we built. The results are very promising: the descriptor discriminates human from non-human candidates and provides the frontal direction of humans with high precision. We also demonstrated that using CPCs significantly improves the classification results compared with Single Point Clouds (i.e. point clouds acquired with only one Kinect). Second, we compared our detection method with two others, namely the HOG detector on RGB images and a 3D HOG-based detection method applied to RGB-depth data. The results obtained in different situations show that the proposed human detection method outperforms the other two.
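
The block-and-histogram idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the block layout, the number of orientation bins, the angle parametrisation, or the normalisation, so the grid size, the azimuth/elevation binning, and the per-block L2 normalisation below are all assumptions chosen for illustration. The function name `normal_histogram_descriptor` is hypothetical, and surface normals are assumed to be precomputed.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def normal_histogram_descriptor(
    points: Sequence[Vec3],
    normals: Sequence[Vec3],
    grid: Tuple[int, int, int] = (2, 4, 2),  # assumed block layout (x, y, z)
    az_bins: int = 8,                         # assumed number of azimuth bins
    el_bins: int = 4,                         # assumed number of elevation bins
) -> List[float]:
    """Concatenate per-block histograms of surface-normal orientation."""
    if not points or len(points) != len(normals):
        raise ValueError("points and normals must be non-empty and aligned")

    # Axis-aligned bounding box of the candidate object.
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]

    n_blocks = grid[0] * grid[1] * grid[2]
    hist = [[0.0] * (az_bins * el_bins) for _ in range(n_blocks)]

    for p, n in zip(points, normals):
        length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
        if length == 0:
            continue  # skip degenerate normals

        # Block index: where the point falls inside the bounding box.
        idx = []
        for a in range(3):
            extent = hi[a] - lo[a]
            t = (p[a] - lo[a]) / extent if extent > 0 else 0.0
            idx.append(min(int(t * grid[a]), grid[a] - 1))
        block = (idx[0] * grid[1] + idx[1]) * grid[2] + idx[2]

        # Orientation bin: azimuth in [-pi, pi], elevation in [-pi/2, pi/2].
        az = math.atan2(n[1], n[0])
        el = math.asin(max(-1.0, min(1.0, n[2] / length)))
        a_bin = min(int((az + math.pi) / (2 * math.pi) * az_bins), az_bins - 1)
        e_bin = min(int((el + math.pi / 2) / math.pi * el_bins), el_bins - 1)
        hist[block][a_bin * el_bins + e_bin] += 1.0

    # L2-normalise each block so dense and sparse blocks are comparable.
    descriptor: List[float] = []
    for h in hist:
        norm = math.sqrt(sum(v * v for v in h))
        descriptor.extend(v / norm if norm > 0 else 0.0 for v in h)
    return descriptor
```

With the assumed defaults the result is a fixed-length vector of 2 × 4 × 2 blocks times 8 × 4 orientation bins, i.e. 512 values, regardless of how many points the cloud contains. A fixed length is what allows such a vector to be passed to a classifier such as the SVM mentioned in the abstract.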

Author information

Corresponding author

Correspondence to Cyrille Migniot.

Cite this article

Essmaeel, K., Migniot, C., Dipanda, A. et al. A new 3D descriptor for human classification: application for human detection in a multi-kinect system. Multimed Tools Appl 78, 22479–22508 (2019). https://doi.org/10.1007/s11042-019-7568-6

  • DOI: https://doi.org/10.1007/s11042-019-7568-6
