Benchmarking Two Algorithms for People Detection from Top-View Depth Cameras

  • Vincenzo Carletti
  • Luca Del Pizzo
  • Gennaro Percannella
  • Mario Vento
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10484)


Automatic people detection from video is an important task in many computer vision applications, whether for security and safety or for business intelligence. To achieve high detection accuracy, many authors propose mounting a depth sensor in a top-view position, which mitigates the effects of occlusions and illumination conditions on performance. Unfortunately, most approaches presented so far in the scientific literature have been tested on very small datasets that do not account for the typical situations arising in real scenarios; consequently, they do not allow interested readers to determine which method should be used in the specific scenario at hand. In this paper we benchmark two approaches available in the literature for people detection from a zenithally mounted depth camera: the first is an unsupervised method that finds the heads of persons, defined as local minimum regions in the depth map, while the second combines the histogram of oriented gradients (HOG) descriptor with a support vector machine (SVM) classifier. The benchmarking is performed on a public dataset of images captured under two different lighting conditions and with a varying number of persons, which allows the performance of the considered approaches to be assessed under different real-world scenarios. A detailed analysis of the two methods is reported in the experimental section of the paper, allowing the reader to understand the pros and cons of each approach on the considered scenes.
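To make the idea of "local minimum regions in the depth map" concrete, the following is a minimal sketch in pure Python. It is not the benchmarked algorithm, whose details the abstract does not give; it is a plain regional-minimum search. It assumes that depth values measure distance from the camera, so that heads are the closest surfaces in a top view. The function name `regional_minima` and the `background` threshold are hypothetical names introduced only for this illustration.

```python
from collections import deque


def regional_minima(depth, background=None):
    """Return the local minimum regions of a depth map.

    `depth` is a list of equally long rows of numbers (assumed: distance
    from the camera). A region is a 4-connected plateau of equal depth
    with no neighbouring pixel closer to the camera. If `background`
    (hypothetical parameter) is given, regions at or beyond that depth
    are discarded, which removes the floor.
    """
    rows, cols = len(depth), len(depth[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = depth[r][c]
            plateau, is_minimum = [], True
            queue = deque([(r, c)])
            seen[r][c] = True
            # Flood-fill the plateau of equal depth that contains (r, c).
            while queue:
                y, x = queue.popleft()
                plateau.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    neighbour = depth[ny][nx]
                    if neighbour < value:
                        # Something adjacent is closer to the camera.
                        is_minimum = False
                    elif neighbour == value and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if is_minimum and (background is None or value < background):
                regions.append(plateau)
    return regions


if __name__ == "__main__":
    # Toy depth map: floor at 300, two head-like bumps closer to the camera.
    toy = [
        [300, 300, 300, 300, 300, 300, 300],
        [300, 180, 170, 300, 300, 300, 300],
        [300, 170, 160, 300, 300, 150, 300],
        [300, 300, 300, 300, 300, 150, 300],
        [300, 300, 300, 300, 300, 300, 300],
    ]
    for region in regional_minima(toy, background=300):
        print(region)
```

On real sensor data a search like this would respond to every noisy dip, so some smoothing or a minimum region size would likely be needed; those choices are outside what the abstract specifies.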



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Vincenzo Carletti (1)
  • Luca Del Pizzo (1)
  • Gennaro Percannella (1)
  • Mario Vento (1)
  1. Department of Computer and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano (SA), Italy
