Abstract
People detection is an important problem in video surveillance applications. In real scenarios, factors such as severe occlusion and perspective distortion complicate the task. This article proposes an efficient deep learning framework for detecting people in overhead images. The proposed method uses a multi-scale Yolov4-Tiny algorithm for person detection; its multi-scale design enables it to detect people who appear small in an image. Because the choice of non-maximum suppression (NMS) algorithm significantly affects detection performance, several NMS algorithms were compared to determine the one best suited to the multi-scale Yolov4-Tiny algorithm. The method was evaluated in experiments on two datasets that include challenging conditions such as varying lighting, distance, occlusion, and people wearing hats. The proposed method achieved a mean average precision (mAP) of 94.85%. Besides these promising results, its low computational cost makes it suitable for real-time applications.
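To make the role of NMS concrete, the following is a minimal sketch of standard greedy NMS, the usual baseline in such comparisons. It is an illustration only, not the chapter's implementation: the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that survive, highest score first.

    A box is kept only if its overlap with every already-kept
    (higher-scoring) box does not exceed iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes on one person, one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # prints [0, 2]
```

In crowded overhead scenes, a hard threshold like this can suppress a true detection of a neighbouring person whose box overlaps heavily with a higher-scoring one, which is one reason the choice of NMS variant matters for detection performance.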
Ethics declarations
The authors declare that they have no conflicts of interest.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Peker, M., İnci, B., Musaoğlu, E., Çobanoğlu, H., Kocakır, N., Karademir, Ö. (2022). An Efficient Deep Learning Framework for People Detection in Overhead Images. In: Fernandes, S.L., Sharma, T.K. (eds) Artificial Intelligence in Industrial Applications. Learning and Analytics in Intelligent Systems, vol 25. Springer, Cham. https://doi.org/10.1007/978-3-030-85383-9_1
DOI: https://doi.org/10.1007/978-3-030-85383-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85382-2
Online ISBN: 978-3-030-85383-9
eBook Packages: Intelligent Technologies and Robotics (R0)