An intelligent deep learning based capsule network model for human detection in indoor surveillance videos

  • Application of soft computing
Soft Computing

Abstract

Indoor surveillance has become an active research topic in both academia and industry. Human detection is a key component of a surveillance system because it underpins person detection, human activity identification, and scene classification. Indoor spaces often have poor lighting, variable illumination, shadows, and complex backgrounds, which makes human detection a difficult task. Computer vision and deep learning (DL) models are now commonly employed for human detection. This article presents a new intelligent deep learning model for human detection in indoor surveillance videos (IDL-HDIS). Data augmentation is a well-established way to increase the size of a dataset, which is essential for improving the prediction accuracy of a model; this work therefore applies augmentation in the form of rotation, translation, and flipping. The IDL-HDIS model uses the Faster Region Convolutional Neural Network (Faster R-CNN) model for human detection. Faster R-CNN comprises Fast R-CNN and a Region Proposal Network (RPN). The RPN uses a Capsule Network (CapsNet) model as the shared convolutional neural network (CNN), which acts as a feature extractor and generates the feature map. In addition, dropout is employed to avoid overfitting in the CapsNet architecture. The IDL-HDIS model is validated through a comprehensive simulation analysis under different aspects, and the results reported in the paper support this validation.
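The three augmentation operations named in the abstract (rotation, translation, and flipping) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no angles, offsets, or annotation handling, so the 90-degree rotation, the zero-fill translation, the bounding-box updates, and all function names here are assumptions made for illustration only.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale frame stored as rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max edges exclusive


def hflip(img: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the frame left-to-right and mirror each box with it."""
    w = len(img[0])
    flipped = [row[::-1] for row in img]
    new_boxes = [(w - x2, y1, w - x1, y2) for (x1, y1, x2, y2) in boxes]
    return flipped, new_boxes


def translate(img: Image, boxes: List[Box], dx: int, dy: int,
              fill: int = 0) -> Tuple[Image, List[Box]]:
    """Shift the frame by (dx, dy); uncovered pixels take the fill value."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    new_boxes = []
    for (x1, y1, x2, y2) in boxes:
        # Clip each shifted box to the frame and drop it if nothing remains.
        cx1, cy1 = max(0, x1 + dx), max(0, y1 + dy)
        cx2, cy2 = min(w, x2 + dx), min(h, y2 + dy)
        if cx1 < cx2 and cy1 < cy2:
            new_boxes.append((cx1, cy1, cx2, cy2))
    return out, new_boxes


def rotate90(img: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Rotate the frame 90 degrees clockwise (a simplification of rotation)."""
    h = len(img)
    rotated = [list(col) for col in zip(*img[::-1])]
    # A pixel at (x, y) moves to (h - 1 - y, x) in the rotated frame.
    new_boxes = [(h - y2, x1, h - y1, x2) for (x1, y1, x2, y2) in boxes]
    return rotated, new_boxes


def augment(img: Image, boxes: List[Box]) -> List[Tuple[Image, List[Box]]]:
    """Return the original sample plus one variant per operation."""
    return [
        (img, boxes),
        hflip(img, boxes),
        translate(img, boxes, dx=1, dy=1),
        rotate90(img, boxes),
    ]
```

Calling `augment(frame, boxes)` on one annotated frame yields four training samples. Transforming the boxes together with the pixels matters for a detector such as Faster R-CNN, since an augmented frame with stale annotations would teach the model wrong locations.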

Data availability

Enquiries about data availability should be directed to the authors.

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to S. Ushasukhanya.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ushasukhanya, S., Malleswari, T.Y.J.N., Karthikeyan, M. et al. An intelligent deep learning based capsule network model for human detection in indoor surveillance videos. Soft Comput 28, 737–747 (2024). https://doi.org/10.1007/s00500-023-09443-8

  • DOI: https://doi.org/10.1007/s00500-023-09443-8
