RGB+D and deep learning-based real-time detection of suspicious event in Bank-ATMs

Khaire, Pushpajit A.; Kumar, Praveen

doi:10.1007/s11554-021-01155-2

Special Issue Paper
Published: 23 July 2021

Journal of Real-Time Image Processing, volume 18, pages 1789–1801 (2021)

Abstract

Real-time detection of human activities has become very important for the surveillance and security of bank Automated Teller Machines (ATMs) and public offices because of the day-to-day increase in criminal activities. Such constrained environments are currently monitored through monocular CCTV cameras, which capture only RGB video. An RGB+D sensor provides depth data of the scene in addition to RGB data. To address the problem of online detection of abnormal activities in bank ATMs, we propose a supervised deep learning framework based on multi-stream CNNs and an RGB+D sensor. From the online video stream of RGB+D data, motion templates are created from RGB and depth video segments, and CNNs are trained on these templates to detect a suspicious event in an ongoing activity. Moreover, because no dataset was available for analyzing human activities in ATMs, we also contribute a novel RGB+D dataset in this paper. The proposed deep learning-based framework is evaluated using qualitative and quantitative statistical measures and detects suspicious events with a precision of 0.932 and an accuracy of 94.2%. Detailed statistical analysis of the results shows that the proposed framework can detect a suspicious event online and in real time, before the abnormal activity is completed.
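The abstract does not say which kind of motion template is built from each video segment. As a rough illustration only, the sketch below assumes a motion history image (MHI), one common temporal template, computed by frame differencing. The function name, the `diff_threshold` value, and the use of plain Python lists (chosen to keep the sketch dependency-free) are assumptions for illustration and are not the authors' implementation.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities (0-255)


def motion_history_image(frames: List[Frame], diff_threshold: int = 30) -> List[List[float]]:
    """Collapse a video segment into a single motion history image.

    Pixels that moved most recently are brightest; older motion fades linearly.
    `diff_threshold` is a hypothetical value, not taken from the paper.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    height, width = len(frames[0]), len(frames[0][0])
    tau = len(frames) - 1  # number of frame-to-frame transitions in the segment
    mhi = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) >= diff_threshold:
                    mhi[y][x] = float(tau)  # motion at this step: reset to maximum
                else:
                    mhi[y][x] = max(0.0, mhi[y][x] - 1.0)  # no motion: decay by one step
    # Scale to [0, 1] so the template can be treated as a single-channel image.
    return [[value / tau for value in row] for row in mhi]


# Three 1x3 frames: the left pixel changes first, the middle pixel changes last.
segment = [[[0, 0, 0]], [[255, 0, 0]], [[255, 255, 0]]]
template = motion_history_image(segment)  # [[0.5, 1.0, 0.0]]
```

The same routine could be applied to a depth segment by passing depth frames instead of grayscale RGB frames, giving one template per modality for a multi-stream classifier.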

Acknowledgements

This research was supported by the Science and Engineering Research Board (SERB) under Project No. ECR/2016/000387, in cooperation with the Department of Science and Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. DST-SERB and the Government of India are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
Pushpajit A. Khaire & Praveen Kumar

Corresponding author

Correspondence to Pushpajit A. Khaire.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 1084 KB)

About this article

Cite this article

Khaire, P.A., Kumar, P. RGB+D and deep learning-based real-time detection of suspicious event in Bank-ATMs. J Real-Time Image Proc 18, 1789–1801 (2021). https://doi.org/10.1007/s11554-021-01155-2

Received: 29 January 2021
Accepted: 14 July 2021
Published: 23 July 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11554-021-01155-2
