A Deep Autoencoder-Based Approach for Suspicious Action Recognition in Surveillance Videos

Ahmed, Waqas; Yousaf, Muhammad Haroon

doi:10.1007/s13369-023-08038-7

Research Article-Computer Engineering and Computer Science
Published: 08 July 2023

Volume 49, pages 3517–3532, (2024)

Arabian Journal for Science and Engineering

Abstract

In the recent era of technological advancements, surveillance cameras are installed in crowded areas to ensure public protection. In the video surveillance context, suspicious actions account for only a very small part of the surveillance stream. Manual monitoring for suspicious actions is therefore exhausting, and monitoring fatigue affects reliability and response speed during emergencies, which makes automatic suspicious action detection important. We first address the problem of detecting suspicious activities in surveillance videos with our proposed CNN-based autoencoder. Features are extracted using a three-dimensional convolutional neural network (C3D) and fed to our proposed autoencoder framework, which localizes suspicious activity based on high reconstruction loss. For normal video clips we observe low reconstruction loss, and the converse for video clips containing suspicious actions. Second, we extract these suspicious clips from the long surveillance videos and use them to classify various suspicious actions with the help of our proposed generative adversarial network (GAN). We evaluate our work on benchmark datasets, namely UT interaction, hybrid crime action (HCA), and UCF crime. The results show the effectiveness of our work: the achieved accuracies are 97.5%, 89.6%, and 47.34% on the UT interaction, HCA, and UCF crime datasets, respectively.
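The reconstruction-loss idea in the abstract can be illustrated with a minimal sketch. This is not the authors' CNN-based autoencoder or their C3D pipeline: it assumes a linear autoencoder (equivalent to PCA) as a stand-in, synthetic vectors in place of C3D clip features, and a hypothetical threshold set at the 99th percentile of the reconstruction error on normal training clips. All function names are illustrative.

```python
import numpy as np

def fit_linear_autoencoder(normal_feats, k):
    """Fit a rank-k linear autoencoder (PCA) on features of normal clips.

    Assumption: a linear model stands in for the paper's CNN-based autoencoder.
    """
    mean = normal_feats.mean(axis=0)
    # Rows of vt are principal directions; the top k act as encoder/decoder weights.
    _, _, vt = np.linalg.svd(normal_feats - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(feats, mean, components):
    """Per-clip mean squared error between features and their reconstruction."""
    centered = feats - mean
    recon = centered @ components.T @ components  # encode, then decode
    return np.mean((centered - recon) ** 2, axis=1)

def flag_suspicious(errors, threshold):
    """Mark clips whose reconstruction error exceeds the threshold."""
    return errors > threshold

rng = np.random.default_rng(0)

# Synthetic stand-in for clip-level features (hypothetical sizes):
# normal clips lie close to a 4-dimensional subspace of a 32-dimensional space.
basis = rng.normal(size=(4, 32))
normal_train = rng.normal(size=(200, 4)) @ basis + 0.05 * rng.normal(size=(200, 32))
normal_test = rng.normal(size=(20, 4)) @ basis + 0.05 * rng.normal(size=(20, 32))
# "Suspicious" clips do not follow the structure learned from normal clips.
suspicious_test = 2.0 * rng.normal(size=(20, 32))

mean, comps = fit_linear_autoencoder(normal_train, k=4)

# Hypothetical thresholding rule: 99th percentile of the error on normal training clips.
train_err = reconstruction_error(normal_train, mean, comps)
threshold = np.percentile(train_err, 99)

normal_err = reconstruction_error(normal_test, mean, comps)
suspicious_err = reconstruction_error(suspicious_test, mean, comps)

normal_flags = flag_suspicious(normal_err, threshold)
suspicious_flags = flag_suspicious(suspicious_err, threshold)

print("mean error, normal clips:    ", normal_err.mean())
print("mean error, suspicious clips:", suspicious_err.mean())
print("flagged normal clips:        ", int(normal_flags.sum()), "of", len(normal_flags))
print("flagged suspicious clips:    ", int(suspicious_flags.sum()), "of", len(suspicious_flags))
```

Because the model is fitted only on normal clips, it reconstructs them well and reconstructs everything else poorly, which is the property the abstract relies on to localize suspicious segments. The flagged clips would then be passed to a separate classifier; in the paper that role is played by the proposed GAN, which is not sketched here.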

Acknowledgements

The authors acknowledge funding from the National Centre for Robotics and Automation (NCRA) for this research work.

Author information

Authors and Affiliations

Department of Telecommunication Engineering, University of Engineering and Technology, Taxila, 47050, Pakistan
Waqas Ahmed
Department of Computer Engineering, University of Engineering and Technology, Taxila, 47050, Pakistan
Muhammad Haroon Yousaf
Swarm Robotic Lab, National Centre for Robotics and Automation (NCRA), Taxila, Pakistan
Muhammad Haroon Yousaf

Corresponding author

Correspondence to Muhammad Haroon Yousaf.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ahmed, W., Yousaf, M.H. A Deep Autoencoder-Based Approach for Suspicious Action Recognition in Surveillance Videos. Arab J Sci Eng 49, 3517–3532 (2024). https://doi.org/10.1007/s13369-023-08038-7

Received: 04 June 2022
Accepted: 30 April 2023
Published: 08 July 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s13369-023-08038-7
