Robust learning for real-world anomalies in surveillance videos

Mumtaz, Aqib; Sargano, Allah Bux; Habib, Zulfiqar

doi:10.1007/s11042-023-14425-x

Published: 31 January 2023

Volume 82, pages 20303–20322, (2023)
Multimedia Tools and Applications

Abstract

Anomaly detection is of significant importance for developing autonomous surveillance systems. Real-world anomalous events are complex and hard to capture because of diverse human behaviors and the wide range of anomaly types. A key factor in defining an activity is its temporal length, or duration. The time required for an anomalous activity to become completely understandable and meaningful depends on the nature and speed of the event. Some events are fast enough to be captured within a few frames; others are slow and may require several thousand video frames to define the activity. Deep learning architectures accept input temporal sequences of limited length and struggle to learn very long sequences. The problem therefore needs to be re-investigated from the perspective of frame sequences, so that an activity can be better defined within the limited temporal length. The contribution of this research work is two-fold. Firstly, a novel dynamic frame-skipping strategy is proposed for producing meaningful temporal sequences for model learning. Secondly, a new deep learning model based on the Inflated Inception network (I3D) is proposed for learning spatial and temporal information from video frames. To evaluate the performance of the proposed model, experiments are performed on UCF-Crime, one of the most challenging real-world anomaly datasets. The results confirm that the proposed model is robust and significantly outperforms state-of-the-art methods in terms of accuracy. In addition, the proposed model achieves the highest F1 score for both fast and slow activities, such as explosions, road accidents, robbery, and stealing, and an AUC score of 0.837.
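The abstract does not spell out how dynamic frame-skipping is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that the skip interval is derived from the clip length, so that a fixed-length sequence spans the whole activity whether the clip is short or long. The function name `dynamic_skip_indices` and the default sequence length of 64 frames are hypothetical choices made for illustration.

```python
def dynamic_skip_indices(num_frames: int, seq_len: int = 64) -> list[int]:
    """Return seq_len frame indices spread evenly over a clip.

    Hypothetical illustration: the effective skip interval is
    num_frames / seq_len, so it grows with the clip length. Clips
    shorter than seq_len are padded by repeating frames.
    """
    if num_frames <= 0 or seq_len <= 0:
        raise ValueError("num_frames and seq_len must be positive")
    # Integer arithmetic keeps every index inside [0, num_frames - 1].
    return [(i * num_frames) // seq_len for i in range(seq_len)]


# A slow activity of 6400 frames is sampled every 100th frame,
# while a fast activity of 100 frames is sampled almost densely.
slow = dynamic_skip_indices(6400)
fast = dynamic_skip_indices(100)
print(slow[:3], fast[:3])
```

Under these assumptions both clips yield sequences of the same length, which is what a model with a fixed temporal input size requires; only the spacing between the selected frames changes.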

Data availability

Not applicable.

Funding

This research is supported by the PDE-GIR project, which has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 778035.

Author information

Authors and Affiliations

Department of Computer Science, COMSATS University Islamabad, Lahore, 54000, Pakistan
Aqib Mumtaz, Allah Bux Sargano & Zulfiqar Habib

Contributions

A.M., A.B.S. and Z.H. conceived and designed the research direction; A.M. proposed/implemented methodology and performed the research experiments; A.B.S. and Z.H. analyzed the data; A.B.S. and A.M. contributed reagents/materials/analysis tools; A.M. wrote the research paper. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Allah Bux Sargano or Zulfiqar Habib.

Ethics declarations

Institutional review board statement

Not applicable.

Informed consent

Not applicable.

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mumtaz, A., Sargano, A.B. & Habib, Z. Robust learning for real-world anomalies in surveillance videos. Multimed Tools Appl 82, 20303–20322 (2023). https://doi.org/10.1007/s11042-023-14425-x

Received: 26 May 2022
Revised: 11 October 2022
Accepted: 21 January 2023
Published: 31 January 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11042-023-14425-x
