Abstract
Anomaly detection is central to developing autonomous surveillance systems. Real-world anomalous events are complex and hard to capture because human behavior is diverse and anomaly types vary widely. A key factor in defining an activity is its temporal length, or duration. The time needed for an anomalous activity to become fully understandable and meaningful depends on the nature and speed of the event. Some events are fast enough to be captured within a few frames, whereas others are slow and may require several thousand video frames to define. Deep learning architectures accept input sequences of limited temporal length and struggle to learn very long sequences. The problem therefore needs to be re-investigated from the perspective of frame sequences, so that an activity can be better defined within a limited temporal length. The contribution of this research work is two-fold. First, a novel dynamic frame-skipping strategy is proposed for producing meaningful temporal sequences for model learning. Second, a new deep learning model based on the Inflated Inception network (I3D) is proposed for learning spatial and temporal information from video frames. To evaluate the proposed model, experiments are performed on UCF-Crime, one of the most challenging real-world anomaly datasets. The results confirm that the proposed model is robust and significantly outperforms state-of-the-art methods in terms of accuracy. In addition, the proposed model achieves the highest F1 score for both fast and slow activities, such as explosions, road accidents, robbery, and stealing, and an AUC score of 0.837.
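The abstract does not spell out the frame-skipping rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm: when a model accepts a fixed number of input frames, the skip between sampled frames can grow with the length of the activity so that fast and slow events both fit the same input window. The function name `sample_frame_indices` and the default `clip_len=64` are assumptions made for illustration.

```python
from typing import List


def sample_frame_indices(num_frames: int, clip_len: int = 64) -> List[int]:
    """Return clip_len frame indices that span a video of num_frames frames.

    Hypothetical illustration: the effective skip between sampled frames
    is roughly num_frames / clip_len, so it adapts to the video length.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")

    if num_frames <= clip_len:
        # Short (fast) activity: keep every frame and pad by repeating
        # the last frame until the clip reaches the required length.
        return list(range(num_frames)) + [num_frames - 1] * (clip_len - num_frames)

    # Long (slow) activity: spread the samples evenly over the whole video.
    # Integer arithmetic keeps every index inside [0, num_frames - 1].
    return [(i * num_frames) // clip_len for i in range(clip_len)]
```

With these assumptions, a 6400-frame video sampled into 64 frames uses a skip of 100 frames, while a 40-frame video keeps all of its frames and is padded to the same length.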
Data availability
Not applicable.
Funding
This research is supported by the PDE-GIR project, which has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 778035.
Author information
Contributions
A.M., A.B.S. and Z.H. conceived and designed the research direction; A.M. proposed/implemented methodology and performed the research experiments; A.B.S. and Z.H. analyzed the data; A.B.S. and A.M. contributed reagents/materials/analysis tools; A.M. wrote the research paper. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Institutional review board statement
Not applicable.
Informed consent
Not applicable.
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mumtaz, A., Sargano, A.B. & Habib, Z. Robust learning for real-world anomalies in surveillance videos. Multimed Tools Appl 82, 20303–20322 (2023). https://doi.org/10.1007/s11042-023-14425-x
DOI: https://doi.org/10.1007/s11042-023-14425-x