Abstract
An anomaly is a pattern, behavior, or event that rarely occurs in a given environment. Video anomaly detection remains a challenging task, with applications that include home security, public-area monitoring, and quality control on production lines. Researchers face several challenges: the spatio-temporal nature of video, the lack of an exact definition of an anomaly, and inefficient feature extraction from video. To address these challenges, we propose a method that uses parallel deep structures to extract informative features from videos. The method consists of several units: an attention unit, frame sampling units, spatial and temporal feature extractors, and thresholding. Using these units, we propose a video anomaly detection method that aggregates the results of four parallel structures. Aggregating the results brings generality and flexibility to the algorithm. The proposed method achieves satisfactory results on four popular video anomaly detection benchmarks.
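As a rough illustration of how the results of parallel structures might be aggregated and thresholded, the following Python sketch combines per-frame anomaly scores from four branches. The abstract does not specify the aggregation rule, so min-max normalization followed by a per-frame mean is an assumption made here for illustration, as are the function names, the threshold value, and the made-up scores.

```python
from statistics import mean
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale one branch's per-frame scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant branch carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def aggregate_branches(branch_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalized scores frame by frame (assumed rule, not the paper's)."""
    normalized = [min_max_normalize(b) for b in branch_scores]
    return [mean(frame) for frame in zip(*normalized)]


def flag_anomalies(scores: Sequence[float], threshold: float) -> List[bool]:
    """Mark a frame as anomalous when its aggregated score exceeds the threshold."""
    return [s > threshold for s in scores]


# Four hypothetical branches scoring the same five frames (made-up numbers).
branches = [
    [0.1, 0.2, 0.9, 0.8, 0.1],
    [1.0, 1.5, 4.0, 3.5, 1.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
    [5.0, 5.0, 9.0, 7.0, 5.0],
]
combined = aggregate_branches(branches)
flags = flag_anomalies(combined, threshold=0.5)  # threshold chosen arbitrarily
```

With these made-up scores, only the third and fourth frames are flagged. Normalizing each branch before averaging keeps a branch with a larger score range from dominating the combined result, which is one simple way to let branches with different scales contribute equally.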
Data availability
The datasets analyzed during the current study are publicly available. The Avenue dataset is available at http://www.cse.cuhk.edu.hk/leojia/projects/detectabnormal/dataset.html. The UCSD Ped1 and UCSD Ped2 datasets are available at http://www.svcl.ucsd.edu/projects/anomaly/dataset.html. The ShanghaiTech dataset is available at https://svip-lab.github.io/dataset/campus_dataset.html.
Funding
The authors did not receive financial support from any organization for the submitted work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent for publication
The authors have written this paper for educational purposes.
About this article
Cite this article
Rahimpour, S.M., Kazemi, M., Moallem, P. et al. Video anomaly detection based on attention and efficient spatio-temporal feature extraction. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03361-y
DOI: https://doi.org/10.1007/s00371-024-03361-y