Dual Attention Mechanisms Based Auto-Encoder for Video Anomaly Detection

Gu, Jiatao; Zeng, Jing; Ji, Genlin

doi:10.1007/978-3-031-06794-5_13


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13338)

Included in the following conference series:

International Conference on Adaptive and Intelligent Systems

Abstract

Video anomaly detection is the identification of abnormal behaviors that do not conform to normal patterns. Reconstructing video frames with an auto-encoder is the current mainstream approach: frames whose reconstruction error exceeds a threshold are treated as anomalous. However, auto-encoders pay little attention to global information and channel dependence. Attention mechanisms, which enable a neural network to focus on the relevant elements of its input, have become an important component of neural networks. To attend to features in both the channel and spatial dimensions, we propose a dual attention mechanisms based auto-encoder (DAMAE) for video anomaly detection. After each down-sampling step, the feature map is processed by two kinds of attention. The feature map is divided into groups, and each group can autonomously enhance its learnt representation and suppress possible noise. By fusing channel attention and spatial attention, DAMAE captures pixel-level pairwise relationships and channel dependence. Compared with a traditional auto-encoder, features carrying channel and spatial attention allow each up-sampling step to reconstruct the normal patterns of the video better. Experimental results show that our method outperforms other advanced methods, demonstrating its effectiveness.
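The reconstruction-based decision rule described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the error measure or how the threshold is chosen, so the per-frame mean squared error, the function names `reconstruction_errors` and `flag_anomalous_frames`, and the threshold value below are all assumptions made for illustration; the reconstructions are simulated rather than produced by DAMAE.

```python
import numpy as np


def reconstruction_errors(frames, reconstructions):
    """Per-frame mean squared error (an assumed error measure)."""
    frames = np.asarray(frames, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    # Average over every axis except the first one (the frame axis).
    axes = tuple(range(1, frames.ndim))
    return np.mean((frames - reconstructions) ** 2, axis=axes)


def flag_anomalous_frames(errors, threshold):
    """Mark frames whose reconstruction error exceeds the threshold."""
    return np.asarray(errors) > threshold


# Toy data: five 8x8 grayscale "frames" with simulated reconstructions.
rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8))
recon = frames.copy()
recon[3] += 0.5  # pretend the auto-encoder reconstructs frame 3 poorly

errors = reconstruction_errors(frames, recon)
flags = flag_anomalous_frames(errors, threshold=0.1)  # hypothetical threshold
```

In this toy setup only the frame with the deliberately degraded reconstruction is flagged. In practice the threshold would have to be tuned on held-out data, which this sketch does not attempt.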

Funding

This work was supported by the National Science Foundation of China under Grant No. 41971343.

Author information

Authors and Affiliations

School of Computer and Electronic Information, Nanjing Normal University, Nanjing, China
Jiatao Gu, Jing Zeng & Genlin Ji

Corresponding author

Correspondence to Genlin Ji.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Xingming Sun
Nanjing University of Information Science and Technology, Nanjing, China
Xiaorui Zhang
Jinan University, Guangzhou, China
Zhihua Xia
Purdue University, West Lafayette, IN, USA
Elisa Bertino

About this paper

Cite this paper

Gu, J., Zeng, J., Ji, G. (2022). Dual Attention Mechanisms Based Auto-Encoder for Video Anomaly Detection. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2022. Lecture Notes in Computer Science, vol 13338. Springer, Cham. https://doi.org/10.1007/978-3-031-06794-5_13

DOI: https://doi.org/10.1007/978-3-031-06794-5_13
Published: 04 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06793-8
Online ISBN: 978-3-031-06794-5
eBook Packages: Computer Science, Computer Science (R0)
