HN-MUM: heterogeneous video anomaly detection network with multi-united-memory module

Li, Hongjun; Wang, Yunlong; Chen, Mingyi; Li, Jiaxin

doi:10.1007/s11042-023-15154-x

Published: 20 March 2023

Volume 82, pages 31521–31538, (2023)

Multimedia Tools and Applications

Abstract

In this paper, we present a Heterogeneous Network with a Multi-United-Memory module (HN-MUM), which integrates motion and appearance to address the Video Anomaly Detection (VAD) problem. First, guided by the notion of “specific analysis of particular issues” and the distinction between motion and appearance, we present a heterogeneous dual-flow network that processes motion and appearance information independently. Then, motivated by the notion of “view of connection” and the relationships between motion and appearance, we combine the motion and appearance features in the decoding phase. This is achieved by using a memory module to memorize and reconstruct the combined representation by matching the motion patterns with the appearance in memory items. However, we observe that a single memory module is unable to adequately capture all typical patterns. In light of this, we propose the Multi-United-Memory (MUM), which consists of three basic memory modules. Each basic memory module fuses the relevant motion and appearance elements, which helps to memorize the motion-appearance-united representation in a related manner. To the best of our knowledge, this is the first effort to use a multi-level unified-thought memory module to detect abnormalities. On UCSD Ped2, CUHK Avenue, and ShanghaiTech, HN-MUM attains AUC values of 97.1%, 88.2%, and 76.2%, respectively. Extensive experiments on these three benchmark datasets show that HN-MUM performs competitively with state-of-the-art methods.
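
To make the idea of reading a united motion-appearance representation from several memory modules more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the addressing scheme, the feature sizes, or the scoring rule, so everything here is an assumption for illustration: the query is the concatenation of a motion vector and an appearance vector, each memory is read with softmax weights over cosine similarity, and the anomaly score is the squared reconstruction error averaged over three toy memories. All function names and values are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def read_memory(query, items):
    """Soft-address the memory items and return (weighted read, weights).

    Assumption: addressing is a softmax over cosine similarity.
    """
    weights = softmax([cosine(query, item) for item in items])
    read = [
        sum(w * item[d] for w, item in zip(weights, items))
        for d in range(len(query))
    ]
    return read, weights


def anomaly_score(motion, appearance, memories):
    """Mean squared reconstruction error of the united query over all memories.

    Assumption: the united representation is a plain concatenation.
    """
    query = list(motion) + list(appearance)
    errors = []
    for items in memories:
        read, _ = read_memory(query, items)
        errors.append(sum((q - r) ** 2 for q, r in zip(query, read)))
    return sum(errors) / len(errors)


# Three toy memories (hypothetical values). Each item stores a united
# [motion | appearance] pattern that stands in for "normal" training data.
memories = [
    [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]],
    [[2.0, 0.0, 2.0, 0.0], [0.0, 2.0, 0.0, 2.0]],
    [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]],
]

# A motion/appearance pair that matches a stored pattern ...
normal_score = anomaly_score([1.0, 0.0], [1.0, 0.0], memories)
# ... and a pair whose motion and appearance do not belong together.
mismatch_score = anomaly_score([1.0, 0.0], [0.0, 1.0], memories)
```

In this toy setting the mismatched pair reconstructs worse than the matching pair, which is the behaviour a memory-based detector relies on: patterns seen during training are recovered well from memory, while unseen combinations of motion and appearance are not.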

Data availability

We provide the original, editable data that appear in the submitted article, including figures, tables and experimental results.

Code availability

We are pleased to share the code used in the work submitted for publication.

Funding

This work is supported in part by the National Natural Science Foundation of China under Grants 61871241, 61971245 and 61976120, in part by the Nanjing University State Key Lab. for Novel Software Technology under Grant KFKT2019B15, in part by the Nantong Science and Technology Program under Grant JC2021131, and in part by the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grants KYCX21_3084 and KYCX22_3340.

Author information

Authors and Affiliations

School of Information Science and Technology, Nantong University, Nantong, 226019, Jiangsu, People’s Republic of China
Hongjun Li, Yunlong Wang, Mingyi Chen & Jiaxin Li
State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, 210023, Jiangsu, People’s Republic of China
Hongjun Li
Nantong Research Institute for Advanced Communication Technologies, Nantong, 226019, Jiangsu, People’s Republic of China
Hongjun Li

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Hongjun Li, Yunlong Wang, Mingyi Chen and Jiaxin Li. The first draft of the manuscript was written by Hongjun Li and Yunlong Wang, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hongjun Li.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Wang, Y., Chen, M. et al. HN-MUM: heterogeneous video anomaly detection network with multi-united-memory module. Multimed Tools Appl 82, 31521–31538 (2023). https://doi.org/10.1007/s11042-023-15154-x

Received: 13 June 2022
Revised: 26 August 2022
Accepted: 13 March 2023
Published: 20 March 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11042-023-15154-x
