Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection

  • Regular Paper
  • Multimedia Systems

Abstract

Anomaly detection is a challenging task in visual understanding because it requires identifying events that deviate significantly from normal patterns. A primary source of difficulty is the diversity and complexity of anomalous events, which makes it impractical to collect and label every type of anomaly. In recent work, weakly supervised methods have therefore become one of the most promising solutions for anomaly detection. In this paper, we focus on weakly supervised learning and propose a dynamic multiple-instance learning framework (Dy-MIL) for video anomaly detection. The framework combines a dynamic ranking method with a k-max-selection scheme to enlarge the inter-class distance between anomalous and normal instances using only video-level labels. Experimental results demonstrate that our framework achieves improved performance on three benchmark datasets: the ShanghaiTech dataset, the UCF-Crime dataset, and the NUT dataset.
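As a rough illustration of the idea of k-max selection with a ranking objective, the sketch below treats each video as a bag of per-snippet anomaly scores, averages the k highest scores in each bag, and applies a hinge ranking loss that pushes the anomalous bag above the normal bag. This is a generic multiple-instance ranking sketch, not the Dy-MIL formulation: the abstract does not specify the dynamic ranking method, and the function names, the margin of 1.0, and the choice k = 2 are all assumptions made for illustration.

```python
from typing import Sequence


def k_max_mean(scores: Sequence[float], k: int) -> float:
    """Mean of the k largest instance scores in one bag (one video)."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of instances")
    top_k = sorted(scores, reverse=True)[:k]
    return sum(top_k) / k


def k_max_ranking_loss(
    anomalous_bag: Sequence[float],
    normal_bag: Sequence[float],
    k: int,
    margin: float = 1.0,  # hypothetical margin, not taken from the paper
) -> float:
    """Hinge loss that is zero once the anomalous bag's k-max mean
    exceeds the normal bag's k-max mean by at least `margin`."""
    gap = k_max_mean(anomalous_bag, k) - k_max_mean(normal_bag, k)
    return max(0.0, margin - gap)


# Toy per-snippet scores; only the video-level label (anomalous vs. normal)
# is assumed to be known, not which snippets are anomalous.
anomalous_video = [0.1, 0.2, 0.9, 0.8, 0.1]
normal_video = [0.1, 0.2, 0.1, 0.3, 0.2]
loss = k_max_ranking_loss(anomalous_video, normal_video, k=2)
```

Averaging over the top k scores rather than taking the single maximum makes the bag-level score less sensitive to one noisy snippet, which is a common motivation for k-max selection in weakly supervised settings.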

Data availability

The data that support the findings of this study are openly available at https://gitcode.com/mirrors/stevenLiuWen/ano_pred_cvpr2018, https://www.crcv.ucf.edu/projects/real-world/, and https://pan.baidu.com/s/1ZNXxOZFl8pBAyjJpS4yE0A; see references [3, 22, 24].

References

  1. Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., Davis, L.S.: Learning temporal regularity in video sequences. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 733–742 (2016)

  2. Zhao, Y., Deng, B., Shen, C., Liu, Y., Lu, H., Hua, X.-S.: Spatio-temporal autoencoder for video anomaly detection. In: ACM International Conference on Multimedia (ACM MM), pp. 1933–1941 (2017)

  3. Sultani, W., Chen, C., Shah, M.: Real-world anomaly detection in surveillance videos. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6479–6488 (2018)

  4. Feng, J.C., Hong, F.T., Zheng, W.S.: Mist: Multiple instance self-training framework for video anomaly detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14009–14018 (2021)

  5. Zaheer, M.Z., Mahmood, A., Astrid, M., Lee, S.-I.: Claws: Clustering assisted weakly supervised learning with normalcy suppression for anomalous event detection. In: European Conference on Computer Vision (ECCV), pp. 358–376 (2020)

  6. Zhu, Y., Newsam, S.: Motion-aware feature for improved video anomaly detection. In: 30th British Machine Vision Conference (BMVC), pp. 1–12 (2020)

  7. Degardin, B., Proença, H.: Iterative weak/self-supervised classification framework for abnormal events detection. Pattern Recogn. Lett. 145(1), 50–57 (2021)

  8. Zhong, J., Li, N., Kong, W., Liu, S., Li, T.H., Li, G.: Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1237–1246 (2019)

  9. Wan, B., Fang, Y., Xia, X., Mei, J.: Weakly supervised video anomaly detection via center-guided discriminative learning. In: IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2020)

  10. Lin, S., Clark, R., Birke, R., Schönborn, S., Trigoni, N., Roberts, S.: Anomaly detection for time series using vae-lstm hybrid model. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4322–4326 (2020)

  11. Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detection–a new baseline. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6536–6545 (2018)

  12. Zaheer, M.Z., Mahmood, A., Khan, M.H., Segu, M., Yu, F., Lee, S.-I.: Generative cooperative learning for unsupervised video anomaly detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14744–14754 (2022)

  13. Liu, Z., Nie, Y., Long, C., Zhang, Q., Li, G.: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction. In: IEEE International Conference on Computer Vision (ICCV), pp. 13588–13597 (2021)

  14. Ye, M., Peng, X., Gan, W., Wu, W., Qiao, Y.: Anopcn: Video anomaly detection via deep predictive coding network. In: ACM International Conference on Multimedia (ACM MM), pp. 1805–1813 (2019)

  15. Tang, Y., Zhao, L., Zhang, S., Gong, C., Li, G., Yang, J.: Integrating prediction and reconstruction for anomaly detection. Pattern Recogn. Lett. 129(1), 123–130 (2020)

  16. Chong, Y.S., Tay, Y.H.: Abnormal event detection in videos using spatio-temporal autoencoder. In: International Symposium on Neural Networks (ISNN), pp. 189–196 (2017)

  17. Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., Davis, L.S.: Learning temporal regularity in video sequences. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 733–742 (2016)

  18. Dubey, S., Boragule, A., Jeon, M.: 3D ResNet with ranking loss function for abnormal activity detection in videos. In: International Conference on Control, Automation and Information Sciences (ICCAIS), pp. 1–6 (2019)

  19. Tan, W., Yao, Q., Liu, J.: Overlooked video classification in weakly supervised video anomaly detection. arXiv preprint arXiv:2210.06688 (2023)

  20. Doshi, K., Yilmaz, Y.: Any-shot sequential anomaly detection in surveillance videos. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 4037–4042 (2020)

  21. Georgescu, M.I., Ionescu, R., Khan, F.S., Popescu, M., Shah, M.: A background-agnostic framework with adversarial training for abnormal event detection in video. IEEE Transactions on Pattern Analysis and Machine Intelligence PP(1), 1–18 (2021)

  22. Li, Q., Yang, R., Xiao, F., Bhanu, B., Zhang, F.: Attention-based anomaly detection in multi-view surveillance videos. Knowl.-Based Syst. 252(2), 1–11 (2022)

  23. Lu, J., Batra, D., Parikh, D., Lee, S.: Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In: Advances in Neural Information Processing Systems (NIPS), pp. 13–23 (2019)

  24. Luo, W., Liu, W., Gao, S.: A revisit of sparse coding based anomaly detection in stacked RNN framework. In: IEEE International Conference on Computer Vision (ICCV), pp. 341–349 (2017)

  25. Hinami, R., Mei, T., Satoh, S.: Joint detection and recounting of abnormal events by learning deep generic knowledge. In: IEEE International Conference on Computer Vision (ICCV), pp. 3639–3647 (2017)

  26. Sharma, P., Ding, N., Goodman, S., Soricut, R.: Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning. In: 56th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 2556–2565 (2018)

  27. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

  28. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: International Conference on Machine Learning (ICML), pp. 1–8 (2010)

  29. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European Conference on Computer Vision (ECCV), pp. 630–645 (2016)

  30. Duchi, J.C., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(1), 2121–2159 (2011)

  31. Joo, H.K., Vo, K., Yamazaki, K., Le, N.: Clip-tsa: Clip-assisted temporal self-attention for weakly-supervised video anomaly detection. In: International Conference on Image Processing (ICIP), pp. 3230–3234 (2023)

  32. Lu, C., Shi, J., Jia, J.: Abnormal event detection at 150 fps in matlab. In: IEEE International Conference on Computer Vision (ICCV), pp. 2720–2727 (2013)

  33. Wan, B., Fang, Y., Xia, X., Mei, J.: Weakly supervised video anomaly detection via center-guided discriminative learning. In: IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2020)

  34. Sapkota, H., Yu, Q.: Bayesian nonparametric submodular video partition for robust anomaly detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3202–3211 (2022)

  35. Zaheer, M.Z., Mahmood, A., Khan, M.H., Segu, M., Yu, F., Lee, S.-I.: Generative cooperative learning for unsupervised video anomaly detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14724–14734 (2022)

Author information

Contributions

Chen Li: Conceptualization, Methodology, Software, Data curation, Investigation, Writing - Original draft preparation, Validation. Mo Chen: Data curation, Visualization, Methodology, Investigation, Validation.

Corresponding author

Correspondence to Mo Chen.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by A. Liu.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, C., Chen, M. Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection. Multimedia Systems 30, 11 (2024). https://doi.org/10.1007/s00530-023-01237-0

  • DOI: https://doi.org/10.1007/s00530-023-01237-0
