Abstract
Video Anomaly Detection (VAD) is an important task that takes a video clip as input and outputs frame-level class labels, e.g., normal or abnormal. Wang et al. proposed DSTJiP, a method that trains a model by solving Decoupled Spatial and Temporal Jigsaw Puzzles and achieves impressive VAD performance. However, the model sometimes fails to detect abnormal human actions in which abnormal motions are accompanied by normal motions. The reason is that the model learns representations of the low-motion and motionless parts of the training examples, which makes it insensitive to abnormal motions. To address this problem, we propose solving Spatial and Augmented Temporal Jigsaw Puzzles (SATJiP), an extension of DSTJiP. SATJiP uses a novel pretext task that encourages the model to focus on motion, enabling it to detect abnormal motions accompanied by normal ones. Experiments on three standard VAD benchmarks demonstrate that SATJiP outperforms state-of-the-art methods.
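To make the idea of a temporal jigsaw puzzle concrete, the following is a minimal sketch of how a training sample for a generic frame-shuffling pretext task might be built. It is an illustration under stated assumptions, not the DSTJiP or SATJiP procedure: the abstract does not specify the augmentation, the network, or the loss, and the names `make_temporal_jigsaw` and `restore_order` are hypothetical. Frames are stood in for by string placeholders.

```python
import random


def make_temporal_jigsaw(clip, rng):
    """Shuffle the frames of `clip` and return (shuffled_clip, permutation).

    permutation[i] is the original index of the frame now at position i.
    In a jigsaw pretext task, a model would be trained to predict this
    permutation from the shuffled clip (the model itself is not shown).
    """
    permutation = list(range(len(clip)))
    rng.shuffle(permutation)
    shuffled = [clip[j] for j in permutation]
    return shuffled, permutation


def restore_order(shuffled, permutation):
    """Undo the shuffle given a (predicted) permutation."""
    restored = [None] * len(shuffled)
    for position, original_index in enumerate(permutation):
        restored[original_index] = shuffled[position]
    return restored


# Toy clip: each frame is a placeholder string (assumption for illustration).
clip = ["f0", "f1", "f2", "f3", "f4"]
shuffled, perm = make_temporal_jigsaw(clip, random.Random(0))
recovered = restore_order(shuffled, perm)
```

If the permutation is predicted correctly, `restore_order` recovers the original frame order. In approaches of this kind, the intuition is that a model trained only on normal clips solves the puzzle less reliably on anomalous ones, which can serve as an anomaly signal.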
References
Astrid, M., Zaheer, M.Z., Lee, J.Y., Lee, S.I.: Learning not to reconstruct anomalies. In: Proceedings of BMVC (2021)
Astrid, M., Zaheer, M.Z., Lee, S.I.: Synthetic temporal anomaly guided end-to-end video anomaly detection. In: Proceedings of ICCVW (2021)
Barbalau, A., et al.: SSMTL++: revisiting self-supervised multi-task learning for video anomaly detection. Comput. Vis. Image Underst. 229, 103656 (2023)
Cai, R., Zhang, H., Liu, W., Gao, S., Hao, Z.: Appearance-motion memory consistency network for video anomaly detection. In: Proceedings of AAAI (2021)
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. (CSUR) 41(3), 1–58 (2009)
Chang, Y., Tu, Z., Xie, W., Yuan, J.: Clustering driven deep autoencoder for video anomaly detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 329–345. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_20
Chen, C., et al.: Comprehensive regularization in a bi-directional predictive network for video anomaly detection. In: Proceedings of AAAI, vol. 36 (2022)
Deng, H., Zhang, Z., Zou, S., Li, X.: Bi-directional frame interpolation for unsupervised video anomaly detection. In: Proceedings of WACV (2023)
Feichtenhofer, C., Li, Y., He, K., et al.: Masked autoencoders as spatiotemporal learners. In: Proceedings of NeurIPS, vol. 35 (2022)
Feng, X., Song, D., Chen, Y., Chen, Z., Ni, J., Chen, H.: Convolutional transformer based dual discriminator generative adversarial networks for video anomaly detection. In: Proceedings of MM (2021)
Georgescu, M., Barbalau, A., Ionescu, R.T., Khan, F.S., Popescu, M., Shah, M.: Anomaly detection in video via self-supervised and multi-task learning. In: Proceedings of CVPR (2021)
Gong, D., et al.: Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection. In: Proceedings of ICCV (2019)
Huang, X., Zhao, C., Wu, Z.: A video anomaly detection framework based on appearance-motion semantics representation consistency. In: Proceedings of ICASSP (2023)
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of CVPR (2017)
Ionescu, R.T., Khan, F.S., Georgescu, M.I., Shao, L.: Object-centric auto-encoders and dummy anomalies for abnormal event detection in video. In: Proceedings of CVPR (2019)
Lai, Y., Han, Y., Wang, Y.: Anomaly detection with prototype-guided discriminative latent embeddings. In: Proceedings of ICDM (2021)
Lee, S., Kim, H.G., Ro, Y.M.: BMAN: bidirectional multi-scale aggregation networks for abnormal event detection. IEEE Trans. Image Process. 29, 2395–2408 (2020)
Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detection–a new baseline. In: Proceedings of CVPR (2018)
Liu, Y., Liu, J., Zhao, M., Yang, D., Zhu, X., Song, L.: Learning appearance-motion normality for video anomaly detection. In: Proceedings of ICME (2022)
Liu, Z., Nie, Y., Long, C., Zhang, Q., Li, G.: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction. In: Proceedings of ICCV (2021)
Lu, C., Shi, J., Jia, J.: Abnormal event detection at 150 FPS in MATLAB. In: Proceedings of ICCV (2013)
Luo, W., Liu, W., Gao, S.: Remembering history with convolutional LSTM for anomaly detection. In: Proceedings of ICME (2017)
Luo, W., Liu, W., Gao, S.: A revisit of sparse coding based anomaly detection in stacked RNN framework. In: Proceedings of ICCV (2017)
Mahadevan, V., Li, W., Bhalodia, V., Vasconcelos, N.: Anomaly detection in crowded scenes. In: Proceedings of CVPR (2010)
Park, H., Noh, J., Ham, B.: Learning memory-guided normality for anomaly detection. In: Proceedings of CVPR (2020)
Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: feature learning by inpainting. In: Proceedings of CVPR (2016)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. CoRR abs/1804.02767 (2018)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of CVPR (2017)
Shen, L., Matsukawa, T., Suzuki, E.: Detecting video anomalous events with an enhanced abnormality score. In: Proceedings of PRICAI, vol. 13629 (2022)
Sun, C., Shi, C., Jia, Y., Wu, Y.: Learning event-relevant factors for video anomaly detection. In: Proceedings of AAAI, vol. 37 (2023)
Tang, Y., Zhao, L., Zhang, S., Gong, C., Li, G., Yang, J.: Integrating prediction and reconstruction for anomaly detection. Pattern Recogn. Lett. 129, 123–130 (2020)
Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: Proceedings of ITW, pp. 1–5 (2015)
Tong, Z., Song, Y., Wang, J., Wang, L.: VideoMAE: masked autoencoders are data-efficient learners for self-supervised video pre-training. In: Proceedings of NeurIPS, vol. 35 (2022)
Vu, H., Nguyen, T.D., Travers, A., Venkatesh, S., Phung, D.: Energy-based localized anomaly detection in video surveillance. In: Proceedings of PAKDD (2017)
Wang, G., Wang, Y., Qin, J., Zhang, D., Bao, X., Huang, D.: Video anomaly detection by solving decoupled spatio-temporal jigsaw puzzles. In: Proceedings of ECCV (2022). https://doi.org/10.1007/978-3-031-20080-9_29
Wang, X., Wang, X., et al.: Robust unsupervised video anomaly detection by multipath frame prediction. IEEE Trans. Neural Netw. Learn. Syst. 33(6), 2301–2312 (2022)
Wang, Y., Qin, C., Bai, Y., Xu, Y., Ma, X., Fu, Y.: Making reconstruction-based method great again for video anomaly detection. In: Proceedings of ICDM (2022)
Wang, Z., Zou, Y., Zhang, Z.: Cluster attention contrast for video anomaly detection. In: Proceedings of MM (2020)
Yang, Z., Liu, J., Wu, Z., Wu, P., Liu, X.: Video event restoration based on keyframes for video anomaly detection. In: Proceedings of CVPR (2023)
Ye, M., Peng, X., Gan, W., Wu, W., Qiao, Y.: AnoPCN: video anomaly detection via deep predictive coding network. In: Proceedings of MM (2019)
Yu, G., et al.: Cloze test helps: effective video anomaly detection via learning to complete video events. In: Proceedings of MM (2020)
Zhou, W., Li, Y., Zhao, C.: Object-guided and motion-refined attention network for video anomaly detection. In: Proceedings of ICME (2022)
Acknowledgment
This work was partially supported by JST, the establishment of university fellowships towards the creation of science technology innovation, Grant Number JPMJFS2132.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shen, L., Matsukawa, T., Suzuki, E. (2024). SATJiP: Spatial and Augmented Temporal Jigsaw Puzzles for Video Anomaly Detection. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14645. Springer, Singapore. https://doi.org/10.1007/978-981-97-2242-6_3
DOI: https://doi.org/10.1007/978-981-97-2242-6_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2241-9
Online ISBN: 978-981-97-2242-6
eBook Packages: Computer Science (R0)