A multi-scale feature fusion spatial–channel attention model for background subtraction

Yang, Yizhong; Xia, Tingting; Li, Dajin; Zhang, Zhang; Xie, Guangjun

doi:10.1007/s00530-023-01139-1

Regular Paper
Published: 28 July 2023

Multimedia Systems, Volume 29, pages 3609–3623 (2023)

Yizhong Yang¹, Tingting Xia¹, Dajin Li¹, Zhang Zhang¹ & Guangjun Xie¹

Abstract

Background subtraction is an essential task in computer vision and is often used as a pre-processing step for more advanced tasks. In this work, we propose a novel multi-scale feature fusion attention network to tackle cross-scene background subtraction. Cross-fusing the feature maps from different stages of the encoder ensures that the features fed to the decoder contain both low-level and high-level information. The spatial–channel attention, which is based on a weight matrix, makes the model focus on information relevant to foreground extraction. We evaluate the proposed model on the CDnet-2014 dataset with two scene-independent evaluation strategies and obtain competitive F-Measure scores. In addition, to evaluate the generalization ability of the model, we perform a cross-dataset evaluation on the LASIESTA and SBI2015 datasets, where the overall F-Measure of the model is 0.89 and 0.93, respectively. Experimental results demonstrate that the model performs well compared with current state-of-the-art methods.
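For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F-Measure can be computed for one predicted foreground mask against its ground truth, using the standard definition (the harmonic mean of precision and recall). The function name `f_measure`, the nested-list mask representation, and the toy masks are assumptions made for illustration; the exact evaluation protocol used in the paper is not described in this preview.

```python
def f_measure(pred, truth):
    """Standard F-Measure for two binary masks given as nested lists of 0/1.

    Illustrative helper (hypothetical name), not the authors' evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # foreground pixel correctly detected
            elif p and not t:
                fp += 1      # background pixel wrongly marked as foreground
            elif t and not p:
                fn += 1      # foreground pixel that was missed
    if tp == 0:
        return 0.0           # avoids division by zero when nothing is matched
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (assumed data): 1 = foreground, 0 = background.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = f_measure(pred, truth)
```

With these toy masks there are two true positives, one false positive and one false negative, so precision and recall are both 2/3 and `score` is about 0.667.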

Data availability

The authors declare that the data supporting the findings of this study are available within the article.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 61674049 and U19A2053, and by the Fundamental Research Funds for the Central Universities of China under Grant JZ2021HGQA0262.

Author information

Authors and Affiliations

School of Microelectronics, Hefei University of Technology, Hefei, 230009, China
Yizhong Yang, Tingting Xia, Dajin Li, Zhang Zhang & Guangjun Xie

Contributions

Yizhong Yang and Guangjun Xie supervised the project; Tingting Xia, Dajin Li and Yizhong Yang conducted most of the experiments and collected and analyzed the data; Zhang Zhang provided guidance on the algorithms; Yizhong Yang, Tingting Xia and Dajin Li wrote and revised the manuscript. All authors discussed the results and gave suggestions on the revision of the manuscript.

Corresponding authors

Correspondence to Yizhong Yang or Guangjun Xie.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Y. Kong.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Y., Xia, T., Li, D. et al. A multi-scale feature fusion spatial–channel attention model for background subtraction. Multimedia Systems 29, 3609–3623 (2023). https://doi.org/10.1007/s00530-023-01139-1

Received: 26 November 2022
Accepted: 12 July 2023
Published: 28 July 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s00530-023-01139-1
