Scale channel attention network for image segmentation

Chen, Jianjun; Tian, Youliang; Ma, Wei; Mao, Zhengdong; Hu, Yue

doi:10.1007/s11042-020-08921-7


Published: 18 November 2020

Volume 80, pages 16473–16489 (2021)

Multimedia Tools and Applications

Jianjun Chen¹,², Youliang Tian³, Wei Ma¹, Zhengdong Mao¹ & Yue Hu¹


Abstract

Object scale variation degrades image segmentation performance. Spatial pyramid pooling modules and attention mechanisms are two widely used components in deep neural networks for handling this problem, but applying either component alone commonly yields limited benefit. To push this limit, we propose a scale channel attention network (SCA-Net), which enhances fused multi-scale features by using channel attention components. After the multi-scale pooling step, the multi-scale spatial information is distributed across different feature channels. The channel attention block is then employed to guide SCA-Net to focus on the object-relevant scale channels. We further explore the channel attention block and find a simple yet effective structure that combines global average pooling and global maximum pooling, resulting in a robust global information encoder. SCA-Net does not require any time-consuming post-processing, that is, an extra step applied after the neural network to optimize the segmentation result. On the PASCAL VOC 2012 and Cityscapes benchmarks, SCA-Net achieves test-set performance of 75.5% and 77.0%, respectively.
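The abstract does not specify how the two pooling operations are combined, so the following is only a minimal NumPy sketch of one plausible channel attention block, not the authors' implementation. The fusion rule (summing the two descriptors), the two-layer bottleneck, the reduction ratio `r`, and the names `channel_attention`, `w1`, and `w2` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight the channels of an (H, W, C) feature map.

    Assumed design (not taken from the paper): the average-pooled and
    max-pooled descriptors are summed, then passed through a two-layer
    bottleneck that produces one gate per channel.
    """
    avg_desc = features.mean(axis=(0, 1))  # global average pooling -> (C,)
    max_desc = features.max(axis=(0, 1))   # global maximum pooling -> (C,)
    desc = avg_desc + max_desc             # hypothetical fusion rule
    hidden = np.maximum(desc @ w1, 0.0)    # ReLU bottleneck -> (C // r,)
    weights = sigmoid(hidden @ w2)         # per-channel gates in (0, 1) -> (C,)
    # Broadcasting applies each gate to every spatial position of its channel.
    return features * weights, weights


# Random inputs and weights stand in for learned parameters.
rng = np.random.default_rng(0)
H, W, C, r = 8, 8, 16, 4
features = rng.standard_normal((H, W, C))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1

out, weights = channel_attention(features, w1, w2)
print(out.shape, weights.shape)
```

In this sketch, channels holding features pooled at an object-relevant scale would receive gates closer to 1 after training, while less relevant scale channels would be suppressed. Other fusion rules, such as concatenating the two descriptors, would fit the abstract's description equally well.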



Notes

https://www.tensorflow.org/


Acknowledgements

This paper is partly supported by the National Key Research and Development Program of China (2017YFB0803301) and the Major Scientific and Technological Special Project of Guizhou Province (20183001).

Author information

Authors and Affiliations

National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Jianjun Chen, Wei Ma, Zhengdong Mao & Yue Hu
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Jianjun Chen
Guizhou Provincial Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou, 550025, China
Youliang Tian

Corresponding author

Correspondence to Youliang Tian.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, J., Tian, Y., Ma, W. et al. Scale channel attention network for image segmentation. Multimed Tools Appl 80, 16473–16489 (2021). https://doi.org/10.1007/s11042-020-08921-7

Received: 07 May 2019
Revised: 18 February 2020
Accepted: 07 April 2020
Published: 18 November 2020
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11042-020-08921-7
