BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Cheng, Xiji; Huang, Shiliang; Liao, Bingyan; Wang, Yayun; Luo, Xiao

doi:10.1007/s00371-023-02787-0

BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Original article
Published: 15 February 2023

Volume 40, pages 373–391, (2024)
Cite this article

The Visual Computer Aims and scope Submit manuscript

Xiji Cheng²^na1,
Shiliang Huang¹^na1,
Bingyan Liao ORCID: orcid.org/0000-0001-9544-4728¹,
Yayun Wang¹ &
…
Xiao Luo²

335 Accesses
1 Altmetric
Explore all metrics

Abstract

Semantic segmentation suffers from boundary shift and shape deformation problems due to the neglect of overall guidance information. Motivated by that the object boundaries have a stronger representation for overall information of target objects, we propose a simple yet effective network, boundary-guidance network (BG-Net), for object consistency maintaining in semantic segmentation. In our work, we explore the pixels both near and far from the boundary, characterizing each pixel by utilizing pixel-boundary association. The boundary feature is integrated into the segmentation feature to mitigate boundary shift and shape deformation problem. We explicitly supervise the angle and distance information of pixels pointing to the nearest object boundary. Then the association can be learned by geometric modelling. Meanwhile, the low-level feature emphasized up-sampling (LFEU) module is designed to supplement the detail representation in high-level feature without direct interference. Finally, we evaluate our method on Cityscapes and CamVid datasets. The experimental results demonstrate the superiority of our BG-Net.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

Global Boundary Refinement for Semantic Segmentation via Optimal Transport

SegFix: Model-Agnostic Boundary Refinement for Segmentation

Data Availability

The datasets used in this paper are public datasets and can be obtained by contacting the relevant providers.

References

Peng, G., Yang, S., Wang, H.: Refine for semantic segmentation based on parallel convolutional network with attention model. Neural Process. Lett. 53(6), 4177–4188 (2021)
Article Google Scholar
Liu, S., Ye, H., Jin, K., Cheng, H.: Ct-unet: context-transfer-unet for building segmentation in remote sensing images. Neural Process. Lett. 53(6), 4257–4277 (2021)
Article Google Scholar
Everingham, M., Gool, L.V., Williams, C.K.I., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. Int. J. Comput. Vision 88(2), 303–338 (2010)
Article Google Scholar
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) European Conference on Computer Vision. Springer, Cham (2014)
Google Scholar
Mottaghi, R., Chen, X., Liu, X., Cho, N.G., Lee, S.W., Fidler, S., Yuille, A.: The role of context for object detection and semantic segmentation in the wild. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 891-898 (2014)
Neuhold, G., Ollmann, T., Bulo, S.R., Kontschieder, P.: The mapillary vistas dataset for semantic understanding of street scenes. In: IEEE International Conference on Computer Vision (2017)
Abu Alhaija, H., Mustikovela, S.K., Mescheder, L., Geiger, A., Rother, C.: Augmented reality meets computer vision : efficient data generation for urban driving scenes. Int. J. Comput. Vis. 2, 1–12 (2017)
Google Scholar
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. PAMI 40(4), 834–848 (2018)
Article Google Scholar
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)
Chen, L.C., Papandreou, G., Scroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. CoRR (2017) arXiv:1706.05587
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
Yuan, Y., Chen, X., Wang, J.: Object-contextual representations for semantic segmentation. arXiv preprint arXiv:1909.11065 (2019)
Zhang, H., Dana, K., Shi, J., Zhang, Z., Wang, X., Tyagi, A., Agrawal, A.: Context encoding for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7151–7160 (2018)
Zhang, H., Zhang, H., Wang, C., Xie, J.: Co-occurrent features in semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 548–557 (2019)
Li, X., Zhong, Z., Wu, J., Yang, Y., Lin, Z., Liu, H.: Expectation-maximization attention networks for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 9167–9176 (2019)
He, J., Deng, Z., Zhou, L., Wang, Y., Qiao, Y.: Adaptive pyramid context network for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7519–7528 (2019)
Zhang, D., Zhang, H., Tang, J., Hua, X.-S., Sun, Q.: Self-regulation for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6953–6963 (2021)
Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., Torr, P.H., et al.: Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6881–6890 (2021)
Chen, X., Yuan, Y., Zeng, G., Wang, J.: Semi-supervised semantic segmentation with cross pseudo supervision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2613–2622 (2021)
Acuna, D., Kar, A., Fidler, S.: Devil is in the edges: Learning semantic boundaries from noisy annotations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11075–11083 (2019)
Chen, X., Williams, B.M., Vallabhaneni, S.R., Czanner, G., Williams, R., Zheng, Y.: Learning active contour models for medical image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11632–11640 (2019)
Sun, K., Zhao, Y., Jiang, B., Cheng, T., Xiao, B., Liu, D., Mu, Y., Wang, X., Liu, W., Wang, J.: High-resolution representations for labeling pixels and regions. arXiv preprint arXiv:1904.04514 (2019)
Ke, T.-W., Hwang, J.-J., Liu, Z., Yu, S.X.: Adaptive affinity fields for semantic segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 587–602 (2018)
Liu, Y., Cheng, M.-M., Hu, X., Wang, K., Bai, X.: Richer convolutional features for edge detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3000–3009 (2017)
Ding, H., Jiang, X., Liu, A.Q., Thalmann, N.M., Wang, G.: Boundary-aware feature propagation for scene segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 6819–6829 (2019)
Fieraru, M., Khoreva, A., Pishchulin, L., Schiele, B.: Learning to refine human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 205–214 (2018)
Huang, Z., Wang, X., Huang, L., Huang, C., Wei, Y., Liu, W.: Ccnet: Criss-cross attention for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 603–612 (2019)
Kuo, W., Angelova, A., Malik, J., Lin, T.-Y.: Shapemask: Learning to segment novel objects by refining shape priors. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 9207–9216 (2019)
Kirillov, A., Wu, Y., He, K., Girshick, R.: Pointrend: Image segmentation as rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9799–9808 (2020)
Yuan, Y., Xie, J., Chen, X., Wang, J.: Segfix: Model-agnostic boundary refinement for segmentation. In: European Conference on Computer Vision, pp. 489–506. Springer, (2020)
Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Article Google Scholar
Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1520–1528 (2015)
Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 234–241. Springer, (2015)
Peng, X., Feris, R.S., Wang, X., Metaxas, D.N.: A recurrent encoder-decoder network for sequential face alignment. In: European Conference on Computer Vision, pp. 38–56. Springer, (2016)
Wu, H., Zhang, J., Huang, K., Liang, K., Yu, Y.: Fastfcn: Rethinking dilated convolution in the backbone for semantic segmentation. arXiv preprint arXiv:1903.11816 (2019)
Kirillov, A., Girshick, R., He, K., Dollár, P.: Panoptic feature pyramid networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6399–6408 (2019)
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818 (2018)
Chen, L.-C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)
Yang, M., Yu, K., Zhang, C., Li, Z., Yang, K.: Denseaspp for semantic segmentation in street scenes. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3684–3692 (2018)
Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., Lu, H.: Dual attention network for scene segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3146–3154 (2019)
Zhu, Z., Xu, M., Bai, S., Huang, T., Bai, X.: Asymmetric non-local neural networks for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 593–602 (2019)
Zhang, D., Zhang, H., Tang, J., Wang, M., Hua, X., Sun, Q.: Feature pyramid transformer. arXiv preprint arXiv:2007.09451 (2020)
Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803 (2018)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Xiong, Y., Liao, R., Zhao, H., Hu, R., Bai, M., Yumer, E., Urtasun, R.: Upsnet: A unified panoptic segmentation network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8818–8826 (2019)
Liao, X., Yin, J., Chen, M., Qin, Z.: Adaptive payload distribution in multiple images steganography based on image texture features. IEEE Trans. Dependable Secur. Comput. (2020). https://doi.org/10.1109/TDSC.2020.3004708
Article Google Scholar
Liao, X., Yu, Y., Li, B., Li, Z., Qin, Z.: A new payload partition strategy in color image steganography. IEEE Trans. Circ. Syst. Video Technol. 30(3), 685–696 (2019)
Article Google Scholar
Liao, X., Li, K., Zhu, X., Liu, K.R.: Robust detection of image operator chain with two-stream convolutional neural network. IEEE J. Select. Topics Signal Process. 14(5), 955–968 (2020)
Article Google Scholar
Sun, Y., Chen, Q., He, X., Wang, J., Feng, H., Han, J., Ding, E., Cheng, J., Li, Z., Wang, J.: Singular value fine-tuning: Few-shot segmentation requires few-parameters fine-tuning. arXiv:2206.06122 [cs.CV] (2022)
Li, Z., Sun, Y., Zhang, L., Tang, J.: Ctnet: context-based tandem network for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (2021). https://doi.org/10.1109/TPAMI.2021.3132068
Article Google Scholar
Sun, Y., Li, Z.: Ssa: Semantic structure aware inference for weakly pixel-wise dense predictions without cost. arXiv preprint arXiv:2111.03392 (2021)
Ma, Z., Yuan, M., Gu, J., Meng, W., Xu, S., Zhang, X.: Triple-strip attention mechanism-based natural disaster images classification and segmentation. Vis. Comput. 38(9), 3163–3173 (2022)
Article Google Scholar
Fu, Y., Chen, Q., Zhao, H.: CGFNet: cross-guided fusion network for RGB-thermal semantic segmentation. Vis. Comput. 38(9–10), 3243–3252 (2022)
Article Google Scholar
Liu, T., Cai, Y., Zheng, J., Thalmann, N.M.: Beacon: a boundary embedded attentional convolution network for point cloud instance segmentation. Vis. Comput. 38(7), 2303–2313 (2022)
Article Google Scholar
Takikawa, T., Acuna, D., Jampani, V., Fidler, S.: Gated-scnn: Gated shape cnns for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5229–5238 (2019)
Li, X., Li, X., Zhang, L., Cheng, G., Shi, J., Lin, Z., Tan, S., Tong, Y.: Improving semantic segmentation via decoupled body and edge supervision. arXiv preprint arXiv:2007.10035 (2020)
Roy, K., Sahay, R.R.: A robust multi-scale deep learning approach for unconstrained hand detection aided by skin segmentation. Vis. Comput. 38(8), 2801–2825 (2022)
Article Google Scholar
Jiang, M., Zhai, F., Kong, J.: Sparse attention module for optimizing semantic segmentation performance combined with a multi-task feature extraction network. Vis. Comput. 38(7), 2473–2488 (2022)
Article Google Scholar
Lyu, C., Hu, G., Wang, D.: Attention to fine-grained information: hierarchical multi-scale network for retinal vessel segmentation. Vis. Comput. (2020). https://doi.org/10.1007/s00371-020-02018-w
Article Google Scholar
Cheng, Z., Qu, A., He, X.: Contour-aware semantic segmentation network with spatial attention mechanism for medical image. Vis. Comput. 38(3), 749–762 (2022)
Article Google Scholar
Krähenbühl, P., Koltun, V.: Efficient inference in fully connected crfs with gaussian edge potentials. In: Advances in Neural Information Processing Systems, pp. 109–117 (2011)
Dias, P.A., Medeiros, H.: Semantic segmentation refinement by monte carlo region growing of high confidence detections. In: Asian Conference on Computer Vision, pp. 131–146. Springer, (2018)
Zhang, C., Lin, G., Liu, F., Yao, R., Shen, C.: Canet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5217–5226 (2019)
Fan, M., Lai, S., Huang, J., Wei, X., Chai, Z., Luo, J., Wei, X.: Rethinking BiSeNet for real-time semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 9716-9725 (2021)
Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Adv. Neural Inf. Process. Syst. 28 (2015)
Kimmel, R., Kiryati, N., Bruckstein, A.M.: Sub-pixel distance maps and weighted distance transforms. J. Math. Imaging Vis. 6(2), 223–233 (1996)
Article MathSciNet Google Scholar
Brostow, G.J., Shotton, J., Fauqueur, J., Cipolla, R.: Segmentation and recognition using structure from motion point clouds. In: ECCV (1), pp. 44–57 (2008)
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)
Tao, A., Sapra, K., Catanzaro, B.: Hierarchical multi-scale attention for semantic segmentation. arXiv preprint arXiv:2005.10821 (2020)
Chen, Y., Kalantidis, Y., Li, J., Yan, S., Feng, J.: \(A^2\)-nets: Double attention networks. Adv. Neural Inform. Process. Syst. 31 (2018)
Amirul Islam, M., Rochan, M., Bruce, N.D., Wang, Y.: Gated feedback refinement network for dense image labeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3751–3759 (2017)
Mohan, R., Valada, A.: Efficientps: Efficient panoptic segmentation. arXiv preprint arXiv:2004.02307 (2020)
Cheng, B., Collins, M.D., Zhu, Y., Liu, T., Huang, T.S., Adam, H., Chen, L.-C.: Panoptic-deeplab: A simple, strong, and fast baseline for bottom-up panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12475–12485 (2020)
Chen, L.-C., Lopes, R.G., Cheng, B., Collins, M.D., Cubuk, E.D., Zoph, B., Adam, H., Shlens, J.: Naive-student: Leveraging semi-supervised learning in video sequences for urban scene segmentation. European Conference on Computer Vision, ECCV 2020, pp. 695–714
Kundu, A., Vineet, V., Koltun, V.: Feature space optimization for semantic video segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3168–3175 (2016)
Bilinski, P., Prisacariu, V.: Dense decoder shortcut connections for single-pass semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6596–6605 (2018)
Chandra, S., Couprie, C., Kokkinos, I.: Deep spatio-temporal random fields for efficient video segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8915–8924 (2018)
Zhu, Y., Sapra, K., Reda, F.A., Shih, K.J., Newsam, S., Tao, A., Catanzaro, B.: Improving semantic segmentation via video propagation and label relaxation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8856–8865 (2019)

Download references

Acknowledgements

This study was funded by Zhejiang Provincial Key Laboratory of Harmonized Application of Vision & Transmission, No. 1199, Bin’an Road, Binjiang District, Hangzhou, China.

Author information

Xiji Cheng and Shiliang Huang contributed equally to this work.

Authors and Affiliations

ZheJiang Dahua Technology CO.,LTD, Hangzhou, 310000, China
Shiliang Huang, Bingyan Liao & Yayun Wang
Hangzhou Yulian Technology CO.,LTD, Hangzhou, 310000, China
Xiji Cheng & Xiao Luo

Authors

Xiji Cheng
View author publications
You can also search for this author in PubMed Google Scholar
Shiliang Huang
View author publications
You can also search for this author in PubMed Google Scholar
Bingyan Liao
View author publications
You can also search for this author in PubMed Google Scholar
Yayun Wang
View author publications
You can also search for this author in PubMed Google Scholar
Xiao Luo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bingyan Liao.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Cheng, X., Huang, S., Liao, B. et al. BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation. Vis Comput 40, 373–391 (2024). https://doi.org/10.1007/s00371-023-02787-0

Download citation

Accepted: 05 January 2023
Published: 15 February 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s00371-023-02787-0

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Abstract

Access this article

Similar content being viewed by others

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

Global Boundary Refinement for Semantic Segmentation via Optimal Transport

SegFix: Model-Agnostic Boundary Refinement for Segmentation

Data Availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Abstract

Access this article

Similar content being viewed by others

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

Global Boundary Refinement for Semantic Segmentation via Optimal Transport

SegFix: Model-Agnostic Boundary Refinement for Segmentation

Data Availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation