Simple yet effective joint guidance learning for few-shot semantic segmentation

Chang, Zhaobin; Lu, Yonggang; Ran, Xingcheng; Gao, Xiong; Zhao, Hong

doi:10.1007/s10489-023-04937-2

Published: 26 August 2023

Volume 53, pages 26603–26621, (2023)

Applied Intelligence

Zhaobin Chang¹,
Yonggang Lu¹ (ORCID: 0000-0001-8926-2039),
Xingcheng Ran¹,
Xiong Gao¹ &
…
Hong Zhao²

Abstract

Fully supervised semantic segmentation methods generalize poorly to novel objects, and fine-tuning them usually requires a large number of fully labeled images. Few-shot semantic segmentation (FSS) has recently attracted considerable attention because it can segment a novel object from only a few labeled images. Most recent approaches follow the prototype learning paradigm and have significantly improved segmentation performance. However, two critical bottlenecks remain. (1) Previous methods focus mainly on mining the foreground information of the target object, and class-specific prototypes are generated solely by applying an averaging operation over the whole support image, which may lead to information loss, underutilization, or semantic confusion of the object. (2) Most existing methods guide object segmentation in one direction only, from the support images to the query image, which may cause semantic misalignment because objects in the support and query sets are diverse. To alleviate these problems, we propose a simple yet effective joint guidance learning architecture that generates and aligns more compact and robust prototypes from two aspects. (1) We propose a coarse-to-fine prototype generation module that generates coarse-grained foreground prototypes and fine-grained background prototypes. (2) We design a joint guidance learning module that evaluates and optimizes the prototypes on both support and query images. Extensive experiments show that the proposed method achieves superior segmentation results on the PASCAL-5\(^{i}\) and COCO-20\(^{i}\) datasets.
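For readers unfamiliar with the prototype learning paradigm mentioned in the abstract, the following is a minimal sketch of its generic form: a prototype is the average of the feature vectors under a mask, and each query pixel is labeled by its nearest prototype under cosine similarity. It also includes a reverse check that uses the predicted query mask to re-segment the support image, which is one simple way to test guidance in both directions. This is not the authors' architecture; the function names, the toy feature maps, and the choice of cosine similarity are assumptions made for illustration.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where mask == 1.

    features: H x W x C nested lists of floats.
    mask:     H x W nested lists of 0/1.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def segment_by_prototypes(features, fg_proto, bg_proto):
    """Label a pixel 1 if it is more similar to the foreground prototype."""
    return [
        [1 if cosine(v, fg_proto) > cosine(v, bg_proto) else 0 for v in row]
        for row in features
    ]


def invert(mask):
    """Swap foreground and background in a 0/1 mask."""
    return [[1 - m for m in row] for row in mask]


# Toy 2x2 support feature map with 2 channels (hypothetical values).
support_feats = [[[1.0, 0.0], [0.9, 0.1]],
                 [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]

# Support -> query: build prototypes from the labeled support image.
fg = masked_average_prototype(support_feats, support_mask)
bg = masked_average_prototype(support_feats, invert(support_mask))

query_feats = [[[0.8, 0.2], [0.2, 0.8]]]
query_pred = segment_by_prototypes(query_feats, fg, bg)

# Query -> support: build prototypes from the predicted query mask and
# check that they recover the known support mask.
q_fg = masked_average_prototype(query_feats, query_pred)
q_bg = masked_average_prototype(query_feats, invert(query_pred))
support_pred = segment_by_prototypes(support_feats, q_fg, q_bg)

print(query_pred)
print(support_pred == support_mask)
```

In a real model the feature maps would come from a backbone network and the agreement between `support_pred` and `support_mask` would typically feed a training loss rather than a boolean check.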

Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was funded by the National Key Research and Development Program of China (Grant No. 2017YFE0111900), the National Natural Science Foundation of China (Grant No. 62166025), the Science and Technology Program of Gansu Province (Grant No. 23JRRA1133), and the Gansu Haizhi Characteristic Demonstration Project (Grant No. GSHZTS 2022-2).

Author information

Authors and Affiliations

School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
Zhaobin Chang, Yonggang Lu, Xingcheng Ran & Xiong Gao
School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, China
Hong Zhao

Corresponding author

Correspondence to Yonggang Lu.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chang, Z., Lu, Y., Ran, X. et al. Simple yet effective joint guidance learning for few-shot semantic segmentation. Appl Intell 53, 26603–26621 (2023). https://doi.org/10.1007/s10489-023-04937-2

Accepted: 02 August 2023
Published: 26 August 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10489-023-04937-2
