
Simple yet effective joint guidance learning for few-shot semantic segmentation

Applied Intelligence

Abstract

Fully supervised semantic segmentation methods generalize poorly to novel objects, and fine-tuning them typically requires a large number of fully labeled images. Few-shot semantic segmentation (FSS) has recently attracted considerable attention because it can segment a novel object from only a few labeled images. Most recent approaches follow the prototype learning paradigm and have significantly improved segmentation performance. However, two critical bottlenecks remain. (1) Previous methods mainly focus on mining the foreground information of the target object, and generate class-specific prototypes solely by an averaging operation over the whole support image, which may lead to information loss, underutilization, or semantic confusion of the object. (2) Most existing methods guide object segmentation in the query image unilaterally from the support images, which may result in semantic misalignment because of the diversity of objects in the support and query sets. To alleviate these problems, we propose a simple yet effective joint guidance learning architecture that generates and aligns more compact and robust prototypes in two ways. (1) We propose a coarse-to-fine prototype generation module that generates coarse-grained foreground prototypes and fine-grained background prototypes. (2) We design a joint guidance learning module for prototype evaluation and optimization on both support and query images. Extensive experiments show that the proposed method achieves superior segmentation results on the PASCAL-5\(^{i}\) and COCO-20\(^{i}\) datasets.
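As a rough illustration of the prototype learning paradigm mentioned in the abstract, the sketch below builds foreground and background prototypes by masked averaging over toy support features, labels query positions by cosine similarity, and then runs the same step in the reverse direction (query to support) as a consistency check. It is a minimal sketch under stated assumptions, not the modules proposed in the paper: the function names, the toy 2x2 feature grids, and the use of cosine similarity are all hypothetical choices made for illustration.

```python
import math

def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where mask == 1.

    features: H x W grid of C-dimensional vectors (nested lists).
    mask:     H x W grid of 0/1 labels.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no positions")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def segment(features, fg_proto, bg_proto):
    """Label a position 1 if it is more similar to the foreground prototype."""
    return [[1 if cosine(v, fg_proto) > cosine(v, bg_proto) else 0 for v in row]
            for row in features]

def invert(mask):
    """Swap foreground and background labels."""
    return [[1 - m for m in row] for row in mask]

# Toy data (hypothetical): 2x2 grids of 2-channel features.
support = [[[1.0, 0.0], [0.9, 0.1]],
           [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
query = [[[0.2, 0.8], [0.8, 0.2]],
         [[1.0, 0.1], [0.0, 1.0]]]

# Support -> query: prototypes from the labeled support image guide the query.
fg = masked_average_prototype(support, support_mask)
bg = masked_average_prototype(support, invert(support_mask))
query_pred = segment(query, fg, bg)

# Query -> support: prototypes from the predicted query mask are checked
# against the known support mask (an assumed form of two-way guidance).
q_fg = masked_average_prototype(query, query_pred)
q_bg = masked_average_prototype(query, invert(query_pred))
support_pred = segment(support, q_fg, q_bg)

print(query_pred)
print(support_pred == support_mask)
```

In this toy setting the reverse pass recovers the support mask, which is the kind of agreement a two-way guidance scheme could use as a training signal. A single averaged prototype per class discards spatial detail, which is the limitation the abstract's first point describes.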

Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was funded by the National Key Research and Development Program of China (Grant No. 2017YFE0111900), the National Natural Science Foundation of China (Grant No. 62166025), the Science and Technology Program of Gansu Province (Grant No. 23JRRA1133), and the Gansu Haizhi Characteristic Demonstration Project (Grant No. GSHZTS 2022-2).

Author information

Corresponding author

Correspondence to Yonggang Lu.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chang, Z., Lu, Y., Ran, X. et al. Simple yet effective joint guidance learning for few-shot semantic segmentation. Appl Intell 53, 26603–26621 (2023). https://doi.org/10.1007/s10489-023-04937-2

  • DOI: https://doi.org/10.1007/s10489-023-04937-2
