Category-Aware Siamese Learning Network for Few-Shot Segmentation

Cognitive Computation

Abstract

Few-shot segmentation (FS), which aims to segment an unseen query image based on a few annotated support samples, is an active problem in computer vision and multimedia. The core issue of FS is how to leverage the annotated information in the support images to guide query image segmentation. Existing methods mainly adopt a Siamese Convolutional Neural Network (SCNN), which first encodes both support and query images and then uses masked Global Average Pooling (GAP) to facilitate pixel-level representation and segmentation of the query image. In the FS task, the support and query images share the same category, and this inherent property provides an important cue for segmentation. However, the existing pipeline generally fails to fully exploit this category-coherent information between support and query images. To overcome this limitation, we propose a novel Category-aware Siamese Learning Network (CaSLNet) to encode both support and query images. CaSLNet conducts Category Consistent Learning (CCL) for both support and query images and can thus achieve more sufficient information communication between them. Comprehensive experimental results on several public datasets demonstrate the advantage of the proposed CaSLNet. Our code is publicly available at https://github.com/HuiSun123/CaSLN.
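To make the baseline pipeline mentioned in the abstract concrete, the following is a minimal NumPy sketch of masked Global Average Pooling followed by a cosine-similarity comparison against query features. It illustrates the generic prototype-based approach only, not CaSLNet or its Category Consistent Learning. The function names, the toy feature maps, and the 0.5 threshold are assumptions made for illustration.

```python
import numpy as np


def masked_gap(features, mask, eps=1e-8):
    """Average the support feature vectors over foreground positions.

    features: (C, H, W) feature map from a shared (Siamese) encoder.
    mask:     (H, W) binary foreground mask at feature resolution.
    Returns a (C,) class prototype.
    """
    mask = mask.astype(features.dtype)
    weighted_sum = (features * mask[None, :, :]).sum(axis=(1, 2))
    return weighted_sum / (mask.sum() + eps)


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between each query pixel feature and the prototype.

    features: (C, H, W); prototype: (C,). Returns an (H, W) map.
    """
    dots = np.tensordot(prototype, features, axes=([0], [0]))
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype)
    return dots / (norms + eps)


# Toy example (hypothetical data): channel 0 fires on foreground pixels,
# channel 1 fires on background pixels.
support_mask = np.array([[1, 0], [0, 1]])
support_feat = np.stack([support_mask, 1 - support_mask]).astype(float)

query_fg = np.array([[0, 1], [1, 1]])
query_feat = np.stack([query_fg, 1 - query_fg]).astype(float)

prototype = masked_gap(support_feat, support_mask)
similarity = cosine_similarity_map(query_feat, prototype)
predicted_mask = similarity > 0.5  # assumed threshold, for illustration only
```

In this toy setting the prototype is determined by the support foreground alone, so the thresholded similarity map recovers the query foreground. Real methods typically feed the prototype or similarity map into a learned decoder instead of thresholding it directly.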

Data Availability

The data that support the findings of this study are openly available at http://host.robots.ox.ac.uk/pascal/VOC/voc2012 and https://cocodataset.org/#download.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62106006, No. 62376005).

Author information

Corresponding author

Correspondence to Lili Huang.

Ethics declarations

Ethics Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflict of Interest

Hui Sun is a Master’s graduate of Anhui University. Ziyan Zhang is a Ph.D. student at Anhui University. Lili Huang is a faculty member at Anhui University. Bo Jiang and Bin Luo are professors at Anhui University. Apart from their affiliations with the mentioned educational institutions, research institutes, and companies, all authors declare no conflicts of interest with external entities.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, H., Zhang, Z., Huang, L. et al. Category-Aware Siamese Learning Network for Few-Shot Segmentation. Cogn Comput (2024). https://doi.org/10.1007/s12559-024-10273-5

  • DOI: https://doi.org/10.1007/s12559-024-10273-5
