Kernel-based similarity sorting and allocation for few-shot semantic segmentation

  • Original Article
Neural Computing and Applications

Abstract

Few-shot semantic segmentation tackles the problem of recognizing objects of novel classes in images given only a few annotated exemplars. The key problem is how to effectively model the correspondences between support and query features. Previous works tackle this problem by prototype matching or distance-based metric learning. In this work, we introduce a kernel-based similarity matching model that enforces robust guidance from both foreground and background semantics. In addition, guidance sorting and allocation modules are presented to better exploit the guidance from the support set. Specifically, the guidance sorting module calibrates the most similar semantic patterns on the query maps for each support pixel and produces index vectors, while the allocation module selects the most representative correspondences on the similarity maps based on these index vectors. To integrate the kernel-based similarity features, we define a pyramidal paradigm that progressively integrates the guidance signal, query features and mask priors. In this way, the relationships between support and query features are dynamically explored in both foreground and background semantics. Extensive qualitative and quantitative evaluations on PASCAL-5ⁱ, COCO-20ⁱ and FSS-1000 are conducted to demonstrate the efficiency and advantages of the proposed method. Experimental results show that our method performs favorably against state-of-the-art methods at a reasonable computational cost.
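
The sorting and allocation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel, so a Gaussian (RBF) kernel is assumed here, and the function names (`rbf_kernel`, `similarity_map`, `sort_guidance`, `allocate`), the bandwidth `sigma`, and the cut-off `k` are all hypothetical. Features are plain lists of floats, one vector per pixel; the foreground/background masks and the pyramidal integration are omitted.

```python
import math


def rbf_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def similarity_map(support, query, sigma=1.0):
    """Kernel similarity of every support pixel (rows) to every query pixel (columns)."""
    return [[rbf_kernel(s, q, sigma) for q in query] for s in support]


def sort_guidance(sim):
    """Sorting step: for each support pixel, return query indices ordered
    from most to least similar (one index vector per support pixel)."""
    return [sorted(range(len(row)), key=lambda j: -row[j]) for row in sim]


def allocate(sim, index_vectors, k):
    """Allocation step: use the index vectors to gather the k strongest
    correspondences from the similarity map for each support pixel."""
    return [[row[j] for j in idx[:k]] for row, idx in zip(sim, index_vectors)]


# Toy example: 2 support pixels and 3 query pixels with 2-D features.
support = [[0.0, 0.0], [1.0, 1.0]]
query = [[1.0, 1.0], [0.0, 0.0], [0.0, 1.0]]

sim = similarity_map(support, query)
index_vectors = sort_guidance(sim)
top_matches = allocate(sim, index_vectors, k=2)
```

In this toy case each support pixel has an identical counterpart in the query, so its index vector starts with that query position and the first allocated similarity is exactly 1. In a real model the same pattern would typically be applied to dense feature maps with batched tensor operations rather than Python lists.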

Author information

Corresponding author

Correspondence to Jian-wei Liu.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Kernel-based Similarity Sorting and Allocation for Few-Shot Semantic Segmentation”.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Zy., Liu, Jw. Kernel-based similarity sorting and allocation for few-shot semantic segmentation. Neural Comput & Applic 34, 21939–21960 (2022). https://doi.org/10.1007/s00521-022-07654-y

  • DOI: https://doi.org/10.1007/s00521-022-07654-y
