
CRCNet: Few-Shot Segmentation with Cross-Reference and Region–Global Conditional Networks

Published in: International Journal of Computer Vision

Abstract

Few-shot segmentation aims to learn a segmentation model that generalizes to novel classes from only a few annotated training images. In this paper, we propose CRCNet, a Cross-Reference and Region–Global Conditional Network for few-shot segmentation. Unlike previous works that predict only the query image's mask, our model concurrently makes predictions for both the support image and the query image. With a cross-reference mechanism, the network can better locate the objects that co-occur in the two images, which benefits the few-shot segmentation task. To further improve feature comparison, we develop a region–global conditional module to capture both local and global relations. We also develop a mask refinement module that recurrently refines the predicted foreground regions. Experiments on the PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art performance.
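The core idea of the cross-reference mechanism is that features fired by *both* the support and the query image likely belong to the common object class, so both feature maps are reweighted by a shared gate. The following is a minimal NumPy sketch of that idea only; the function name `cross_reference` and the global-average-pool plus sigmoid gating are illustrative assumptions, not the paper's exact architecture (which uses learned layers):

```python
import numpy as np

def cross_reference(feat_s, feat_q):
    """Reweight support/query feature maps by a gate they share.

    feat_s, feat_q: (C, H, W) feature maps from a shared backbone.
    """
    # Global-average-pool each map into a per-channel descriptor.
    v_s = feat_s.mean(axis=(1, 2))           # (C,) support descriptor
    v_q = feat_q.mean(axis=(1, 2))           # (C,) query descriptor
    # Sigmoid squashes each descriptor into a (0, 1) channel gate.
    g_s = 1.0 / (1.0 + np.exp(-v_s))
    g_q = 1.0 / (1.0 + np.exp(-v_q))
    # The product is large only for channels active in BOTH images,
    # emphasizing co-occurring objects and suppressing the rest.
    gate = g_s * g_q                          # (C,)
    out_s = feat_s * gate[:, None, None]      # reweighted support features
    out_q = feat_q * gate[:, None, None]      # reweighted query features
    return out_s, out_q
```

Because the gate is shared, the module is symmetric in its two inputs, matching the paper's design of predicting masks for the support and query images concurrently.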



Acknowledgements

This research is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG-RP-2018-003), the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MOE-T2EP20220-0007) and Tier 1 (RG95/20). This research is also partly supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funds (Grant No. A20H6b0151).

Author information

Correspondence to Guosheng Lin.

Additional information

Communicated by Christoph H. Lampert.



About this article


Cite this article

Liu, W., Zhang, C., Lin, G. et al. CRCNet: Few-Shot Segmentation with Cross-Reference and Region–Global Conditional Networks. Int J Comput Vis 130, 3140–3157 (2022). https://doi.org/10.1007/s11263-022-01677-7
