Abstract
Current methods for few-shot segmentation (FSSeg) have mainly focused on improving performance on novel classes while neglecting performance on base classes. To overcome this limitation, the task of generalized few-shot semantic segmentation (GFSSeg) has been introduced, which aims to predict segmentation masks for both base and novel classes. However, current prototype-based methods do not explicitly consider the relationship between base and novel classes when updating prototypes, which limits their ability to identify the true categories. To address this challenge, we propose a class contrastive loss and a class relationship loss that regulate prototype updates and encourage large distances between prototypes of different classes, distinguishing the classes from one another while maintaining performance on the base classes. Our proposed approach achieves new state-of-the-art performance on the generalized few-shot segmentation task on the PASCAL VOC and MS COCO datasets.
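The abstract does not give the exact form of the proposed losses, so the snippet below is only a minimal PyTorch sketch of the general idea of a class-contrastive regularizer over prototypes: pushing prototypes of different classes apart so that base and novel classes remain separable. The function name, the cosine-similarity formulation, and the `margin` parameter are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch of a class-contrastive regularizer over prototypes.
# NOT the paper's exact loss: the margin, the cosine-similarity form, and
# the names `prototypes`/`margin` are assumptions for illustration only.
import torch
import torch.nn.functional as F

def class_contrastive_loss(prototypes: torch.Tensor, margin: float = 0.5) -> torch.Tensor:
    """Encourage large distances between prototypes of different classes.

    Args:
        prototypes: (C, D) tensor, one D-dim prototype per class
                    (base and novel classes concatenated).
        margin: similarity threshold below which a pair incurs no penalty.
    """
    p = F.normalize(prototypes, dim=1)   # unit-norm prototypes
    sim = p @ p.t()                      # (C, C) pairwise cosine similarities
    C = sim.size(0)
    off_diag = ~torch.eye(C, dtype=torch.bool, device=sim.device)
    # Penalize pairs of *different* classes whose prototypes are still too similar.
    return F.relu(sim[off_diag] - margin).mean()

# Usage: add the regularizer to the segmentation loss when updating prototypes.
protos = torch.randn(20, 256, requires_grad=True)  # e.g., 15 base + 5 novel classes
loss = class_contrastive_loss(protos)
loss.backward()
```

Minimizing this term drives off-diagonal prototype similarities below the margin, which matches the abstract's stated goal of keeping base- and novel-class prototypes well separated during updates.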
Acknowledgements
This research is supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funds (Grant No. A20H6b0151).
Additional information
Communicated by Bumsub Ham.
About this article
Cite this article
Liu, W., Wu, Z., Zhao, Y. et al. Harmonizing Base and Novel Classes: A Class-Contrastive Approach for Generalized Few-Shot Segmentation. Int J Comput Vis (2023). https://doi.org/10.1007/s11263-023-01939-y