Abstract
Incremental few-shot semantic segmentation aims to extend a semantic segmentation model to novel classes using only a few labeled samples, while preserving its segmentation capability on previously learned base classes. However, semantic aliasing between base and novel classes severely limits the quality of the segmentation results. To alleviate this issue, we propose a semantic-guided relation alignment and adaptation method. Specifically, we first conduct semantic relation alignment in the base step to align base class representations with their semantic information, so that base class embeddings are constrained to have relatively low semantic correlations with classes other than their own. Then, building on the semantically aligned base classes, we conduct semantic-guided adaptation during incremental learning, which aims to ensure affinities between the visual and semantic embeddings of the encountered novel classes, thereby making their feature representations consistent with their semantic information. In this way, the semantic-aliasing issue can be suppressed. We evaluate our model on the PASCAL VOC 2012 and COCO datasets. The experimental results demonstrate the effectiveness of the proposed method.
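The alignment idea in the abstract can be illustrated with a minimal sketch. It assumes that visual class embeddings and semantic (word) embeddings already live in a shared space of the same dimension, and it uses a softmax cross-entropy over cosine similarities as the alignment objective. The loss form, the temperature value, and the names l2_normalize and relation_alignment_loss are assumptions made for illustration; the abstract does not specify the paper's actual formulation.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def relation_alignment_loss(visual, semantic, temperature=0.1):
    """Hypothetical alignment loss (an assumption, not the paper's loss).

    visual:   (C, D) array, one visual embedding per class.
    semantic: (C, D) array, one semantic embedding per class.
    Each visual embedding is pushed toward its own class's semantic
    embedding and away from the semantic embeddings of other classes.
    """
    v = l2_normalize(visual)
    s = l2_normalize(semantic)
    logits = v @ s.T / temperature                     # (C, C) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching class (the diagonal) as the target.
    return float(-np.mean(np.diag(log_probs)))


# Toy data: 5 classes, 16-dimensional embeddings (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
semantic = rng.normal(size=(5, 16))
aligned = semantic + 0.01 * rng.normal(size=(5, 16))  # visual close to own semantic vector
shuffled = np.roll(semantic, shift=1, axis=0)         # visual matched to the wrong class

loss_aligned = relation_alignment_loss(aligned, semantic)
loss_shuffled = relation_alignment_loss(shuffled, semantic)
```

In this toy setting the loss is lower when each visual embedding sits near its own class's semantic vector than when the embeddings are matched to the wrong classes, which is the behavior an alignment objective of this kind is meant to reward.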
Acknowledgment
This work was supported by the National Key Research and Development Program (Grant No. 2019YFA0706200) and the National Natural Science Foundation of China (Grant Nos. 62072152, 62172137, and 72188101).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, Y., Chen, X., Guo, Y., Yu, J., Hong, R., Tian, Q. (2024). Advancing Incremental Few-Shot Semantic Segmentation via Semantic-Guided Relation Alignment and Adaptation. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14554. Springer, Cham. https://doi.org/10.1007/978-3-031-53305-1_19
DOI: https://doi.org/10.1007/978-3-031-53305-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53304-4
Online ISBN: 978-3-031-53305-1
eBook Packages: Computer Science, Computer Science (R0)