Abstract
Object categories inherently form a hierarchy with different levels of classification granularity. This hierarchy encodes rich semantic relationships among categories, which can benefit fine-grained visual categorization (FGVC) but is overlooked by most previous work. In this paper, we present a novel multi-granularity feature representation learning framework for FGVC based on graph neural networks, which boosts feature learning at different grain levels simultaneously and improves categorization at multiple granularities. Under this framework, we propose two kinds of correlation graphs: the Abstract Graph (AG) and the Detailed Graph (DG). AG assigns one node to each grain level, while DG treats the different categories at each grain level as different nodes. Building on AG and DG, we propose two multi-granularity feature learning methods based on graph neural networks. With AG, a gated graph neural network is used to explore the interactions between features from different grain levels and to help learn a more discriminative and comprehensive feature representation for each grain level. With DG, we employ a graph convolutional network to model the hierarchical semantic relationships among categories and enhance the features by regularizing the division of the semantic space. To facilitate research, we construct a large-scale car dataset, Car-FG3K (available at http://www.nlpr.ia.ac.cn/iva/homepage/jqwang/Car-FG3K.htm), which covers three-level categories and is more challenging than existing car datasets in terms of category count and view variation. We conduct experiments on this new dataset and two other datasets, CUB-200-2011 and FGVC-Aircraft, and our methods achieve results comparable to state-of-the-art methods.
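The two graph types described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-level hierarchy, the helper names, the choice to link every pair of levels in AG, and the choice to link each category to its parent in DG. The propagation step uses the standard symmetric normalization from graph convolutional networks, without learned weights.

```python
import math

# Toy two-level hierarchy, invented for illustration (not from Car-FG3K).
LEVELS = {
    "make": ["Audi", "BMW"],
    "model": ["Audi A4", "Audi A6", "BMW X5"],
}
PARENT = {"Audi A4": "Audi", "Audi A6": "Audi", "BMW X5": "BMW"}


def abstract_graph(levels):
    """AG: one node per grain level; linking every pair of levels is an assumption."""
    n = len(levels)
    return [[1.0 if i != j else 0.0 for j in range(n)] for i in range(n)]


def detailed_graph(levels, parent):
    """DG: one node per category; parent-child edges are an assumption."""
    nodes = [c for names in levels.values() for c in names]
    index = {c: i for i, c in enumerate(nodes)}
    adj = [[0.0] * len(nodes) for _ in nodes]
    for child, par in parent.items():
        i, j = index[child], index[par]
        adj[i][j] = adj[j][i] = 1.0
    return nodes, adj


def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in GCN layers."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def propagate(adj_norm, feats):
    """One weight-free step: mix each node's features with its neighbours'."""
    n, d = len(feats), len(feats[0])
    return [[sum(adj_norm[i][k] * feats[k][c] for k in range(n))
             for c in range(d)] for i in range(n)]


nodes, dg = detailed_graph(LEVELS, PARENT)
ag = abstract_graph(LEVELS)
# One-hot features make the result equal to the normalized adjacency itself.
one_hot = [[1.0 if i == j else 0.0 for j in range(len(nodes))]
           for i in range(len(nodes))]
smoothed = propagate(normalize(dg), one_hot)
```

In this sketch, a category such as "Audi A4" receives a nonzero contribution from its parent "Audi" and none from the unrelated "BMW X5", which is the kind of hierarchy-aware mixing a graph convolution over DG would provide. A real model would add learned weight matrices and nonlinearities, and would use gated updates on AG.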
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61772527, 62002356, 62076235, 61976210, 62176254, 62002357 and 62006230), the Ministry of Education Industry-University Cooperative Education Program (Wei Qiao Venture Group, No. E1425201) and the Open Research Projects of Zhejiang Lab (No. 2021KH0AB07).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, H., Guo, H., Miao, Q., Huang, M., Wang, J. (2022). Graph Neural Networks Based Multi-granularity Feature Representation Learning for Fine-Grained Visual Categorization. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_20
DOI: https://doi.org/10.1007/978-3-030-98355-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98354-3
Online ISBN: 978-3-030-98355-0
eBook Packages: Computer Science, Computer Science (R0)