Region Graph Embedding Network for Zero-Shot Learning

Xie, Guo-Sen; Liu, Li; Zhu, Fan; Zhao, Fang; Zhang, Zheng; Yao, Yazhou; Qin, Jie; Shao, Ling

doi:10.1007/978-3-030-58548-8_33

Conference paper
First Online: 29 October 2020

5931 Accesses
54 Citations

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12349))

Abstract

Most of the existing Zero-Shot Learning (ZSL) approaches learn direct embeddings from global features or image parts (regions) to the semantic space, which, however, fail to capture the appearance relationships between different local regions within a single image. In this paper, to model the relations among local image regions, we incorporate the region-based relation reasoning into ZSL. Our method, termed as Region Graph Embedding Network (RGEN), is trained end-to-end from raw image data. Specifically, RGEN consists of two branches: the Constrained Part Attention (CPA) branch and the Parts Relation Reasoning (PRR) branch. CPA branch is built upon attention and produces the image regions. To exploit the progressive interactions among these regions, we represent them as a region graph, on which the parts relation reasoning is performed with graph convolutions, thus leading to our PRR branch. To train our model, we introduce both a transfer loss and a balance loss to contrast class similarities and pursue the maximum response consistency among seen and unseen outputs, respectively. Extensive experiments on four datasets well validate the effectiveness of the proposed method under both ZSL and generalized ZSL settings.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

1.
In this paper, part and region are alternatively used.

References

Akata, Z., Malinowski, M., Fritz, M., Schiele, B.: Multi-cue zero-shot learning with strong supervision. In: CVPR (2016)
Google Scholar
Akata, Z., Perronnin, F., Harchaoui, Z., Schmid, C.: Label-embedding for attribute-based classification. In: CVPR (2013)
Google Scholar
Akata, Z., Perronnin, F., Harchaoui, Z., Schmid, C.: Label-embedding for image classification. In: TPAMI (2016)
Google Scholar
Akata, Z., Reed, S., Walter, D., Lee, H., Schiele, B.: Evaluation of output embeddings for fine-grained image classification. In: CVPR (2015)
Google Scholar
Annadani, Y., Biswas, S.: Preserving semantic relations for zero-shot learning. In: CVPR (2018)
Google Scholar
Cacheux, Y., Borgne, H., Crucianu, M.: Modeling inter and intra-class relations in the triplet loss for zero-shot learning. In: ICCV (2019)
Google Scholar
Changpinyo, S., Chao, W.L., Gong, B., Sha, F.: Synthesized classifiers for zero-shot learning. In: CVPR (2016)
Google Scholar
Chao, W.-L., Changpinyo, S., Gong, B., Sha, F.: An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 52–68. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_4
Chapter Google Scholar
Chen, L., Zhang, H., Xiao, J., Liu, W., Chang, S.F.: Zero-shot visual recognition using semantics-preserving adversarial embedding network. In: CVPR (2018)
Google Scholar
Elhoseiny, M., Elfeki, M.: Creativity inspired zero-shot learning. In: ICCV (2019)
Google Scholar
Elhoseiny, M., Zhu, Y., Zhang, H., Elgammal, A.M.: Link the head to the "beak": zero shot learning from noisy text description at part precision. In: CVPR (2017)
Google Scholar
Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR (2009)
Google Scholar
Felix, R., Kumar, V.B., Reid, I., Carneiro, G.: Multi-modal cycle-consistent generalized zero-shot learning. In: ECCV (2008)
Google Scholar
Frome, A., Corrado, G.S., Shlens, J., Bengio, S., Dean, J., T. Mikolov, E.A.: DeViSE: a deep visual-semantic embedding model. In: NeurIPS (2013)
Google Scholar
Fu, Y., Hospedales, T.M., Xiang, T., Gong, S.: Transductive multi-view zero-shot learning. In: TPAMI (2015)
Google Scholar
Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS (2014)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Google Scholar
Jayaraman, D., Grauman, K.: Zero-shot recognition with unreliable attributes. In: NeurIPS (2014)
Google Scholar
Jiang, H., Wang, R., Shan, S., Chen, X.: Transferable contrastive network for generalized zero-shot learning. In: ICCV (2019)
Google Scholar
Kampffmeyer, M., Chen, Y., Liang, X., Wang, H., Zhang, Y., Xing, E.: Rethinking knowledge graph propagation for zero-shot learning. In: CVPR (2019)
Google Scholar
Kipf, T., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv:1609.02907 (2016)
Kodirov, E., Xiang, T., Fu, Z., Gong, S.: Unsupervised domain adaptation for zero-shot learning. In: ICCV (2015)
Google Scholar
Kodirov, E., Xiang, T., Gong, S.: Semantic autoencoder for zero-shot learning. In: CVPR (2017)
Google Scholar
Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR (2009)
Google Scholar
Li, X., Yang, F., Cheng, H., Liu, W., Shen, D.: Contour knowledge transfer for salient object detection. In: ECCV (2018)
Google Scholar
Li, Y., Zhang, J., Zhang, J., Huang, K.: Discriminative learning of latent features for zero-shot recognition. In: CVPR (2018)
Google Scholar
Liu, S., Long, M., Wang, J., Jordan, M.: Generalized zero-shot learning with deep calibration network. In: NeurIPS (2018)
Google Scholar
Liu, Y., Guo, J., Cai, D., He, X.: Attribute attention for semantic disambiguation in zero-shot learning. In: ICCV (2019)
Google Scholar
Long, Y., Liu, L., Shen, F., Shao, L., Li, X.: Zero-shot learning using synthesised unseen visual data with diffusion regularisation. In: TPAMI (2017)
Google Scholar
Lu, X., Wang, W., Ma, C., Shen, J., Shao, L., Porikli, F.: See more, know more: unsupervised video object segmentation with co-attention siamese networks. In: CVPR (2019)
Google Scholar
Lu, X., Wang, W., Martin, D., Zhou, T., Shen, J., Luc, V.G.: Video object segmentation with episodic graph memory networks. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Maaten, L.V.D., Hinton, G.: Visualizing data using t-SNE. JMLR 9, 2579–2605 (2008)
MATH Google Scholar
Morgado, P., Vasconcelos, N.: Semantically consistent regularization for zero-shot recognition. In: CVPR (2017)
Google Scholar
Norouzi, M., et al.: Zero-shot learning by convex combination of semantic embeddings. In: NeurIPS (2014)
Google Scholar
Palatucci, M., Pomerleau, D., Hinton, G.E., Mitchell, T.M.: Zero-shot learning with semantic output codes. In: NeurIPS (2009)
Google Scholar
Patterson, G., Hays, J.: Sun attribute database: discovering, annotating, and recognizing scene attributes. In: CVPR (2012)
Google Scholar
Qiao, R., Liu, L., Shen, C., van den Hengel, A.: Less is more: zero-shot learning from online textual documents with noise suppression. In: CVPR (2016)
Google Scholar
Reed, S., Akata, Z., Lee, H., Schiele, B.: Learning deep representations of fine-grained visual descriptions. In: CVPR (2016)
Google Scholar
Romera-Paredes, B., Torr, P.: An embarrassingly simple approach to zero-shot learning. In: ICML (2015)
Google Scholar
Shen, Y., Qin, J., Huang, L., Liu, L., Zhu, F., Shao, L.: Invertible zero-shot recognition flows. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Socher, R., Ganjoo, M., Manning, C.D., Ng, A.: Zero-shot learning through cross-modal transfer. In: NeurIPS (2013)
Google Scholar
Song, J., Shen, C., Yang, Y., Liu, Y., Song, M.: Transductive unbiased embedding for zero-shot learning. In: CVPR (2018)
Google Scholar
Verma, V.K., Arora, G., Mishra, A., Rai, P.: Generalized zero-shot learning via synthesized examples. In: CVPR (2018)
Google Scholar
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-UCSD birds-200-2011 dataset. In: Technical report (2011)
Google Scholar
Wang, X., Ye, Y., Gupta, A.: Zero-shot recognition via semantic embeddings and knowledge graphs. In: CVPR (2018)
Google Scholar
Wu, B., et al.: Tencent ml-images: a large-scale multi-label image database for visual representation learning. IEEE Access 7, 172683–172693 (2019)
Article Google Scholar
Wu, B., Jia, F., Liu, W., Ghanem, B., Lyu, S.: Multi-label learning with missing labels using mixed dependency graphs. Int. J. Comput. Vis. 126, 875–896 (2018)
Article MathSciNet Google Scholar
Xian, Y., Akata, Z., Sharma, G., Nguyen, Q., Hein, M., Schiele, B.: Latent embeddings for zero-shot classification. In: CVPR (2016)
Google Scholar
Xian, Y., Lorenz, T., Schiele, B., Akata, Z.: Feature generating networks for zero-shot learning. In: CVPR (2018)
Google Scholar
Xian, Y., Schiele, B., Akata, Z.: Zero-shot learning-the good, the bad and the ugly. In: CVPR (2017)
Google Scholar
Xian, Y., Sharma, S., Saurabh, S., Akata, Z.: f-VAEGAN-D2: a feature generating framework for any-shot learning. In: CVPR (2019)
Google Scholar
Xie, G.S., et al.: Attentive region embedding network for zero-shot learning. In: CVPR (2019)
Google Scholar
Xie, G.S., Zhang, X.Y., Yang, W., Xu, M., Yan, S., Liu, C.L.: LG-CNN: from local parts to global discrimination for fine-grained recognition. Pattern Recogn. 71, 118–131 (2017)
Article Google Scholar
Xie, G.S., et al.: SRSC: selective, robust, and supervised constrained feature representation for image classification. IEEE Trans. Neural Netw. Learn. Syst. 31, 4290–4302 (2019)
Article MathSciNet Google Scholar
Xu, H., Saenko, K.: Ask, attend and answer: exploring question-guided spatial attention for visual question answering. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 451–466. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_28
Chapter Google Scholar
Xu, J., Zhao, R., Zhu, F., Wang, H., Ouyang, W.: Attention-aware compositional network for person re-identification. arXiv:1805.03344 (2018)
Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: ICML (2015)
Google Scholar
Yang, F.S.Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: Relation network for few-shot learning. In: CVPR (2018)
Google Scholar
Yang, G., Liu, J., Xu, J., Li, X.: Dissimilarity representation learning for generalized zero-shot recognition. In: MM (2018)
Google Scholar
Yao, Y., et al.: Exploiting web images for multi-output classification: from category to subcategories. IEEE Trans. Neural Netw. Learn. Syst. 31, 2348–2360 (2020)
Google Scholar
Yao, Y., Zhang, J., Shen, F., Hua, X., Xu, J., Tang, Z.: Exploiting web images for dataset construction: a domain robust approach. IEEE Trans. Multimedia 19, 1771–1784 (2017)
Article Google Scholar
Ye, M., Guo, Y.: Zero-shot classification with discriminative semantic representation learning. In: CVPR (2017)
Google Scholar
Yu, H., Lee, B.: Zero-shot learning via simultaneous generating and learning. In: NeurIPS (2019)
Google Scholar
Yu, Y., Ji, Z., Fu, Y., Guo, J., Pang, Y., Zhang, Z.: Stacked semantics-guided attention model for fine-grained zero-shot learning. In: NeurIPS (2018)
Google Scholar
Yu, Y., Ji, Z., Han, J., Zhang, Z.: Episode-based prototype generating network for zero-shot learning. In: CVPR (2020)
Google Scholar
Zhang, L., Xiang, T., Gong, S., et al.: Learning a deep embedding model for zero-shot learning. In: CVPR (2017)
Google Scholar
Zhang, L., et al.: Towards effective deep embedding for zero-shot learning. IEEE Trans. Circ. Syst. Video Technol. 30, 2843–2852 (2020)
Article Google Scholar
Zhang, L., et al.: Adaptive importance learning for improving lightweight image super-resolution network. Int. J. Comput. Vis. 128, 479–499 (2020)
Article MathSciNet Google Scholar
Zhang, L., et al.: Unsupervised domain adaptation using robust class-wise matching. IEEE Trans. Circ. Syst. Video Technol. 29, 1339–1349 (2018)
Article Google Scholar
Zhang, L., Wei, W., Bai, C., Gao, Y., Zhang, Y.: Exploiting clustering manifold structure for hyperspectral imagery super-resolution. IEEE Trans. Image Process. 27, 5969–5982 (2018)
Article MathSciNet Google Scholar
Zhang, L., Wei, W., Zhang, Y., Shen, C., Van Den Hengel, A., Shi, Q.: Cluster sparsity field: an internal hyperspectral imagery prior for reconstruction. Int. J. Comput. Vis. 126, 797–821 (2018)
Article Google Scholar
Zhang, Z., Saligrama, V.: Zero-shot learning via semantic similarity embedding. In: ICCV (2015)
Google Scholar
Zhang, Z., Saligrama, V.: Zero-shot learning via joint latent similarity embedding. In: CVPR (2016)
Google Scholar
Zhang, Z., Liu, L., Shen, F., Shen, H.T., Shao, L.: Binary multi-view clustering. IEEE Trans. Pattern Anal. Mach. Intell. 41, 1774–1782 (2018)
Article Google Scholar
Zhao, F., Liao, S., Xie, G.S., Zhao, J., Zhang, K., Shao, L.: Unsupervised domain adaptation with noise resistible mutual-training for person re-identification. In: ECCV (2020)
Google Scholar
Zhao, F., Zhao, J., Yan, S., Feng, J.: Dynamic conditional networks for few-shot learning. In: ECCV (2018)
Google Scholar
Zhou, B., Khosla, A.A.L., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: CVPR (2016)
Google Scholar
Zhu, P., Wang, H., Saligrama, V.: Generalized zero-shot recognition based on visually semantic embedding. In: CVPR (2019)
Google Scholar
Zhu, Y., Elhoseiny, M., Liu, B., Peng, X., Elgammal, A.: A generative adversarial approach for zero-shot learning from noisy texts. In: CVPR (2018)
Google Scholar
Zhu, Y., Xie, J., Tang, Z., Peng, X., Elgammal, A.: Learning where to look: semantic-guided multi-attention localization for zero-shot learning. In: NeurIPS (2019)
Google Scholar

Download references

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61702163 and 61976116), the Fundamental Research Funds for the Central Universities (Nos. 30920021135), and the Key Project of Shenzhen Municipal Technology Research (Nos. JSGG20200103103401723).

Author information

Authors and Affiliations

Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
Guo-Sen Xie, Li Liu, Fan Zhu, Fang Zhao, Jie Qin & Ling Shao
Harbin Institute of Technology, Shenzhen, China
Zheng Zhang
Peng Cheng Laboratory, Shenzhen, China
Zheng Zhang
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Ling Shao
Nanjing University of Science and Technology, Nanjing, China
Yazhou Yao

Authors

Guo-Sen Xie
View author publications
You can also search for this author in PubMed Google Scholar
Li Liu
View author publications
You can also search for this author in PubMed Google Scholar
Fan Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Fang Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Zheng Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Yazhou Yao
View author publications
You can also search for this author in PubMed Google Scholar
Jie Qin
View author publications
You can also search for this author in PubMed Google Scholar
Ling Shao
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Guo-Sen Xie or Zheng Zhang .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 9583 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Xie, GS. et al. (2020). Region Graph Embedding Network for Zero-Shot Learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12349. Springer, Cham. https://doi.org/10.1007/978-3-030-58548-8_33

Download citation

DOI: https://doi.org/10.1007/978-3-030-58548-8_33
Published: 29 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58547-1
Online ISBN: 978-3-030-58548-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics