Abstract
Deep generative models have been successfully applied to Zero-Shot Learning (ZSL) recently. However, the underlying drawbacks of GANs and VAEs (e.g., the hardness of training with ZSL-oriented regularizers and the limited generation quality) hinder the existing generative ZSL models from fully bypassing the seen-unseen bias. To tackle the above limitations, for the first time, this work incorporates a new family of generative models (i.e., flow-based models) into ZSL. The proposed Invertible Zero-shot Flow (IZF) learns factorized data embeddings (i.e., the semantic factors and the non-semantic ones) with the forward pass of an invertible flow network, while the reverse pass generates data samples. This procedure theoretically extends conventional generative flows to a factorized conditional scheme. To explicitly solve the bias problem, our model enlarges the seen-unseen distributional discrepancy based on a negative sample-based distance measurement. Notably, IZF works flexibly with either a naive Bayesian classifier or a held-out trainable one for zero-shot recognition. Experiments on widely-adopted ZSL benchmarks demonstrate the significant performance gain of IZF over existing methods, in both classic and generalized settings.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
Note that reverse pass and back-propagation are different concepts.
References
Akata, Z., Reed, S., Walter, D., Lee, H., Schiele, B.: Evaluation of output embeddings for fine-grained image classification. In: CVPR (2015)
Ardizzone, L., et al.: Analyzing inverse problems with invertible neural networks. In: ICLR (2019)
Ardizzone, L., Lüth, C., Kruse, J., Rother, C., Köthe, U.: Guided image generation with conditional invertible neural networks. arXiv preprint arXiv:1907.02392 (2019)
Cacheux, Y.L., Borgne, H.L., Crucianu, M.: Modeling inter and intra-class relations in the triplet loss for zero-shot learning. In: ICCV (2019)
Che, T., Li, Y., Jacob, A.P., Bengio, Y., Li, W.: Mode regularized generative adversarial networks. In: ICLR (2017)
Dinh, L., Krueger, D., Bengio, Y.: Nice: non-linear independent components estimation. In: ICLR Workshops (2014)
Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real NVP. In: ICLR (2017)
Elhoseiny, M., Elfeki, M.: Creativity inspired zero-shot learning. In: ICCV (2019)
Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.A.: Describing objects by their attributes. In: CVPR (2009)
Frome, A., et al.: DeViSE: a deep visual-semantic embedding model. In: NeurIPS (2013)
Gao, R., et al.: Zero-VAE-GAN: generating unseen features for generalized and transductive zero-shot learning. IEEE Trans. Image Process. 29, 3665–3680 (2020)
Gao, R., Hou, X., Qin, J., Liu, L., Zhu, F., Zhang, Z.: A joint generative model for zero-shot learning. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11132, pp. 631–646. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11018-5_50
Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS (2015)
Grover, A., Dhar, M., Ermon, S.: Flow-GAN: combining maximum likelihood and adversarial learning in generative models. In: AAAI (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Higgins, I., et al.: Beta-VAE: learning basic visual concepts with a constrained variational framework. In: ICLR (2017)
Hoogeboom, E., Peters, J.W., van den Berg, R., Welling, M.: Integer discrete flows and lossless compression. In: NeurIPS (2019)
Huang, H., Wang, C., Yu, P.S., Wang, C.D.: Generative dual adversarial network for generalized zero-shot learning. In: CVPR (2019)
Jiang, H., Wang, R., Shan, S., Chen, X.: Transferable contrastive network for generalized zero-shot learning. In: ICCV (2019)
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
Kingma, D., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: NeurIPS (2018)
Kingma, D., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)
Kodirov, E., Xiang, T., Gong, S.: Semantic autoencoder for zero-shot learning. In: CVPR (2017)
Kumar Verma, V., Arora, G., Mishra, A., Rai, P.: Generalized zero-shot learning via synthesized examples. In: CVPR (2018)
Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR (2009)
Lampert, C.H., Nickisch, H., Harmeling, S.: Attribute-based classification for zero-shot visual object categorization. IEEE Trans. Pattern Anal. Mach. Intell. 36(3), 453–465 (2013)
Li, J., Jing, M., Lu, K., Ding, Z., Zhu, L., Huang, Z.: Leveraging the invariant side of generative zero-shot learning. In: CVPR (2019)
Li, K., Min, M.R., Fu, Y.: Rethinking zero-shot learning: a conditional visual classification perspective. In: ICCV (2019)
Liu, R., Liu, Y., Gong, X., Wang, X., Li, H.: Conditional adversarial generative flow for controllable image synthesis. In: CVPR (2019)
Liu, S., Long, M., Wang, J., Jordan, M.I.: Generalized zero-shot learning with deep calibration network. In: NeurIPS (2018)
Liu, Y., Guo, J., Cai, D., He, X.: Attribute attention for semantic disambiguation in zero-shot learning. In: ICCV (2019)
Long, Y., Liu, L., Shen, Y., Shao, L.: Towards affordable semantic searching: zero-shot retrieval via dominant attributes. In: AAAI (2018)
Maas, A.L., Hannun, A.Y., Ng., A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: ICML (2013)
van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(Nov), 2579–2605 (2008)
Mandal, D., et al.: Out-of-distribution detection for generalized zero-shot action recognition. In: CVPR (2019)
Mensink, T., Verbeek, J., Perronnin, F., Csurka, G.: Distance-based image classification: generalizing to new classes at near-zero cost. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2624–2637 (2013)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NeurIPS (2013)
Mishra, A., Krishna Reddy, S., Mittal, A., Murthy, H.A.: A generative model for zero shot learning using conditional variational autoencoders. In: CVPR Workshops (2018)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)
Patterson, G., Hays, J.: Sun attribute database: discovering, annotating, and recognizing scene attributes. In: CVPR (2012)
Prenger, R., Valle, R., Catanzaro, B.: WaveGlow: a flow-based generative network for speech synthesis. In: ICASSP (2019)
Qin, J., et al.: Zero-shot action recognition with error-correcting output codes. In: CVPR (2017)
Radovanović, M., Nanopoulos, A., Ivanović, M.: Hubs in space: popular nearest neighbors in high-dimensional data. J. Mach. Learn. Res. 11(Sep), 2487–2531 (2010)
Romera-Paredes, B., Torr, P.: An embarrassingly simple approach to zero-shot learning. In: ICML (2015)
Scheirer, W.J., de Rezende Rocha, A., Sapkota, A., Boult, T.E.: Toward open set recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(7), 1757–1772 (2012)
Schonfeld, E., Ebrahimi, S., Sinha, S., Darrell, T., Akata, Z.: Generalized zero- and few-shot learning via aligned variational autoencoders. In: CVPR (2019)
Shen, Y., Liu, L., Shen, F., Shao, L.: Zero-shot sketch-image hashing. In: CVPR (2018)
Shen, Z., Lai, W.-S., Xu, T., Kautz, J., Yang, M.-H.: Exploiting semantics for face image deblurring. Int. J. Comput. Vis. 128(7), 1829–1846 (2020). https://doi.org/10.1007/s11263-019-01288-9
Shen, Z., et al.: Human-aware motion deblurring. In: ICCV (2019)
Socher, R., Ganjoo, M., Sridhar, H., Bastani, O., Manning, C.D., Ng, A.Y.: Zero-shot learning through cross-modal transfer. In: NeurIPS (2013)
Sohn, K., Lee, H., Yan, X.: Learning structured output representation using deep conditional generative models. In: NeurIPS (2015)
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: CVPR (2018)
Tolstikhin, I., Bousquet, O., Gelly, S., Schoelkopf, B.: Wasserstein auto-encoders. In: ICLR (2018)
Tong, B., Wang, C., Klinkigt, M., Kobayashi, Y., Nonaka, Y.: Hierarchical disentanglement of discriminative latent features for zero-shot learning. In: CVPR (2019)
Tran, D., Vafa, K., Agrawal, K.K., Dinh, L., Poole, B.: Discrete flows: invertible generative models of discrete data. In: ICLR Workshops (2019)
Tsai, Y.H.H., Huang, L.K., Salakhutdinov, R.: Learning robust visual-semantic embeddings. In: ICCV (2017)
Tsai, Y.H.H., Liang, P.P., Zadeh, A., Morency, L.P., Salakhutdinov, R.: Learning factorized multimodal representations. In: ICLR (2019)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 dataset. Technical report CNS-TR-2011-001, California Institute of Technology (2011)
Wang, Q., Chen, K.: Zero-shot visual recognition via bidirectional latent embedding. Int. J. Comput. Vis. 124(3), 356–383 (2017). https://doi.org/10.1007/s11263-017-1027-5
Xian, Y., Akata, Z., Sharma, G., Nguyen, Q., Hein, M., Schiele, B.: Latent embeddings for zero-shot classification. In: CVPR (2016)
Xian, Y., Lampert, C.H., Schiele, B., Akata, Z.: Zero-shot learning-a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans. Pattern Anal. Mach. Intell. 41(9), 2251–2265 (2018)
Xian, Y., Lorenz, T., Schiele, B., Akata, Z.: Feature generating networks for zero-shot learning. In: CVPR (2018)
Xian, Y., Schiele, B., Akata, Z.: Zero-shot learning-the good, the bad and the ugly. In: CVPR (2017)
Xian, Y., Sharma, S., Schiele, B., Akata, Z.: f-VAEGAN-D2: a feature generating framework for any-shot learning. In: CVPR (2019)
Xie, G.S., et al.: Attentive region embedding network for zero-shot learning. In: CVPR (2019)
Yu, H., Lee, B.: Zero-shot learning via simultaneous generating and learning. In: NeurIPS (2019)
Zhang, F., Shi, G.: Co-representation network for generalized zero-shot learning. In: ICML (2019)
Zhang, H., Koniusz, P.: Zero-shot kernel learning. In: CVPR (2018)
Zhang, L., Xiang, T., Gong, S.: Learning a deep embedding model for zero-shot learning. In: CVPR (2017)
Zhang, Z., Saligrama, V.: Zero-shot learning via semantic similarity embedding. In: ICCV (2015)
Zhu, Y., Xie, J., Liu, B., Elgammal, A.: Learning feature-to-feature translator by alternating back-propagation for generative zero-shot learning. In: ICCV (2019)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Shen, Y., Qin, J., Huang, L., Liu, L., Zhu, F., Shao, L. (2020). Invertible Zero-Shot Recognition Flows. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12361. Springer, Cham. https://doi.org/10.1007/978-3-030-58517-4_36
Download citation
DOI: https://doi.org/10.1007/978-3-030-58517-4_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58516-7
Online ISBN: 978-3-030-58517-4
eBook Packages: Computer ScienceComputer Science (R0)
