Invertible Zero-Shot Recognition Flows

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12361)

Abstract

Deep generative models have recently been applied successfully to Zero-Shot Learning (ZSL). However, the underlying drawbacks of GANs and VAEs (e.g., the difficulty of training with ZSL-oriented regularizers and the limited generation quality) prevent existing generative ZSL models from fully bypassing the seen-unseen bias. To tackle these limitations, this work incorporates, for the first time, a new family of generative models (i.e., flow-based models) into ZSL. The proposed Invertible Zero-shot Flow (IZF) learns factorized data embeddings (i.e., the semantic factors and the non-semantic ones) with the forward pass of an invertible flow network, while the reverse pass generates data samples. This procedure theoretically extends conventional generative flows to a factorized conditional scheme. To explicitly address the bias problem, the model enlarges the seen-unseen distributional discrepancy using a negative-sample-based distance measure. Notably, IZF works flexibly with either a naive Bayesian classifier or a held-out trainable one for zero-shot recognition. Experiments on widely adopted ZSL benchmarks demonstrate a significant performance gain of IZF over existing methods, in both the classic and the generalized settings.
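
The forward/reverse idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a generic RealNVP-style affine coupling layer, untrained random weights, and hypothetical names and sizes (`AffineCoupling`, `flow_forward`, `flow_inverse`, `SEM_DIM`, the class vector `c`). It only shows that one invertible network can map a feature to a (semantic, non-semantic) pair in the forward pass and map such a pair back to a feature in the reverse pass.

```python
import numpy as np

class AffineCoupling:
    """RealNVP-style affine coupling layer; invertible by construction."""
    def __init__(self, dim, rng):
        self.d = dim // 2
        # Hypothetical tiny linear maps standing in for the scale/shift networks.
        self.W_s = 0.1 * rng.standard_normal((self.d, dim - self.d))
        self.W_t = 0.1 * rng.standard_normal((self.d, dim - self.d))

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        log_s = np.tanh(x1 @ self.W_s)      # bounded log-scale for stability
        t = x1 @ self.W_t
        y2 = x2 * np.exp(log_s) + t         # only the second half is transformed
        return np.concatenate([x1, y2], axis=1), log_s.sum(axis=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        log_s = np.tanh(y1 @ self.W_s)
        t = y1 @ self.W_t
        x2 = (y2 - t) * np.exp(-log_s)      # exact inverse of the affine map
        return np.concatenate([y1, x2], axis=1)

def flow_forward(layers, x):
    """Forward pass: feature -> embedding, accumulating log|det Jacobian|."""
    log_det = np.zeros(x.shape[0])
    for layer in layers:
        x, ld = layer.forward(x)
        x = x[:, ::-1]                      # fixed permutation between layers
        log_det += ld
    return x, log_det

def flow_inverse(layers, y):
    """Reverse pass: embedding -> feature (undoes flow_forward step by step)."""
    for layer in reversed(layers):
        y = y[:, ::-1]
        y = layer.inverse(y)
    return y

rng = np.random.default_rng(0)
DIM, SEM_DIM = 6, 2                         # assumed sizes, for illustration only
layers = [AffineCoupling(DIM, rng) for _ in range(2)]

# Forward pass: split the embedding into semantic and non-semantic factors.
x = rng.standard_normal((4, DIM))
out, log_det = flow_forward(layers, x)
semantic, non_semantic = out[:, :SEM_DIM], out[:, SEM_DIM:]

# Invertibility: the reverse pass recovers the input features.
x_rec = flow_inverse(layers, out)

# Reverse pass as a generator: a (hypothetical) class vector plus Gaussian noise.
c = np.tile(np.array([[1.0, 0.0]]), (4, 1))
z = rng.standard_normal((4, DIM - SEM_DIM))
x_gen = flow_inverse(layers, np.concatenate([c, z], axis=1))
```

In a trained model of this kind, the semantic block would be fitted to class-level side information and the non-semantic block to a simple prior; here both are just slices of an untrained network's output.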

Notes

  1. Note that reverse pass and back-propagation are different concepts.

Author information

Corresponding author

Correspondence to Jie Qin.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Shen, Y., Qin, J., Huang, L., Liu, L., Zhu, F., Shao, L. (2020). Invertible Zero-Shot Recognition Flows. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12361. Springer, Cham. https://doi.org/10.1007/978-3-030-58517-4_36

  • DOI: https://doi.org/10.1007/978-3-030-58517-4_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58516-7

  • Online ISBN: 978-3-030-58517-4

  • eBook Packages: Computer Science, Computer Science (R0)
