Abstract
Generative adversarial networks (GANs) successfully generate high quality data by learning a mapping from a latent vector to the data. Various studies assert that the latent space of a GAN is semantically meaningful and can be utilized for advanced data analysis and manipulation. To analyze the real data in the latent space of a GAN, it is necessary to build an inference mapping from the data to the latent vector. This paper proposes an effective algorithm to accurately infer the latent vector by utilizing GAN discriminator features. Our primary goal is to increase inference mapping accuracy with minimal training overhead. Furthermore, using the proposed algorithm, we suggest a conditional image generation algorithm, namely a spatially conditioned GAN. Extensive evaluations confirmed that the proposed inference algorithm achieved more semantically accurate inference mapping than existing methods and can be successfully applied to advanced conditional image generation tasks.
Similar content being viewed by others
References
Baldi, P. (2012). Autoencoders, unsupervised learning, and deep architectures. In Proceedings of ICML workshop on unsupervised and transfer learning (pp. 37–49).
Bang, D., & Shim, H. (2018). Improved training of generative adversarial networks using representative features. In International conference on machine learning.
Berthelot, D., Schumm, T., & Metz, L. (2017). Began: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717.
Brock, A., Donahue, J., & Simonyan, K. (2018). Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096.
Byrd, R. H., Lu, P., Nocedal, J., & Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5), 1190–1208.
Donahue, J., Krähenbühl, P., & Darrell, T. (2017). Adversarial feature learning. In International conference on learning representations.
Dowson, D., & Landau, B. (1982). The Fréchet distance between multivariate normal distributions. Journal of Multivariate Analysis, 12(3), 450–455.
Dumoulin, V., Belghazi, I., Poole, B., Lamb, A., Arjovsky, M., Mastropietro, O., et al. (2017). Adversarially learned inference. In International conference on learning representations.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672–2680).
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved training of Wasserstein GANs. In Advances in neural information processing systems (pp. 5769–5779).
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2016). Squeezenet: AlexNet-level accuracy with 50x fewer parameters and\(<\)0.5 mb model size. arXiv preprint arXiv:1602.07360.
Iizuka, S., Simo-Serra, E., & Ishikawa, H. (2017). Globally and locally consistent image completion. ACM Transactions on Graphics (TOG), 36(4), 107.
Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5967–5976). IEEE.
Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations.
Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. International conference on learning representations.
Krizhevsky, A. (2014). One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997.
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report. Citeseer.
Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2015). Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300.
Li, C., Liu, H., Chen, C., Pu, Y., Chen, L., Henao, R., & Carin, L. (2017). Alice: Towards understanding adversarial learning for joint distribution matching. In Advances in neural information processing systems (pp. 5495–5503).
Liu, M. Y., & Tuzel, O. (2016). Coupled generative adversarial networks. In Advances in neural information processing systems (pp. 469–477).
Liu, M., Ding, Y., Xia, M., Liu, X., Ding, E., Zuo, W., & Wen, S. (2019). STGAN: A unified selective transfer network for arbitrary image attribute editing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3673–3682).
Liu, Z., Luo, P., Wang, X., & Tang, X. (2015). Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision (pp. 3730–3738).
Lucic, M., Kurach, K., Michalski, M., Gelly, S., & Bousquet, O. (2018). Are GANs created equal? A large-scale study. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, & R. Garnett (Eds.), Advances in neural information processing systems (Vol. 31, pp. 700–709). Red Hook: Curran Associates, Inc.
Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., & Frey, B. (2016). Adversarial autoencoders. International conference on learning representations.
Mao, X., Li, Q., Xie, H., Lau, R. Y., Wang, Z., & Smolley, S. P. (2017). Least squares generative adversarial networks. In 2017 IEEE international conference on computer vision (ICCV) (pp. 2813–2821). IEEE.
Mescheder, L., Geiger, A., & Nowozin, S. (2018). Which training methods for GANs do actually converge? In International conference on machine learning (pp. 3478–3487).
Miyato, T., Kataoka, T., Koyama, M., & Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957.
Radford, A., Metz, L., & Chintala, S. (2016). Unsupervised representation learning with deep convolutional generative adversarial networks. In International conference on learning representations.
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference learning representations.
Srivastava, A., Valkoz, L., Russell, C., Gutmann, M. U., & Sutton, C. (2017). Veegan: Reducing mode collapse in GANs using implicit variational learning. In Advances in neural information processing systems (pp. 3310–3320).
Wainwright, M. J., Jordan, M. I., et al. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends® in Machine Learning, 1(1–2), 1–305.
Warde-Farley, D., & Bengio, Y. (2017). Improving generative adversarial networks with denoising feature matching. In International conference on learning representations.
Wu, Y., & He, K. (2018). Group normalization. In Proceedings of the European conference on computer vision (ECCV) (pp. 3–19).
Xiao, H., Rasul, K., & Vollgraf, R. (2017). Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747.
Zhang, H., Goodfellow, I., Metaxas, D., & Odena, A. (2018a). Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318.
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018b) The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 586–595).
Zhang, W., Sun, J., & Tang, X. (2008). Cat head detection-how to effectively exploit shape and texture features. In European conference on computer vision (pp. 802–816). Berlin: Springer.
Zheng, C., Cham, T. J., & Cai, J. (2019). Pluralistic image completion. arXiv preprint arXiv:1903.04227.
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2921–2929).
Zhu, J. Y., Krähenbühl, P., Shechtman, E., & Efros, A. A. (2016). Generative visual manipulation on the natural image manifold. In European conference on computer vision. Berlin: Springer.
Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In 2017 IEEE international conference on computer vision (ICCV).
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Korean Government (Grant NRF-2019R1A2C2006123), the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2016-0-00288) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation), and also by ICT R&D program of MSIP/IITP. [R7124-16-0004, Development of Intelligent Interaction Technology Based on Context Awareness and Human Intention Understanding].
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Bang, D., Kang, S. & Shim, H. Discriminator Feature-Based Inference by Recycling the Discriminator of GANs. Int J Comput Vis 128, 2436–2458 (2020). https://doi.org/10.1007/s11263-020-01311-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11263-020-01311-4