Abstract
Face attribute editing modifies single or multiple attributes of a face image while preserving the subject's identity. In this paper, we propose a method for face attribute editing based on generative adversarial networks: a conditional GAN serves as the backbone of the framework, with the target attributes fed to the generator as conditions; the generator combines an encoder–decoder with U-Net skip connections; and an attribute classifier is added to guarantee that the requested attribute operations are correctly applied to the generated image. The receptive field of a single discriminator is limited, especially as the training images grow larger, which hampers the extraction of global information. We address this limitation with multi-scale discriminators that guide the generator to produce better details: operating at several scales, they grasp the global structure of the generated images and enlarge the effective receptive field. Experiments on the CelebA dataset show that our method is effective on real-world data, generates images with well-preserved facial detail, improves the fidelity of the generated images, and offers better flexibility.
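As a minimal sketch of the multi-scale-discriminator idea described above (the paper's exact pooling scheme and discriminator count are not specified here, so the function names and the choice of average pooling are illustrative assumptions): the same generated image is downsampled into a pyramid, and each scale would be scored by its own discriminator, so the coarsest scale covers the image globally with a relatively larger receptive field.

```python
import numpy as np

def downsample(img, factor=2):
    """Average-pool an H x W x C image by `factor` (H and W must be divisible)."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def multiscale_inputs(img, n_scales=3):
    """Build the image pyramid that would be fed to the per-scale discriminators."""
    pyramid = [img]
    for _ in range(n_scales - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

# A 128x128 RGB image yields three inputs at 128, 64, and 32 pixels per side.
img = np.random.rand(128, 128, 3)
scales = multiscale_inputs(img)
print([s.shape for s in scales])  # → [(128, 128, 3), (64, 64, 3), (32, 32, 3)]
```

Each pyramid level keeps the same content at a coarser resolution, so a discriminator with a fixed convolutional receptive field sees a proportionally larger fraction of the face at the smaller scales.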
Acknowledgments
The authors are very indebted to the anonymous referees for their critical comments and suggestions for the improvement of this paper. This work was supported by the grants from the National Natural Science Foundation of China (Nos. 61673396, U19A2073, 61976245) and the Fundamental Research Funds for the Central Universities (18CX02140A).
Cite this article
Song, X., Shao, M., Zuo, W. et al. Face attribute editing based on generative adversarial networks. SIViP 14, 1217–1225 (2020). https://doi.org/10.1007/s11760-020-01660-0