Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model

  • Baris Gecer
  • Binod Bhattarai
  • Josef Kittler
  • Tae-Kyun Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11215)

Abstract

We propose a novel end-to-end semi-supervised adversarial framework that generates photorealistic face images of new identities with a wide range of expressions, poses, and illuminations, conditioned on synthetic images sampled from a 3D morphable model. Previous adversarial style-transfer methods either supervise their networks with a large volume of paired data or train highly under-constrained two-way generative networks in an unsupervised fashion. Our framework instead constrains the two-way networks with a small number of paired real and synthetic images, along with a large volume of unpaired data. We also propose a set-based loss to preserve the identity coherence of generated images. Qualitative results show that the generated face images of new identities exhibit diversity in pose, lighting, and expression. They are also tightly constrained by the synthetic input images while gaining photorealism and retaining identity information. We combine face images generated by the proposed method with a real data set to train face recognition algorithms and evaluate the model quantitatively on two challenging data sets: LFW and IJB-A. The images generated by our framework consistently improve the performance of deep face recognition networks trained on the Oxford VGG Face dataset, and achieve results comparable to the state of the art.
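
The abstract names three ingredients that constrain the two-way networks: unpaired data, a small paired set, and a set-based identity term. The following is only a rough sketch of how such terms might be combined into one objective, not the authors' formulation. Every function name, the L1 distance, the centroid-based set term, and the weights `w_pair` and `w_set` are assumptions made for illustration. The adversarial term is left out because it would require a discriminator.

```python
from math import sqrt

def l1(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_loss(x, g_forward, g_backward):
    """Unpaired term (assumed): mapping x to the other domain and back should reproduce x."""
    return l1(g_backward(g_forward(x)), x)

def paired_loss(synthetic, real, g_forward):
    """Paired term (assumed): the translated synthetic image should match its real pair."""
    return l1(g_forward(synthetic), real)

def set_loss(embeddings):
    """Set-based term (assumed): mean Euclidean distance of same-identity embeddings to their centroid."""
    dim = len(embeddings[0])
    n = len(embeddings)
    centroid = [sum(e[i] for e in embeddings) / n for i in range(dim)]
    dists = [sqrt(sum((e[i] - centroid[i]) ** 2 for i in range(dim))) for e in embeddings]
    return sum(dists) / n

def total_loss(unpaired, pairs, identity_sets, g_forward, g_backward,
               w_pair=1.0, w_set=1.0):
    """Weighted sum of the three hypothetical terms, each averaged over its own data."""
    cyc = sum(cycle_loss(x, g_forward, g_backward) for x in unpaired) / len(unpaired)
    sup = sum(paired_loss(s, r, g_forward) for s, r in pairs) / len(pairs)
    ident = sum(set_loss(s) for s in identity_sets) / len(identity_sets)
    return cyc + w_pair * sup + w_set * ident

# Toy usage: "images" are short vectors and the generators are simple shifts.
g_fwd = lambda v: [x + 1.0 for x in v]   # stand-in for synthetic -> real
g_bwd = lambda v: [x - 1.0 for x in v]   # stand-in for real -> synthetic
loss = total_loss(
    unpaired=[[0.0, 2.0], [4.0, 6.0]],
    pairs=[([0.0, 0.0], [1.0, 1.0])],
    identity_sets=[[[0.0, 0.0], [2.0, 0.0]]],
    g_forward=g_fwd,
    g_backward=g_bwd,
)
```

In this toy setup the cycle and paired terms vanish because the two stand-in generators invert each other and the pair matches exactly, so only the set-based term contributes. The point of the sketch is the data flow: the many unpaired samples feed only the cycle term, while the few pairs add direct supervision.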

Notes

Acknowledgements

This work was supported by the EPSRC Programme Grant ‘FACER2VM’ (EP/N007743/1). We would like to thank Microsoft Research for their support through a Microsoft Azure Research Award. Baris Gecer is funded by the Turkish Ministry of National Education. This study is morally motivated by the goal of improving face recognition to help predict genetic disorders visible on the human face at earlier stages.

Supplementary material

Supplementary material 1: 474198_1_En_14_MOESM1_ESM.pdf (PDF, 665 KB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Baris Gecer (1)
  • Binod Bhattarai (1)
  • Josef Kittler (2)
  • Tae-Kyun Kim (1)

  1. Department of Electrical and Electronic Engineering, Imperial College London, London, UK
  2. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
