Identity-Preserving Face Recovery from Stylized Portraits

  • Fatemeh Shiri
  • Xin Yu
  • Fatih Porikli
  • Richard Hartley
  • Piotr Koniusz

Abstract

Given an artistic portrait, recovering the latent photorealistic face that preserves the subject’s identity is challenging because facial details are often distorted or entirely lost in artistic portraits. We develop an Identity-preserving Face Recovery from Portraits method that utilizes a Style Removal Network (SRN) and a Discriminative Network (DN). The SRN, an autoencoder with residual-block-embedded skip connections, is designed to map feature maps of stylized images to those of the corresponding photorealistic faces. Owing to an embedded Spatial Transformer Network, the SRN automatically compensates for misalignments in stylized portraits and outputs aligned realistic face images. To ensure identity preservation, we encourage the recovered and ground-truth faces to share similar visual features via a distance measure on features extracted by a pre-trained FaceNet network. The DN, consisting of multiple convolutional and fully connected layers, enforces that recovered faces resemble authentic ones. Thus, we can recover high-quality photorealistic faces from unaligned portraits while preserving the identity of the subject. Extensive evaluations on a large-scale synthesized dataset and a hand-drawn sketch dataset demonstrate that our method achieves superior face recovery and attains state-of-the-art results. In addition, our method can recover photorealistic faces from unseen stylized portraits, artistic paintings, and hand-drawn sketches.
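The identity-preserving term described above can be sketched as a simple feature-space distance. A minimal sketch follows, assuming the embeddings are already computed: the function names, toy vectors, and loss weights are illustrative placeholders, and the plain Python lists stand in for outputs of the pre-trained FaceNet network, which the sketch does not include.

```python
def identity_loss(feat_recovered, feat_ground_truth):
    """Squared Euclidean distance between two face-feature embeddings.

    In the paper the features come from a pre-trained FaceNet; here
    they are plain lists of floats, so any feature extractor could
    stand in.
    """
    return sum((a - b) ** 2 for a, b in zip(feat_recovered, feat_ground_truth))


def total_generator_loss(pixel_loss, id_loss, adv_loss,
                         w_pixel=1.0, w_id=0.1, w_adv=0.01):
    """Weighted sum of pixel, identity, and adversarial terms.

    The weights are hypothetical, not the paper's values.
    """
    return w_pixel * pixel_loss + w_id * id_loss + w_adv * adv_loss


# Toy embeddings standing in for FaceNet features of the recovered
# and ground-truth faces of the same subject.
f_rec = [0.2, 0.5, 0.1]
f_gt = [0.2, 0.4, 0.3]
print(identity_loss(f_rec, f_gt))  # small distance for matching identities
```

Minimizing this distance alongside the adversarial term pushes the generator toward outputs that both look authentic (via the DN) and carry the subject's identity (via the feature distance).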

Keywords

Face synthesis · Image stylization · Face recovery · Destylization · Generative models

Notes

Acknowledgements

This work is supported by the Australian Research Council (ARC) Grant DP150104645.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Fatemeh Shiri (1)
  • Xin Yu (1)
  • Fatih Porikli (1)
  • Richard Hartley (1, 2)
  • Piotr Koniusz (1, 2)

  1. Australian National University, Canberra, Australia
  2. Data61/CSIRO, Canberra, Australia
