
Wavelet Domain Generative Adversarial Network for Multi-scale Face Hallucination

  • Huaibo Huang
  • Ran He
  • Zhenan Sun
  • Tieniu Tan

Abstract

Most modern face hallucination methods use convolutional neural networks (CNNs) to infer high-resolution (HR) face images. However, when dealing with very low-resolution (LR) images, these CNN-based methods tend to produce over-smoothed outputs. To address this challenge, this paper proposes a wavelet-domain generative adversarial method that can ultra-resolve a very low-resolution face image (such as \(16\times 16\) or even \(8\times 8\)) to larger versions at multiple upscaling factors (\(2\times \) to \(16\times \)) in a unified framework. Unlike most existing studies, which hallucinate faces in the image pixel domain, our method first learns to predict the wavelet information of HR face images from their corresponding LR inputs before image-level super-resolution. To capture both the global topology information and the local texture details of human faces, a flexible and extensible generative adversarial network is designed with three types of losses: (1) a wavelet reconstruction loss, which pushes the predicted wavelets closer to the ground truth; (2) a wavelet adversarial loss, which encourages realistic wavelets; and (3) an identity-preserving loss, which helps recover identity information. Extensive experiments demonstrate that the presented approach not only achieves more appealing results, both quantitatively and qualitatively, than state-of-the-art face hallucination methods, but also significantly improves identification accuracy for low-resolution face images captured in the wild.
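
The idea of predicting wavelet information before image-level reconstruction can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not name a wavelet basis or a loss formula, so the single-level Haar transform and the mean-squared-error form of the reconstruction loss are assumptions, and the function names `haar_dwt2`, `haar_idwt2`, and `wavelet_reconstruction_loss` are hypothetical. The sketch shows that an HR image is fully determined by its wavelet sub-bands, so a network that predicts those sub-bands accurately also yields the HR image.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D Haar transform (assumed basis, not stated in the abstract).

    Returns four sub-bands, each half the height and width of `img`:
    one low-frequency approximation (ll) and three detail bands.
    Naming conventions for the detail bands vary between libraries.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four sub-bands."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def wavelet_reconstruction_loss(pred_bands, true_bands):
    """Mean squared error over all sub-bands (an assumed form of the loss)."""
    errors = [np.mean((p - t) ** 2) for p, t in zip(pred_bands, true_bands)]
    return float(np.mean(errors))


# Stand-in for a 16x16 HR face image; real inputs would be face crops.
rng = np.random.default_rng(0)
hr = rng.random((16, 16))

true_bands = haar_dwt2(hr)            # four 8x8 sub-bands
restored = haar_idwt2(*true_bands)    # exact reconstruction of `hr`

# A hypothetical imperfect prediction: correct approximation, zeroed details.
pred_bands = (true_bands[0],) + tuple(np.zeros_like(b) for b in true_bands[1:])
loss = wavelet_reconstruction_loss(pred_bands, true_bands)
```

Zeroing the detail sub-bands, as in `pred_bands`, keeps only the smooth approximation of the image, which is one way to picture the over-smoothing that the abstract attributes to pixel-domain CNN methods. The adversarial and identity-preserving losses are not sketched here because the abstract gives no detail on their form.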

Keywords

Face hallucination, Super-resolution, Wavelet transform, Generative adversarial network, Face recognition

Notes

Acknowledgements

This work is partially funded by the State Key Development Program (Grant No. 2016YFB1001001), the National Natural Science Foundation of China (Grant Nos. 61622310 and 61427811), and the Beijing Natural Science Foundation (Grant No. JQ18017).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
  2. Center for Research on Intelligent Perception and Computing, CASIA, Beijing, China
  3. National Laboratory of Pattern Recognition, CASIA, Beijing, China
  4. Center for Excellence in Brain Science and Intelligence Technology, CAS, Beijing, China