Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks

Abstract

Conventional face hallucination methods rely heavily on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts at large upscaling factors. However, because of the diverse range of poses and facial expressions, aligning an LR input image is extremely difficult, particularly when the image is tiny. In addition, when the resolutions of LR input images vary, previous deep-neural-network-based face hallucination methods require the interocular distances of input face images to be similar to those in the training datasets. Downsampling LR input faces to a required resolution discards high-frequency information from the original input images, which may lead to suboptimal super-resolution performance for state-of-the-art face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network devised for super-resolving unaligned and very small face images of different resolutions, ranging from \(16 \times 16\) to \(32 \times 32\) pixels, in a unified framework. Our proposed network embeds spatial transformation layers that allow local receptive fields to line up with similar spatial supports, thus obtaining a better mapping between LR and high-resolution (HR) facial patterns. Furthermore, we incorporate into our objective a class-specific loss, designed to classify upright realistic faces, through a successive discriminative network; this improves alignment and upsampling performance with semantic information. Extensive experiments on a large face dataset show that the proposed method significantly outperforms the state of the art.
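To illustrate what a spatial transformation layer does to a feature map, the following is a minimal sketch of a generic affine warp with bilinear sampling. It is not the network described in the abstract: the function names (`affine_grid`, `bilinear_sample`, `spatial_transform`), the 2×3 parameter matrix `theta`, and the normalized \([-1, 1]\) coordinate convention are all assumptions made for illustration, and in a trained network `theta` would be predicted from the input rather than supplied by hand.

```python
import math


def affine_grid(theta, height, width):
    """For every output pixel, compute the source coordinate to sample.

    Coordinates are normalized to [-1, 1] (an assumed convention), and
    theta is a hypothetical 2x3 affine matrix [[a, b, tx], [c, d, ty]].
    """
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / max(height - 1, 1)
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / max(width - 1, 1)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample a 2-D image (list of rows) at the grid's source coordinates.

    Samples that fall outside the image contribute zero.
    """
    height, width = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (width - 1) / 2.0
            py = (ys + 1.0) * (height - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            wx, wy = px - x0, py - y0
            value = 0.0
            # Blend the four neighbouring pixels by their bilinear weights.
            for yy, w_y in ((y0, 1.0 - wy), (y0 + 1, wy)):
                for xx, w_x in ((x0, 1.0 - wx), (x0 + 1, wx)):
                    if 0 <= yy < height and 0 <= xx < width:
                        value += w_y * w_x * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


def spatial_transform(image, theta):
    """Warp an image with an affine transform (output size equals input size)."""
    grid = affine_grid(theta, len(image), len(image[0]))
    return bilinear_sample(image, grid)


if __name__ == "__main__":
    face = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0]]
    quarter_turn = [[0.0, -1.0, 0.0],
                    [1.0, 0.0, 0.0]]
    for row in spatial_transform(face, quarter_turn):
        print(row)
```

Because bilinear sampling is differentiable with respect to both the input values and the sampling coordinates, a layer of this kind can be trained end to end with the rest of an upsampling network, which is the property that makes embedding alignment inside the network possible.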

Acknowledgements

This work was supported under the Australian Research Council’s Discovery Project funding scheme (Project DP150104645) and Australian Research Council Centre of Excellence for Robotic Vision (Project CE140100016).

Author information

Correspondence to Xin Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by Chen Change Loy.

About this article

Cite this article

Yu, X., Porikli, F., Fernando, B. et al. Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks. Int J Comput Vis 128, 500–526 (2020). https://doi.org/10.1007/s11263-019-01254-5

Keywords

  • Face hallucination
  • Super-resolution
  • Multiscale
  • Transformative discriminative network