Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks

  • Xin Yu
  • Fatih Porikli
  • Basura Fernando
  • Richard Hartley

Abstract

Conventional face hallucination methods rely heavily on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts at large upscaling factors. However, because of the wide range of poses and facial expressions, aligning an LR input image is extremely difficult, particularly when the image is tiny. In addition, when the resolutions of LR input images vary, previous deep neural network based face hallucination methods require the interocular distances of the input faces to be similar to those in the training datasets. Downsampling LR input faces to the required resolution discards high-frequency information from the original images, which may lead to suboptimal super-resolution performance for state-of-the-art face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network that super-resolves unaligned and very small face images of different resolutions, ranging from 16 \(\times\) 16 to 32 \(\times\) 32 pixels, in a unified framework. The proposed network embeds spatial transformation layers that allow local receptive fields to line up with similar spatial supports, thus obtaining a better mapping between LR and high-resolution (HR) facial patterns. Furthermore, through a successive discriminative network, we incorporate into our objective a class-specific loss designed to classify upright realistic faces, which improves alignment and upsampling performance with semantic information. Extensive experiments on a large face dataset show that the proposed method significantly outperforms the state of the art.
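
To illustrate what a spatial transformation layer does in general, the following minimal Python sketch resamples a small image under a 2 x 3 affine matrix using bilinear interpolation. It is not the authors' implementation. The function name `affine_warp`, the normalized [-1, 1] coordinate convention, and the zero padding outside the image are assumptions made for this example.

```python
import math


def affine_warp(image, theta):
    """Resample a 2-D image (list of rows) under a 2x3 affine matrix.

    Hypothetical helper for illustration. `theta` maps normalized output
    coordinates (x, y, 1) in [-1, 1] to normalized input coordinates.
    Samples that fall outside the input read as zero. Requires an image
    of at least 2 x 2 pixels.
    """
    h, w = len(image), len(image[0])

    def pixel(y, x):
        # Zero padding outside the input image.
        return image[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Normalized coordinates of the output pixel.
            xn = 2 * j / (w - 1) - 1
            yn = 2 * i / (h - 1) - 1
            # Affine map to normalized source coordinates.
            xs = theta[0][0] * xn + theta[0][1] * yn + theta[0][2]
            ys = theta[1][0] * xn + theta[1][1] * yn + theta[1][2]
            # Convert back to pixel units.
            sx = (xs + 1) * (w - 1) / 2
            sy = (ys + 1) * (h - 1) / 2
            x0, y0 = math.floor(sx), math.floor(sy)
            fx, fy = sx - x0, sy - y0
            # Bilinear blend of the four neighbouring source pixels.
            out[i][j] = (pixel(y0, x0) * (1 - fx) * (1 - fy)
                         + pixel(y0, x0 + 1) * fx * (1 - fy)
                         + pixel(y0 + 1, x0) * (1 - fx) * fy
                         + pixel(y0 + 1, x0 + 1) * fx * fy)
    return out
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` the image is returned unchanged, and `[[-1, 0, 0], [0, -1, 0]]` rotates it by 180 degrees. In a spatial transformer network the matrix is predicted from the input by a small localization sub-network, and because bilinear sampling is differentiable, the alignment can be learned jointly with the rest of the model.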

Keywords

Face hallucination · Super-resolution · Multiscale · Transformative discriminative network

Notes

Acknowledgements

This work was supported under the Australian Research Council’s Discovery Project funding scheme (Project DP150104645) and Australian Research Council Centre of Excellence for Robotic Vision (Project CE140100016).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Research School of Engineering, Australian National University, Canberra, Australia
  2. Human Centric AI Programme, A*STAR Artificial Intelligence Initiative (A*AI), Singapore, Singapore
