Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks

Yu, Xin; Porikli, Fatih; Fernando, Basura; Hartley, Richard

doi:10.1007/s11263-019-01254-5

Published: 07 November 2019

International Journal of Computer Vision, Volume 128, pages 500–526 (2020)

Xin Yu ORCID: orcid.org/0000-0002-0269-5649¹,
Fatih Porikli¹,
Basura Fernando² &
Richard Hartley¹

Abstract

Conventional face hallucination methods rely heavily on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts at large upscaling factors. However, because poses and facial expressions vary widely, aligning an LR input image is extremely difficult, particularly when the image is tiny. In addition, when the resolutions of LR input images vary, previous deep neural network based face hallucination methods require the interocular distances of the input faces to be similar to those in the training datasets. Downsampling LR input faces to the required resolution discards high-frequency information from the original images, which may lead to suboptimal super-resolution performance for state-of-the-art face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network that super-resolves unaligned and very small face images of different resolutions, ranging from \(16 \times 16\) to \(32 \times 32\) pixels, in a unified framework. Our proposed network embeds spatial transformation layers that allow local receptive fields to line up with similar spatial supports, thus obtaining a better mapping between LR and high-resolution (HR) facial patterns. Furthermore, we incorporate into our objective a class-specific loss, computed by a successive discriminative network and designed to classify upright realistic faces, so that semantic information improves both alignment and upsampling. Extensive experiments on a large face dataset show that the proposed method significantly outperforms the state-of-the-art.
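To make the idea of a spatial transformation layer more concrete, the following is a minimal NumPy sketch of the forward pass of a generic affine spatial transformer: an affine sampling grid followed by bilinear interpolation. It is an illustration under stated assumptions, not the network described in this article. The function names (`affine_grid`, `bilinear_sample`, `spatial_transform`), the single-channel input, the normalized \([-1, 1]\) coordinate convention, and the hand-chosen matrix `theta` are all assumptions made for the example; in a trained network `theta` would be predicted by a small localization sub-network rather than fixed by hand.

```python
import numpy as np


def affine_grid(theta, out_h, out_w):
    """Return source (x, y) coordinates, in [-1, 1], for every output pixel.

    theta is a 2x3 affine matrix acting on homogeneous output coordinates.
    """
    ys = np.linspace(-1.0, 1.0, out_h)
    xs = np.linspace(-1.0, 1.0, out_w)
    gx, gy = np.meshgrid(xs, ys)                     # each (out_h, out_w)
    coords = np.stack([gx, gy, np.ones_like(gx)], axis=-1)  # (H, W, 3)
    return coords @ theta.T                          # (H, W, 2)


def bilinear_sample(img, grid):
    """Sample a 2-D image at normalized grid locations; zeros outside."""
    h, w = img.shape
    # Convert normalized coordinates to pixel coordinates.
    x = (grid[..., 0] + 1.0) * (w - 1) / 2.0
    y = (grid[..., 1] + 1.0) * (h - 1) / 2.0
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0                          # fractional offsets

    def gather(yy, xx):
        # Read pixel values, returning 0 for out-of-bounds indices.
        valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
        out = np.zeros(yy.shape, dtype=float)
        out[valid] = img[yy[valid], xx[valid]]
        return out

    return ((1 - wy) * (1 - wx) * gather(y0, x0)
            + (1 - wy) * wx * gather(y0, x1)
            + wy * (1 - wx) * gather(y1, x0)
            + wy * wx * gather(y1, x1))


def spatial_transform(img, theta):
    """Warp img with the affine matrix theta, keeping the input size."""
    return bilinear_sample(img, affine_grid(theta, *img.shape))


if __name__ == "__main__":
    # Hypothetical 16x16 "face" patch and a hand-picked 30-degree rotation.
    rng = np.random.default_rng(0)
    patch = rng.random((16, 16))
    a = np.deg2rad(30.0)
    theta = np.array([[np.cos(a), -np.sin(a), 0.0],
                      [np.sin(a),  np.cos(a), 0.0]])
    warped = spatial_transform(patch, theta)
    print(warped.shape)
```

Because both the grid generation and the bilinear sampling are differentiable with respect to `theta`, a layer of this kind can be trained end to end together with the upsampling layers, which is what allows alignment and super-resolution to be learned jointly rather than as separate preprocessing steps.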

Acknowledgements

This work was supported under the Australian Research Council’s Discovery Project funding scheme (Project DP150104645) and Australian Research Council Centre of Excellence for Robotic Vision (Project CE140100016).

Author information

Authors and Affiliations

Research School of Engineering, Australian National University, Canberra, Australia
Xin Yu, Fatih Porikli & Richard Hartley
Human Centric AI Programme, A*STAR Artificial Intelligence Initiative (A*AI), Singapore, Singapore
Basura Fernando

Corresponding author

Correspondence to Xin Yu.

Additional information

Communicated by Chen Change Loy.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, X., Porikli, F., Fernando, B. et al. Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks. Int J Comput Vis 128, 500–526 (2020). https://doi.org/10.1007/s11263-019-01254-5

Received: 09 August 2018
Accepted: 09 October 2019
Published: 07 November 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s11263-019-01254-5
