To Learn Image Super-Resolution, Use a GAN to Learn How to Do Image Degradation First

  • Adrian Bulat
  • Jing Yang
  • Georgios Tzimiropoulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11210)

Abstract

This paper is on image and face super-resolution. The vast majority of prior work for this problem focuses on increasing the resolution of low-resolution images that are artificially generated by simple bilinear down-sampling (or, in a few cases, by blurring followed by down-sampling). We show that such methods fail to produce good results when applied to real-world low-resolution, low-quality images. To circumvent this problem, we propose a two-stage process which first trains a High-to-Low Generative Adversarial Network (GAN) to learn how to degrade and downsample high-resolution images, requiring, during training, only unpaired high- and low-resolution images. Once this is achieved, the output of this network is used to train a Low-to-High GAN for image super-resolution, this time using paired low- and high-resolution images. Our main result is that this network can now be used to effectively increase the quality of real-world low-resolution images. We apply the proposed pipeline to the problem of face super-resolution, where we report large improvements over baselines and prior work, although the proposed method is potentially applicable to other object categories.
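To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of the training logic the abstract describes. The tiny networks, the non-saturating GAN loss, the L1 pixel-loss weight, the learning rate, and the 64 → 16 → 64 resolutions are all illustrative assumptions for exposition, not the authors' actual architectures or hyper-parameters.

```python
# A minimal sketch of the two-stage idea, NOT the authors' exact method:
# the tiny networks, losses, weights and image sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyGenerator(nn.Module):
    """Toy stand-in for the paper's generators; rescales then convolves."""

    def __init__(self, scale):
        super().__init__()
        self.scale = scale  # < 1 downsamples (High-to-Low), > 1 upsamples
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.scale,
                          mode="bilinear", align_corners=False)
        return self.body(x)


class TinyDiscriminator(nn.Module):
    """Toy critic returning one realness score per image."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))


def gan_step(G, D, opt_G, opt_D, src, real, pix_target=None):
    """One adversarial update; an optional L1 term ties G to a pixel target."""
    fake = G(src)
    # Discriminator: push scores up on real images, down on generated ones.
    opt_D.zero_grad()
    d_loss = (F.softplus(-D(real)) + F.softplus(D(fake.detach()))).mean()
    d_loss.backward()
    opt_D.step()
    # Generator: fool D and, for super-resolution, stay near the HR target.
    opt_G.zero_grad()
    g_loss = F.softplus(-D(fake)).mean()
    if pix_target is not None:
        g_loss = g_loss + 10.0 * F.l1_loss(fake, pix_target)  # assumed weight
    g_loss.backward()
    opt_G.step()
    return fake.detach()


hi2lo_G, hi2lo_D = TinyGenerator(scale=0.25), TinyDiscriminator()
lo2hi_G, lo2hi_D = TinyGenerator(scale=4.0), TinyDiscriminator()
opts = [torch.optim.Adam(m.parameters(), lr=2e-4)
        for m in (hi2lo_G, hi2lo_D, lo2hi_G, lo2hi_D)]

hr_batch = torch.rand(4, 3, 64, 64)       # stand-in for real HR faces
real_lr_batch = torch.rand(4, 3, 16, 16)  # stand-in for real-world LR faces

# Stage 1 (unpaired): the discriminator sees *real* LR images, so the
# High-to-Low generator learns a realistic degradation rather than plain
# bilinear downsampling.
fake_lr = gan_step(hi2lo_G, hi2lo_D, opts[0], opts[1],
                   src=hr_batch, real=real_lr_batch)

# Stage 2 (paired): (fake_lr, hr_batch) now form paired data for training
# the Low-to-High super-resolution network.
gan_step(lo2hi_G, lo2hi_D, opts[2], opts[3],
         src=fake_lr, real=hr_batch, pix_target=hr_batch)
```

The ordering is the design point: the degradation network must be trained first, because its outputs are what make paired supervision possible for the super-resolution stage.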

Keywords

Image and face super-resolution · Generative Adversarial Networks · GANs

Supplementary material

Supplementary material 1: 474211_1_En_12_MOESM1_ESM.pdf (PDF, 54.2 MB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Adrian Bulat¹
  • Jing Yang¹
  • Georgios Tzimiropoulos¹

  1. Computer Vision Laboratory, University of Nottingham, Nottingham, UK
