Invertible Image Rescaling

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12346)

Abstract

High-resolution digital images are usually downscaled to fit various display screens or to save storage and bandwidth, and post-upscaling is then applied to recover the original resolution or the details in zoomed-in views. However, typical image downscaling is a non-injective mapping due to the loss of high-frequency information, which makes the inverse upscaling procedure ill-posed and poses great challenges for recovering details from the downscaled low-resolution images. Simply upscaling with image super-resolution methods results in unsatisfactory recovery performance. In this work, we propose to solve this problem by modeling the downscaling and upscaling processes from a new perspective, i.e., as an invertible bijective transformation, which can largely mitigate the ill-posed nature of image upscaling. We develop an Invertible Rescaling Net (IRN) with a deliberately designed framework and objectives to produce visually pleasing low-resolution images while capturing the distribution of the lost information with a latent variable following a specified distribution during downscaling. In this way, upscaling is made tractable by inversely passing a randomly drawn latent variable, together with the low-resolution image, through the network. Experimental results demonstrate the significant improvement of our model over existing methods in both quantitative and qualitative evaluations of upscaling reconstruction from downscaled images. Code is available at https://github.com/pkuxmq/Invertible-Image-Rescaling.
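The sketch below is a minimal illustration (not the authors' implementation) of the idea described in the abstract: a Haar-like split sends a high-resolution image to a low-frequency branch (the low-resolution image) and high-frequency detail bands, an exactly invertible coupling step mixes them, and at upscaling time the detail branch is replaced by a latent variable drawn from the specified distribution before running the inverse path. All names here (`haar_split`, `coupling_forward`, `phi`) are illustrative assumptions, and the coupling transform is a fixed stand-in for the paper's learned sub-networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_split(x):
    """Split an HxW image into a (H/2, W/2) low-frequency band and 3 high-frequency bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    low  = (a + b + c + d) / 2.0                # approximation band (the LR image branch)
    high = np.stack([(a - b + c - d) / 2.0,     # horizontal detail
                     (a + b - c - d) / 2.0,     # vertical detail
                     (a - b - c + d) / 2.0])    # diagonal detail
    return low, high

def haar_merge(low, high):
    """Exact inverse of haar_split."""
    h1, h2, h3 = high
    a = (low + h1 + h2 + h3) / 2.0
    b = (low - h1 + h2 - h3) / 2.0
    c = (low + h1 - h2 - h3) / 2.0
    d = (low - h1 - h2 + h3) / 2.0
    x = np.empty((2 * low.shape[0], 2 * low.shape[1]))
    x[0::2, 0::2], x[0::2, 1::2] = a, b
    x[1::2, 0::2], x[1::2, 1::2] = c, d
    return x

def coupling_forward(low, high, phi):
    """Additive coupling: invertible by construction for any function phi."""
    return low, high + phi(low)

def coupling_inverse(low, high, phi):
    return low, high - phi(low)

# Fixed stand-in for the learned transformation inside the coupling layer.
phi = lambda low: 0.1 * np.tanh(low)

hr = rng.random((8, 8))                          # toy "high-resolution" image
low, high = haar_split(hr)
lr, z_like = coupling_forward(low, high, phi)    # downscaling: LR image + residual branch
                                                 # (training would push z_like toward N(0, I))

# Upscaling: draw z from the specified distribution, run the inverse path.
z = rng.standard_normal(z_like.shape)
low_rec, high_rec = coupling_inverse(lr, z, phi)
sr = haar_merge(low_rec, high_rec)

# Sanity check: with the true latent, the mapping is exactly invertible.
assert np.allclose(haar_merge(*coupling_inverse(lr, z_like, phi)), hr)
```

Because every step (wavelet-style split and additive coupling) has a closed-form inverse, no information is destroyed in the forward pass; what the network cannot embed into the low-resolution image is isolated in the latent branch, which is exactly the property the bijective formulation relies on.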

Supplementary material

Supplementary material 1: 500725_1_En_8_MOESM1_ESM.pdf (15.3 MB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Peking University, Beijing, China
  2. Microsoft Research Asia, Beijing, China
  3. University of Toronto, Toronto, Canada