Abstract
High-resolution digital images are usually downscaled to fit various display screens or to save storage and bandwidth, and are later upscaled to recover the original resolution or the details of zoomed-in regions. However, typical image downscaling is a non-injective mapping because high-frequency information is lost, which makes the inverse upscaling procedure ill-posed and poses great challenges for recovering details from the downscaled low-resolution images. Simply upscaling with image super-resolution methods yields unsatisfactory recovery performance. In this work, we propose to solve this problem by modeling the downscaling and upscaling processes from a new perspective, i.e., as an invertible bijective transformation, which can largely mitigate the ill-posed nature of image upscaling. We develop an Invertible Rescaling Net (IRN) with a deliberately designed framework and objectives to produce visually pleasing low-resolution images while capturing, in the downscaling process, the distribution of the lost information with a latent variable that follows a specified distribution. In this way, upscaling is made tractable by passing a randomly drawn latent variable together with the low-resolution image inversely through the network. Experimental results demonstrate a significant improvement of our model over existing methods in both quantitative and qualitative evaluations of image reconstruction by upscaling from downscaled images. Code is available at https://github.com/pkuxmq/Invertible-Image-Rescaling.
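The idea of treating rescaling as a bijection can be illustrated with a toy example. The following is a minimal sketch, not the authors' IRN implementation: it assumes a Haar-style 2x2 split followed by a single coupling step, and the functions `phi`, `rho` and `eta` are hypothetical fixed stand-ins for the learned sub-networks. Nothing is trained here, so the latent `z` does not actually follow a standard normal distribution; the sketch only shows that the forward pass yields a low-resolution image plus a latent variable, and that the inverse pass maps any such pair back to a full-resolution image.

```python
import numpy as np

def haar_split(x):
    """Split an (H, W) image into a half-size average band and 3 detail bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    low = (a + b + c + d) / 4.0
    high = np.stack([(a - b + c - d) / 4.0,
                     (a + b - c - d) / 4.0,
                     (a - b - c + d) / 4.0])
    return low, high

def haar_merge(low, high):
    """Exact inverse of haar_split."""
    h, v, g = high
    x = np.empty((2 * low.shape[0], 2 * low.shape[1]), dtype=low.dtype)
    x[0::2, 0::2] = low + h + v + g
    x[0::2, 1::2] = low - h + v - g
    x[1::2, 0::2] = low + h - v - g
    x[1::2, 1::2] = low - h - v + g
    return x

# Hypothetical fixed functions standing in for learned sub-networks (assumption).
def phi(high):
    return 0.1 * np.tanh(high.sum(axis=0))

def rho(y):
    return 0.5 * np.tanh(y)

def eta(y):
    return 0.2 * y

def downscale(x):
    """Forward pass: image -> (low-resolution image y, latent z)."""
    low, high = haar_split(x)
    y = low + phi(high)                  # additive coupling on the low band
    z = high * np.exp(rho(y)) + eta(y)   # affine coupling on the detail bands
    return y, z

def upscale(y, z):
    """Inverse pass: (y, z) -> image, undoing downscale step by step."""
    high = (z - eta(y)) * np.exp(-rho(y))
    low = y - phi(high)
    return haar_merge(low, high)

rng = np.random.default_rng(0)
x = rng.random((8, 8))                                 # toy "high-resolution" image
y, z = downscale(x)                                    # y is 4x4, z holds the lost detail
x_exact = upscale(y, z)                                # uses the true latent
x_sampled = upscale(y, rng.standard_normal(z.shape))   # uses a randomly drawn latent
print(np.allclose(x_exact, x), y.shape, x_sampled.shape)
```

With the true `z`, the round trip recovers `x` up to floating-point error. With a randomly drawn `z`, the inverse pass still returns an image of the original size that is consistent with `y`, which is the mechanism the abstract describes for making upscaling tractable.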
M. Xiao and Y. Wang—Work done during an internship at Microsoft Research Asia.
Notes
1. MLEs corresponding to minimizing \(\mathrm{KL}\big(q(x|y), {f_\theta^{-1}(y, \cdot)}_\#[p(z)]\big)\) or \(\mathrm{KL}\Big(q(x), \Big(\mathbb{E}_{{f_\theta^y}_\#[q(x)]}[f_\theta^{-1}(y, \cdot)]\Big)_\#[p(z)]\Big)\) are also impossible, since the pushed-forward distributions have a.e. zero density in \(\mathcal{X}\), so the KL is a.e. infinite.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiao, M. et al. (2020). Invertible Image Rescaling. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12346. Springer, Cham. https://doi.org/10.1007/978-3-030-58452-8_8
DOI: https://doi.org/10.1007/978-3-030-58452-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58451-1
Online ISBN: 978-3-030-58452-8
eBook Packages: Computer Science, Computer Science (R0)