Abstract
Blind super-resolution (SR) is a challenging low-level vision task dedicated to recovering corrupted details in low-resolution (LR) images with complex, unknown degradations. Mainstream blind SR methods mainly adopt the paradigm of capturing a robust degradation representation from the LR image as a condition and then performing deep feature reconstruction. However, the manifold of degradation factors makes flexible estimation difficult. In this paper, we propose a residual-guided diffusion degradation representation scheme (Diff-BSR) for blind SR. Specifically, we leverage the powerful generative capability of the diffusion model (DM) to implicitly model diverse degradation representations, which helps resist the disturbance of varied inputs. Meanwhile, to reduce the expensive computational complexity and training costs, we design a lightweight degradation extractor in the residual domain, which transforms the target residual distribution into a low-dimensional feature space. As a result, Diff-BSR requires only about 60 sampling steps and a much smaller denoising network. Moreover, we design a Degradation-Aware Multihead Self-Attention mechanism to effectively fuse the discriminative representations with the intermediate features of the network for robustness enhancement. Extensive experiments on mainstream blind SR benchmarks show that Diff-BSR achieves SOTA or comparable performance against existing methods.
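The Degradation-Aware Multihead Self-Attention described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the scale-and-shift conditioning of queries and keys by the degradation embedding, and the random stand-in weights are all assumptions made for illustration; it only shows the general shape of fusing a degradation representation into attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def degradation_aware_msa(feat, deg, num_heads=4, seed=0):
    """Hypothetical sketch of degradation-aware multi-head self-attention.

    feat: (N, C) token features from the SR backbone.
    deg:  (C,) degradation representation (e.g., from the extractor).
    The degradation embedding modulates queries/keys via a toy
    scale-and-shift; real methods would use learned conditioning.
    """
    n, c = feat.shape
    d = c // num_heads
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned weight matrices.
    Wq, Wk, Wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    scale, shift = np.tanh(deg), 0.1 * deg          # toy conditioning (assumption)
    q = (feat * (1.0 + scale) + shift) @ Wq          # degradation-modulated queries
    k = (feat * (1.0 + scale) + shift) @ Wk          # degradation-modulated keys
    v = feat @ Wv
    # Split into heads: (num_heads, N, d)
    q, k, v = (x.reshape(n, num_heads, d).transpose(1, 0, 2) for x in (q, k, v))
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))
    out = (attn @ v).transpose(1, 0, 2).reshape(n, c)  # merge heads back
    return out

feat = np.random.default_rng(1).standard_normal((16, 32))
deg = np.zeros(32)                                   # neutral degradation embedding
out = degradation_aware_msa(feat, deg)
print(out.shape)
```

With a zero degradation embedding the modulation is the identity, so the block reduces to plain multi-head self-attention; a non-zero embedding biases which features each head attends to.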
Acknowledgment
This work is supported by the National Key Research and Development Program of China No. 2020AAA0108301; the National Natural Science Foundation of China under Grant No. 62176224; and the CCF-Lenovo Blue Ocean Research Fund.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Ye, F., Zhou, Y., Cheng, L., Qu, Y. (2024). Robust Degradation Representation via Efficient Diffusion Model for Blind Super-Resolution. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14435. Springer, Singapore. https://doi.org/10.1007/978-981-99-8552-4_3
DOI: https://doi.org/10.1007/978-981-99-8552-4_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8551-7
Online ISBN: 978-981-99-8552-4