A-ESRGAN: Training Real-World Blind Super-Resolution with Attention U-Net Discriminators

Wei, Zihao; Huang, Yidong; Chen, Yuang; Zheng, Chenhao; Gao, Jingnan

doi:10.1007/978-981-99-7025-4_2


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14327)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

Generative adversarial networks (GANs) have recently made great progress in blind image super-resolution (SR), owing to their strength in learning mappings between manifolds, which benefits the reconstruction of textural detail in images. Recent works have largely focused on designing more realistic degradation models or constructing more powerful generator structures, but have neglected the role of the discriminator in improving visual quality. In this paper, we present A-ESRGAN, a GAN model for blind SR tasks featuring an attention U-Net based, multi-scale discriminator that can be seamlessly integrated with other generators. To our knowledge, this is the first work to introduce the attention U-Net structure as the discriminator of a GAN to solve blind SR problems. The paper also gives an interpretation of the mechanism by which the multi-scale attention U-Net improves the model's performance. Experimental results demonstrate that A-ESRGAN is superior to state-of-the-art methods in terms of quantitative metrics and visual quality. The code can be found at https://github.com/stroking-fishes-ml-corp/A-ESRGAN.
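
For readers unfamiliar with the attention U-Net structure named in the abstract, the following is a minimal sketch of an additive attention gate in the style of the Attention U-Net paper by Oktay et al. It is not the authors' implementation. The function name, the toy weights, the omission of bias terms, and the assumption that the gating features already match the spatial size of the skip features are all illustrative choices. Each 1x1 convolution is written as a per-pixel matrix product.

```python
import math

def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate applied pixel by pixel (illustrative only).

    x   : skip-connection features, one channel vector per pixel
    g   : gating features at the same spatial size, one vector per pixel
    w_x : (C_int x C_x) weights, standing in for a 1x1 convolution on x
    w_g : (C_int x C_g) weights, standing in for a 1x1 convolution on g
    psi : length-C_int weights mapping the hidden vector to one score
    Returns the gated features and the per-pixel attention coefficients.
    """
    gated, alphas = [], []
    for x_p, g_p in zip(x, g):
        # ReLU of the summed projections of skip and gating features.
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_x, x_p), matvec(w_g, g_p))]
        # A single attention coefficient in (0, 1) for this pixel.
        alpha = sigmoid(sum(p * h for p, h in zip(psi, hidden)))
        alphas.append(alpha)
        # Scale every channel of the skip feature by the coefficient.
        gated.append([alpha * c for c in x_p])
    return gated, alphas

# Toy example: two pixels, two channels each, hypothetical weights.
x = [[1.0, 2.0], [0.5, -1.0]]
g = [[0.0, 0.0], [4.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
gated, alphas = attention_gate(x, g, identity, identity, psi=[1.0, 1.0])
```

In this toy setup the second pixel receives the stronger gating signal and therefore the larger coefficient, so its skip features pass through almost unchanged while the first pixel is attenuated slightly more.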



Author information

Authors and Affiliations

University of Michigan, Ann Arbor, MI, 48109, USA
Zihao Wei, Yidong Huang, Yuang Chen & Chenhao Zheng
Shanghai Jiao Tong University, Shanghai, 200240, China
Zihao Wei, Yidong Huang, Yuang Chen, Chenhao Zheng & Jingnan Gao

Corresponding author

Correspondence to Zihao Wei.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Fenrong Liu
SEEK Limited, Cremorne, NSW, Australia
Arun Anand Sadanandan
MIMOS Berhad, Kuala Lumpur, Malaysia
Duc Nghia Pham
Universitas Indonesia, Depok, Indonesia
Petrus Mursanto
Tabcorp Holdings Limited, Melbourne, VIC, Australia
Dickson Lukose

Cite this paper

Wei, Z., Huang, Y., Chen, Y., Zheng, C., Gao, J. (2024). A-ESRGAN: Training Real-World Blind Super-Resolution with Attention U-Net Discriminators. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_2

DOI: https://doi.org/10.1007/978-981-99-7025-4_2
Published: 10 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7024-7
Online ISBN: 978-981-99-7025-4
eBook Packages: Computer Science, Computer Science (R0)
