Detail-aware image denoising via structure preserved network and residual diffusion model

  • Original article
The Visual Computer

Abstract

The rapid development of deep learning has driven significant progress in image denoising and has produced state-of-the-art results on distortion metrics. However, most denoising models build their loss functions on pixel-wise differences, which tends to produce blurred edges or over-smoothed images that are unsatisfactory to human perception. We address this issue by prioritizing perceptual quality: our approach efficiently restores the high-frequency details that may be lost during pixel-wise denoising while preserving the overall structure of the image. We introduce a structure preserved network that generates a low-cost initial prediction, which is then supplied to a conditional diffusion model as a constraint that closely aligns with the actual image. Diffusing from the residuals allows us to estimate the distribution of clean images more accurately. We observe that, because the initial prediction already maintains image consistency, a residual diffusion model with lower complexity and fewer iterations is enough to restore detailed texture in the smoothed regions, yielding denoised samples with better perceptual quality. Our method performs better on metrics that track human perception, e.g., FID, and maintains its performance at high noise levels, preserving sharp edges and texture features while reducing computational cost and hardware requirements. It therefore achieves the denoising objective and also improves subjective visual quality.
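
The two-stage idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation and rests on several assumptions: the residual is taken to be the clean image minus the initial prediction, a mean filter (`box_blur`, a hypothetical stand-in) plays the role of the structure preserved network, the forward process is the standard DDPM Gaussian noising with a linear beta schedule, and the true noise stands in for the output of a trained conditional noise predictor.

```python
import numpy as np

def box_blur(img, k=3):
    """Hypothetical stand-in for the structure preserved network: a k x k mean filter."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def make_alpha_bar(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def diffuse_residual(residual, t, alpha_bar, rng):
    """Standard DDPM forward step applied to the residual instead of the image."""
    eps = rng.standard_normal(residual.shape)
    r_t = np.sqrt(alpha_bar[t]) * residual + np.sqrt(1.0 - alpha_bar[t]) * eps
    return r_t, eps

def recover_residual(r_t, eps_hat, t, alpha_bar):
    """Invert the forward step given a noise estimate eps_hat."""
    return (r_t - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
clean = rng.random((16, 16))                          # toy "clean" image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

initial = box_blur(noisy)       # stage 1: smooth, structure-level prediction
residual = clean - initial      # assumed target of the diffusion model: lost detail

alpha_bar = make_alpha_bar()
t = 25
r_t, eps = diffuse_residual(residual, t, alpha_bar, rng)

# A trained network conditioned on `initial` would predict eps; the oracle
# noise is used here only to show how the pieces fit together.
r_hat = recover_residual(r_t, eps, t, alpha_bar)
restored = initial + r_hat      # stage 2: add the recovered detail back
```

Because the diffusion model only has to represent the detail missing from the initial prediction rather than the whole image, its target has a much smaller dynamic range, which is one plausible reading of why the abstract reports that a lighter model and fewer iterations suffice.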

Data availability

All data generated or analyzed during this study are included in this published article.

Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 62061049, Grant No. 12263008), the Yunnan Provincial Department of Science and Technology–Yunnan University Joint Special Project for Double-Class Construction (Grant No. 202201BF070001-005) and the Practical Innovation Fund Project for Professional Degree Graduate Students of Yunnan University (Grant No. ZC-22221881).

Author information

Corresponding author

Correspondence to Hao Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, J., Wu, H. & Yuan, G. Detail-aware image denoising via structure preserved network and residual diffusion model. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03353-y

  • DOI: https://doi.org/10.1007/s00371-024-03353-y
