Detail-aware image denoising via structure preserved network and residual diffusion model

Wu, Jing; Wu, Hao; Yuan, Guowu

doi:10.1007/s00371-024-03353-y

Original article
Published: 18 April 2024

The Visual Computer (2024)

Abstract

The rapid development of deep learning has driven significant progress in image denoising, yielding strong performance on distortion metrics. However, most denoising models build their loss functions on pixel-by-pixel differences, which produces artifacts such as blurred edges and over-smoothing that are unsatisfactory to human perception. To address this issue, we prioritize visual perceptual quality and efficiently restore high-frequency details that may be lost during point-by-point denoising, while preserving the overall structure of the image. We introduce a structure preserved network that generates cost-effective initial predictions, which are then incorporated into a conditional diffusion model as a constraint that closely aligns with the actual images. This allows us to estimate the distribution of clean images more accurately by diffusing from the residuals. We observe that, by maintaining image consistency in the initial prediction, a residual diffusion model with lower complexity and fewer iterations suffices to restore detailed texture in the smoothed regions, ultimately yielding denoised samples with better visual perceptual quality. Our method is superior on metrics that match human perception, e.g., FID, and maintains its performance even at high noise levels, preserving the sharp edges and texture features of the image while reducing computational cost and equipment requirements. It thus achieves the denoising objective while also improving subjective visual quality.
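The idea of diffusing from residuals can be illustrated with a small NumPy sketch. It is not the authors' implementation: the box blur is only a stand-in for the structure preserved network, and the linear noise schedule, the step count and all function names are assumptions made for illustration. The sketch forms the residual between the clean image and an initial prediction, applies a standard DDPM-style forward noising step to that residual, and shows that a correct noise estimate recovers the residual and hence the clean image.

```python
import numpy as np


def box_blur(image, radius=1):
    """Stand-in for the initial prediction network (assumption): a mean filter."""
    padded = np.pad(image, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(image, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / size**2


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear variance schedule, as commonly used with DDPM."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(residual, t, alpha_bar, rng):
    """Forward diffusion step applied to the residual instead of the image."""
    eps = rng.standard_normal(residual.shape)
    noisy = np.sqrt(alpha_bar[t]) * residual + np.sqrt(1.0 - alpha_bar[t]) * eps
    return noisy, eps


def estimate_residual(noisy_residual, eps_hat, t, alpha_bar):
    """Invert the forward step, given an estimate of the injected noise."""
    return (noisy_residual - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])


rng = np.random.default_rng(0)
clean = rng.random((8, 8))                      # toy "clean" image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

initial = box_blur(noisy)                       # smooth initial prediction
residual = clean - initial                      # detail the diffusion model must restore

alpha_bar = np.cumprod(1.0 - linear_beta_schedule(100))
t = 50
noisy_residual, eps = q_sample(residual, t, alpha_bar, rng)

# A trained network would predict eps from (noisy_residual, t, initial);
# the true noise is used here as an oracle to check the algebra.
recovered = estimate_residual(noisy_residual, eps, t, alpha_bar)
restored = initial + recovered                  # final image = prediction + residual
```

In a real system the oracle noise would be replaced by the output of a conditional denoising network, and sampling would iterate over several time steps rather than inverting a single one.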

Data availability

All data generated or analyzed during this study are included in this published article.

Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 62061049, Grant No. 12263008), the Yunnan Provincial Department of Science and Technology–Yunnan University Joint Special Project for Double-Class Construction (Grant No. 202201BF070001-005) and the Practical Innovation Fund Project for Professional Degree Graduate Students of Yunnan University (Grant No. ZC-22221881).

Author information

Authors and Affiliations

School of Information Science and Engineering, Yunnan University, Kunming, China
Jing Wu, Hao Wu & Guowu Yuan

Corresponding author

Correspondence to Hao Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, J., Wu, H. & Yuan, G. Detail-aware image denoising via structure preserved network and residual diffusion model. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03353-y

Accepted: 28 February 2024
Published: 18 April 2024
DOI: https://doi.org/10.1007/s00371-024-03353-y
