Uncertainty Estimation in Medical Image Denoising with Bayesian Deep Image Prior

Laves, Max-Heinrich; Tölle, Malte; Ortmaier, Tobias

doi:10.1007/978-3-030-60365-6_9

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12443)

Abstract

Uncertainty quantification in inverse medical imaging tasks with deep learning has received little attention. However, deep models trained on large data sets tend to hallucinate and create artifacts in the reconstructed output that are not anatomically present. We use a randomly initialized convolutional network as parameterization of the reconstructed image and perform gradient descent to match the observation, which is known as deep image prior. In this case, the reconstruction does not suffer from hallucinations as no prior training is performed. We extend this to a Bayesian approach with Monte Carlo dropout to quantify both aleatoric and epistemic uncertainty. The presented method is evaluated on the task of denoising different medical imaging modalities. The experimental results show that our approach yields well-calibrated uncertainty. That is, the predictive uncertainty correlates with the predictive error. This allows for reliable uncertainty estimates and can tackle the problem of hallucinations and artifacts in inverse medical imaging tasks.

References

1. Agostinelli, F., Anderson, M.R., Lee, H.: Adaptive multi-column deep neural networks with application to robust image denoising. In: Advances in Neural Information Processing Systems, pp. 1493–1501 (2013)
2. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Boston (2006). https://doi.org/10.1007/978-1-4615-7566-5
3. Chang, S.G., Yu, B., Vetterli, M.: Adaptive wavelet thresholding for image denoising and compression. IEEE Trans. Image Process. 9(9), 1532–1546 (2000). https://doi.org/10.1109/83.862633
4. Cheng, Z., Gadelha, M., Maji, S., Sheldon, D.: A Bayesian perspective on the deep image prior. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5443–5451 (2019)
5. Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007). https://doi.org/10.1109/TIP.2007.901238
6. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In: ICML, pp. 1050–1059 (2016)
7. Gondara, L.: Medical image denoising using convolutional denoising autoencoders. In: International Conference on Data Mining Workshops, pp. 241–246 (2016). https://doi.org/10.1109/ICDMW.2016.0041
8. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: ICML, pp. 1321–1330 (2017)
9. van den Heuvel, T.L., de Bruijn, D., de Korte, C.L., Ginneken, B.v.: Automated measurement of fetal head circumference using 2D ultrasound images. PloS One 13(8), e0200412 (2018). https://doi.org/10.1371/journal.pone.0200412
10. Hogg, R.V., McKean, J., Craig, A.T.: Introduction to Mathematical Statistics, 8th edn. Pearson, New York (2018)
11. Jain, V., Seung, S.: Natural image denoising with convolutional networks. In: Advances in Neural Information Processing Systems, pp. 769–776 (2009)
12. Kendall, A., Gal, Y.: What uncertainties do we need in Bayesian deep learning for computer vision? In: NeurIPS, pp. 5574–5584 (2017)
13. Kermany, D.S., et al.: Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172(5), 1122–1131 (2018). https://doi.org/10.1016/j.cell.2018.02.010
14. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)
15. Laves, M.H., Ihler, S., Fast, J.F., Kahrs, L.A., Ortmaier, T.: Well-calibrated regression uncertainty in medical imaging with deep learning. In: Medical Imaging with Deep Learning (2020)
16. Laves, M.H., Ihler, S., Kahrs, L.A., Ortmaier, T.: Semantic denoising autoencoders for retinal optical coherence tomography. In: SPIE/OSA European Conference on Biomedical Optics, vol. 11078, pp. 86–89 (2019). https://doi.org/10.1117/12.2526936
17. Lee, S., Lee, M.S., Kang, M.G.: Poisson-Gaussian noise analysis and estimation for low-dose X-ray images in the NSCT domain. Sensors 18(4), 1019 (2018)
18. Lempitsky, V., Vedaldi, A., Ulyanov, D.: Deep image prior. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9446–9454 (2018). https://doi.org/10.1109/CVPR.2018.00984
19. Levi, D., Gispan, L., Giladi, N., Fetaya, E.: Evaluating and calibrating uncertainty prediction in regression tasks. arXiv:1905.11659 (2019)
20. Li, C., Chen, C., Carlson, D., Carin, L.: Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pp. 1788–1794 (2016)
21. Michailovich, O.V., Tannenbaum, A.: Despeckling of medical ultrasound images. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 53(1), 64–78 (2006). https://doi.org/10.1109/TUFFC.2006.1588392
22. Rabbani, H., Nezafat, R., Gazor, S.: Wavelet-domain medical image denoising using bivariate Laplacian mixture model. IEEE Trans. Biomed. Eng. 56(12), 2826–2837 (2009). https://doi.org/10.1109/TBME.2009.2028876
23. Salinas, H.M., Fernandez, D.C.: Comparison of PDE-based nonlinear diffusion approaches for image enhancement and denoising in optical coherence tomography. IEEE Trans. Med. Imaging 26(6), 761–771 (2007). https://doi.org/10.1109/TMI.2006.887375
24. Sotiras, A., Davatzikos, C., Paragios, N.: Deformable medical image registration: a survey. IEEE Trans. Med. Imaging 32(7), 1153–1190 (2013). https://doi.org/10.1109/TMI.2013.2265603
25. Wang, N., Tao, D., Gao, X., Li, X., Li, J.: A comprehensive survey to face hallucination. Int. J. Comput. Vis. 106(1), 9–30 (2014). https://doi.org/10.1007/s11263-013-0645-9
26. Welling, M., Teh, Y.W.: Bayesian learning via stochastic gradient Langevin dynamics. In: ICML, pp. 681–688 (2011)
27. Žabić, S., Wang, Q., Morton, T., Brown, K.M.: A low dose simulation tool for CT systems with energy integrating detectors. Med. Phys. 40(3), 031102 (2013). https://doi.org/10.1118/1.4789628
28. Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017). https://doi.org/10.1109/TIP.2017.2662206

Author information

Authors and Affiliations

Leibniz Universität Hannover, Hanover, Germany
Max-Heinrich Laves, Malte Tölle & Tobias Ortmaier

Corresponding author

Correspondence to Max-Heinrich Laves.

Editor information

Editors and Affiliations

University College London, London, UK
Carole H. Sudre
University of Oxford, Oxford, UK
Hamid Fehri
McGill University, Montreal, QC, Canada
Tal Arbel
ETH Zurich, Zürich, Switzerland
Christian F. Baumgartner
Massachusetts General Hospital, Charlestown, MA, USA
Adrian Dalca
University College London, London, UK
Ryutaro Tanno
Technical University of Denmark, Kongens Lyngby, Denmark
Koen Van Leemput
Harvard Medical School, Boston, MA, USA
William M. Wells
Washington University School of Medicine, St. Louis, MO, USA
Aristeidis Sotiras
University of Oxford, Oxford, UK
Bartlomiej Papiez
Ciudad Universitaria UNL, Santa Fe, Argentina
Enzo Ferrante
Huawei Noah’s Ark Lab, London, UK
Sarah Parisot

A Appendix

A.1 Additional Figures

(See Figs. 6, 7, 8 and 10)

A.2 Additional Tables

(See Tables 2, 3 and 4)

Table 2. PSNR with early-stopping.

Table 3. SSIM after convergence.

Table 4. SSIM with early-stopping.

A.3 SGLD with Step Size Decay

Additionally, we implement SGLD with step size decay as described by Welling and Teh [26]. The step size $ \epsilon $ scales the parameter update in the SGD step (i.e. it acts as the learning rate) and defines the variance of the noise that is injected into the gradients. Here, we reduce the step size exponentially at each step t with $ \epsilon _{t} = 0.999^{t} \epsilon _{0} $. To satisfy the step size property (Eq. (2) in [26]), we fix the step size once it decreases below 1e-8. We observe no overfitting of the noisy image with step size decay (see Fig. 11). However, the quality of the resulting denoised image is very sensitive to the decay scheme. Choosing a decay that is too slow (e.g. $ \epsilon _{t} = 0.9999^{t} \epsilon _{0} $) results in overfitting; a decay that is too fast (e.g. $ \epsilon _{t} = 0.99^{t} \epsilon _{0} $) results in convergence to a subpar reconstruction. Tuning the decay is therefore equivalent to carefully applied early stopping, which nullifies the advantage of SGLD for denoising of medical images.
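The decay schedule and the SGLD update described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the initial step size `eps0 = 1e-2`, the function names `step_size` and `sgld_sample`, and the toy standard-normal target are assumptions made here, and the floor is modeled by simply clamping the step size at 1e-8. The update follows the general form in [26], i.e. half the step size times the gradient of the log posterior plus Gaussian noise with variance equal to the step size.

```python
import numpy as np

def step_size(t, eps0=1e-2, gamma=0.999, eps_min=1e-8):
    """Exponentially decayed step size eps_t = gamma**t * eps0, clamped at eps_min.

    eps0 is a hypothetical starting value; the source does not state one.
    """
    return max(eps0 * gamma ** t, eps_min)

def sgld_sample(grad_log_post, theta0, n_steps, seed=0, **schedule):
    """Run SGLD and return all iterates as an array of shape (n_steps, dim)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for t in range(n_steps):
        eps = step_size(t, **schedule)
        # Injected noise has variance eps, i.e. standard deviation sqrt(eps).
        noise = rng.normal(0.0, np.sqrt(eps), size=theta.shape)
        theta = theta + 0.5 * eps * grad_log_post(theta) + noise
        samples.append(theta.copy())
    return np.stack(samples)

# Toy target (assumption): standard normal, so grad log p(theta) = -theta.
samples = sgld_sample(lambda th: -th, theta0=np.ones(2), n_steps=2000)
```

Passing `gamma=0.9999` or `gamma=0.99` to `sgld_sample` reproduces the slower and faster schedules mentioned in the text; in a denoising setting, `grad_log_post` would be replaced by the gradient of the network's log posterior.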

A.4 Downsampling

Here, we justify why downsampling an image by averaging neighboring pixels reduces the noise level and can therefore serve as an approximation of a noise-free ground truth image (at the cost of image resolution).

Proposition 1

Downsampling of an image reduces the observation noise.

Proof

Let $ X = \mu _{x} + \varepsilon _{x} $ and $ Y = \mu _{y} + \varepsilon _{y} $ be two neighboring pixels affected by additive i.i.d. noise $ \varepsilon _{x} , \varepsilon _{y} \sim \mathcal {N}(0, \sigma ^{2}) $. The noise is assumed to be uncorrelated with the pixel values. Pixels in a local neighborhood are highly correlated and are assumed to be similar, i.e. $ \mu _{x} \approx \mu _{y} = \mu $. Let $ Z = \tfrac{1}{2} \left( X + Y \right) $ be the average of two neighboring pixels (i.e. the result of downsampling). The expectation is given by

$$\begin{aligned} \mathbb {E}[Z]&= \frac{1}{2} \left( \mathbb {E}[X] + \mathbb {E}[Y] \right) \end{aligned}$$

(9)

$$\begin{aligned}&= \frac{1}{2} 2 \, \mathbb {E}[X] \end{aligned}$$

(10)

$$\begin{aligned}&= \mu \end{aligned}$$

(11)

and the variance is given by

$$\begin{aligned} \mathrm {Var}\left[ Z\right]&= \mathrm {Var}\left[ \frac{1}{2} \left( X + Y \right) \right] \end{aligned}$$

(12)

$$\begin{aligned}&= \frac{1}{2^{2}} \left( \mathrm {Var}\left[ X\right] + \mathrm {Var}\left[ Y\right] \right) \end{aligned}$$

(13)

$$\begin{aligned}&= \frac{1}{2^{2}} 2 \mathrm {Var}\left[ X\right] \end{aligned}$$

(14)

$$\begin{aligned}&= \frac{1}{2} \sigma ^{2} ~ . \end{aligned}$$

(15)

Thus, if the similarity of neighboring pixels is sufficiently high, the averaged pixel Z has half the noise variance of a single pixel, i.e. downsampling reduces the noise variance by a factor of 2. $\square $
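Proposition 1 can be checked numerically. The sketch below assumes the idealized setting of the proof (a constant underlying intensity with i.i.d. Gaussian noise); the values `mu = 0.5` and `sigma = 0.1`, the image size, and the helper name `downsample_pairs` are illustrative choices, not taken from the paper.

```python
import numpy as np

def downsample_pairs(img):
    """Average non-overlapping pairs of horizontally neighboring pixels."""
    return 0.5 * (img[..., 0::2] + img[..., 1::2])

rng = np.random.default_rng(0)
mu, sigma = 0.5, 0.1  # hypothetical intensity and noise level
noisy = mu + rng.normal(0.0, sigma, size=(1000, 1000))
down = downsample_pairs(noisy)

# Ratio of empirical variances; Eq. (15) predicts a value close to 1/2.
var_ratio = down.var() / noisy.var()
print(var_ratio)
```

On real images the neighboring intensities are only approximately equal, so the empirical ratio would also include the signal variation that the proof neglects.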

Naturally, two neighboring pixels are not exactly equal. However, downsampling can also be viewed as superposing two signals, each with a highly correlated part and an uncorrelated part. Without providing proof, the amplitude of the sum of two signals can be viewed as a vector addition. In the uncorrelated case, the two signals are perpendicular to each other; in the correlated case, the angle between them is acute. Thus, the correlated parts of the two signals contribute more to the resulting sum than the uncorrelated (noise) parts. In the ideal case, where the noise is uncorrelated and the signals are parallel, the same noise reduction as above follows.
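The vector-addition picture can be made explicit. As a sketch under assumptions not stated in the source, let $ A $ and $ B $ be two zero-mean signals with equal standard deviation $ s $ and correlation coefficient $ \rho $ (the symbols $ A $, $ B $, $ s $, $ \rho $ and $ \theta $ are introduced here for illustration only). Then

$$\begin{aligned} \mathrm {Var}\left[ A + B\right]&= \mathrm {Var}\left[ A\right] + \mathrm {Var}\left[ B\right] + 2 \, \mathrm {Cov}\left[ A, B\right] = 2 s^{2} \left( 1 + \rho \right) , \\ \sqrt{\mathrm {Var}\left[ A + B\right] }&= s \sqrt{2 \left( 1 + \cos \theta \right) } , \quad \cos \theta = \rho ~ . \end{aligned}$$

For uncorrelated parts ($ \rho = 0 $, perpendicular vectors) the amplitude of the sum is $ \sqrt{2} \, s $, whereas for perfectly correlated parts ($ \rho = 1 $, parallel vectors) it is $ 2 s $. After dividing by 2 to form the average, the correlated part keeps its amplitude, while uncorrelated noise with $ s = \sigma $ is reduced to $ \sigma / \sqrt{2} $, i.e. to the variance $ \sigma ^{2} / 2 $ of Eq. (15).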

A.5 Link Between Poisson Distribution and Normal Distribution

To simulate a low-dose X-ray image, we approximate the Poisson noise with a Normal distribution. It is well known that the limiting distribution of a standardized $ \mathsf {Poisson}(\lambda ) $ random variable is Normal as $ \lambda \rightarrow \infty $ [10]. For completeness, we give a common proof using the moment generating function of a standardized Poisson random variable:

Theorem 1

The Poisson($\lambda $) distribution can be approximated with a Normal distribution as $ \lambda \rightarrow \infty $.

Proof

Let $ X_{\lambda } \sim \mathsf {Poisson}(\lambda ), ~ \lambda \in \{ 1, 2, \ldots \} $. The probability mass function of $ X_{\lambda } $ is given by

$$\begin{aligned} f_{X_{\lambda }}(x) = \frac{\lambda ^{x}e^{-\lambda }}{x!} \quad x \in \{ 0, 1, 2, \ldots \} ~ . \end{aligned}$$

(16)

The moment generating function is given by [10]

$$\begin{aligned} M_{X_{\lambda }}(t) = \mathbb {E} [ e^{t X_{\lambda }} ] = e^{\lambda (e^{t}-1)} ~ . \end{aligned}$$

(17)

The standardized Poisson random variable

$$\begin{aligned} Z = \frac{X_{\lambda } - \lambda }{\sqrt{\lambda }} \end{aligned}$$

(18)

has the limiting moment generating function

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } M_{Z} (t)&= \lim _{\lambda \rightarrow \infty } \mathbb {E} \left[ \exp {\left( t \cdot \frac{X_{\lambda } - \lambda }{\sqrt{\lambda }} \right) } \right] \end{aligned}$$

(19)

$$\begin{aligned}&= \lim _{\lambda \rightarrow \infty } \exp { \left( -t \sqrt{\lambda } \right) } \mathbb {E} \left[ \exp {\left( \frac{t X_{\lambda }}{\sqrt{\lambda }} \right) } \right] \end{aligned}$$

(20)

$$\begin{aligned}&= \lim _{\lambda \rightarrow \infty } \exp { \left( -t \sqrt{\lambda } \right) } \exp {\left( \lambda \left( e^{t/\sqrt{\lambda }} - 1 \right) \right) } \end{aligned}$$

(21)

$$\begin{aligned}&= \lim _{\lambda \rightarrow \infty } \exp { \left( -t \sqrt{\lambda } + \lambda \left( t \lambda ^{-1/2} + t^{2} \lambda ^{-1}/2 + t^{3} \lambda ^{-3/2}/6 + \ldots \right) \right) } \end{aligned}$$

(22)

$$\begin{aligned}&= \lim _{\lambda \rightarrow \infty } \exp { \left( t^{2} / 2 + t^{3}\lambda ^{-1/2}/6 + \ldots \right) } \end{aligned}$$

(23)

$$\begin{aligned}&= \exp {\left( t^{2} / 2 \right) } \end{aligned}$$

(24)

which is the moment generating function of a standard normal random variable. $\square $
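The convergence in Eqs. (19) to (24) can be illustrated numerically by evaluating the closed form from Eq. (21) for growing $ \lambda $ and comparing it with $ \exp (t^{2}/2) $. This is only a sanity check under the stated formulas; the function names and the choice `t = 1.0` are assumptions made for the example, and `math.expm1` is used to avoid cancellation when $ t/\sqrt{\lambda } $ is small.

```python
import math

def mgf_standardized_poisson(t, lam):
    """M_Z(t) = exp(-t*sqrt(lam) + lam*(exp(t/sqrt(lam)) - 1)), cf. Eq. (21)."""
    s = math.sqrt(lam)
    return math.exp(-t * s + lam * math.expm1(t / s))

def mgf_standard_normal(t):
    """Moment generating function of N(0, 1), cf. Eq. (24)."""
    return math.exp(0.5 * t * t)

t = 1.0  # hypothetical evaluation point
lambdas = (1, 10, 100, 10_000, 1_000_000)
gaps = [abs(mgf_standardized_poisson(t, lam) - mgf_standard_normal(t))
        for lam in lambdas]
for lam, gap in zip(lambdas, gaps):
    print(f"lambda={lam:>9}: |M_Z(t) - exp(t^2/2)| = {gap:.2e}")
```

The gap shrinks roughly like $ t^{3} \lambda ^{-1/2} / 6 $, which is the leading remainder term in Eq. (23).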

About this paper

Cite this paper

Laves, MH., Tölle, M., Ortmaier, T. (2020). Uncertainty Estimation in Medical Image Denoising with Bayesian Deep Image Prior. In: Sudre, C.H., et al. Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Graphs in Biomedical Image Analysis. UNSURE 2020, GRAIL 2020. Lecture Notes in Computer Science, vol 12443. Springer, Cham. https://doi.org/10.1007/978-3-030-60365-6_9

DOI: https://doi.org/10.1007/978-3-030-60365-6_9
Published: 05 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60364-9
Online ISBN: 978-3-030-60365-6
eBook Packages: Computer Science, Computer Science (R0)
