1 Introduction

Bayesian image processing based on Markov random fields (MRFs) is an important framework in the field of image processing [1, 2]. An MRF is an undirected graphical representation of a probability distribution, and MRFs have many applications in image processing and computer vision [3, 4, 5]. MRFs have also been applied to other research fields, including traffic engineering [6, 7] and earth science [8, 9]. In Bayesian image processing, the objective image is inferred from the posterior probability distribution.

Recently, we proposed a fast image denoising method, based on Gaussian MRFs, for the case where multiple noisy images are available for inferring the original noiseless image [10]. However, in [10], for ease of mathematical treatment, we made the unnatural assumption that pixel values are continuous. In general, a pixel takes a discrete value from 0 to 255, and an additional framework is required to treat pixel values as discrete rather than continuous. Therefore, in this paper, we focus on the Bayesian image denoising problem of inferring the original noiseless image from multiple noisy images when the pixel values are treated as discrete. We construct a probability model for image denoising based on a discrete MRF and the Bayesian perspective. A major disadvantage of image processing models based on discrete MRFs is their computational complexity; in fact, the inference problem for discrete MRFs is NP-hard. Therefore, an approximate inference technique is required to infer the objective image from a discrete MRF. Belief propagation [11] is known as one such effective technique. In this paper, we propose an efficient image denoising algorithm for multiple noisy images that applies belief propagation. The main contributions of this paper are the definition of an MRF model for image denoising with multiple noisy images and a fast denoising algorithm based on our discrete MRF model and the fast Fourier transform (FFT).

The remainder of this paper is organized as follows. In Sect. 2, we define a probability model for image denoising with multiple noisy images based on the discrete MRF and Bayesian perspective. In Sect. 3, we derive an image denoising algorithm based on the posterior probability distribution defined in Sect. 2. In Sect. 4, we describe the framework for estimating the parameters in the posterior probability distribution. We explain the implementation of our denoising method using the FFT in Sect. 5. In Sect. 6, we describe the numerical verification of the performance of the proposed method. Finally, in Sect. 7, we present our concluding remarks.

2 Framework of Bayesian Image Denoising Method

In this section, we briefly explain the framework of the Bayesian image denoising method for the case where multiple noisy images are available. Suppose that we have K degraded images that are obtained independently by adding additive white Gaussian noise (AWGN) to the original image. We assume that the images are composed of \(N = h \times w\) pixels. Let \(\varvec{x} = \left[ x_1 \ x_2 \ \cdots \ x_N \right] ^{T}\) and \(\varvec{y}^{(k)} = \left[ y^{(k)}_1 \ y^{(k)}_2 \ \cdots \ y^{(k)}_N \right] ^{T}\) be the N-dimensional vectors corresponding to the original image and the k-th noisy image, respectively. The vectors \(\varvec{x}\) and \(\varvec{y}^{(k)}\) are obtained by raster scanning the images. We assume that each \(x_i \left( i = 1, 2, \dots , N \right)\) takes one of L discrete values from 0 to \(L-1\).

The purpose of image denoising is to infer the original noiseless image \(\varvec{x}\) from the K noisy images \(Y = \left\{ \varvec{y}^{(1)}, \varvec{y}^{(2)}, \dots , \varvec{y}^{(K)} \right\}\). In the Bayesian framework, the original image \(\varvec{x}\) can be inferred using the posterior probability distribution \(P \left( \varvec{x} | Y \right)\), which is expressed as

$$\begin{aligned} P \left( \varvec{x} | Y \right) = \frac{P \left( Y | \varvec{x} \right) P \left( \varvec{x} \right) }{\sum _{\varvec{x}} P \left( Y | \varvec{x} \right) P\left( \varvec{x} \right) }, \end{aligned}$$
(1)

where \(\sum _{\varvec{x}}\) denotes the summation over all \(L^N\) possible states of \(\varvec{x}\). The framework of the proposed Bayesian image denoising method is illustrated in Fig. 1. From the definition of \(\varvec{y}^{(k)}\), the probability density function \(P \left( Y | \varvec{x} \right)\) is expressed as

$$\begin{aligned} P \left( Y | \varvec{x}; \sigma ^2 \right)&= \prod _{k=1}^K P \left( \varvec{y}^{(k)} \Big | \varvec{x}; \sigma ^2 \right) \nonumber \\&= \prod _{k=1}^K \prod _{i \in V} \frac{1}{\sqrt{2 \pi \sigma ^2}} \exp \left( - \frac{1}{2 \sigma ^2} \left( y^{(k)}_i - x_i \right) ^2 \right) , \end{aligned}$$
(2)

where \(V = \left\{ 1, 2, \dots , N \right\}\) and \(\sigma ^2\) is the variance of the AWGN. We denote the parameters of a probability model by the arguments after the semicolon, as in Eq. (2). We define the prior probability distribution as

$$\begin{aligned} P \left( \varvec{x}; \alpha \right) = \frac{1}{Z_{\mathrm{prior}} \left( \alpha \right) } \exp \left( - \alpha \sum _{ij \in E} \phi \left( x_i - x_j \right) \right) , \end{aligned}$$
(3)

where E is the set of edges of the \(h \times w\) lattice graph and \(\phi \left( x \right)\) is an even convex function that takes its minimum at \(x=0\). In this study, we assume the periodic boundary condition on the graph structure E, as illustrated in Fig. 2. \(\alpha > 0\) is the parameter of the prior probability distribution; if \(\alpha\) is large, neighboring \(x_i\) and \(x_j\) tend to take close values. \(Z_{\mathrm{prior}} \left( \alpha \right)\) is the normalization constant defined as

$$\begin{aligned} Z_{\mathrm{prior}} \left( \alpha \right) = \sum _{ \varvec{x} } \exp \left( - \alpha \sum _{ij \in E} \phi \left( x_i - x_j \right) \right) . \end{aligned}$$
(4)
Fig. 1 Illustration of the proposed Bayesian image denoising method

By substituting Eqs. (2) and (3) into Eq. (1), the posterior probability distribution \(P \left( \varvec{x} | Y \right)\) is expressed as

$$\begin{aligned}&P \left( \varvec{x} | Y; \alpha , \sigma ^2 \right) \nonumber \\&\quad = \frac{1}{Z_{\mathrm{post}} \left( \alpha , \sigma ^2 \right) } \exp \left( -\frac{1}{2 \sigma ^2} \sum _{i \in V} \psi _i \left( x_i \right) - \alpha \sum _{ij \in E} \phi \left( x_i - x_j \right) \right) , \end{aligned}$$
(5)

where

$$\begin{aligned} \psi _i \left( x_i \right) = \sum _{k=1}^K \left( x_i - y^{(k)}_i \right) ^2, \end{aligned}$$
(6)

and

$$\begin{aligned} Z_{\mathrm{post}} \left( \alpha , \sigma ^2 \right) = \sum _{\varvec{x}} \exp \left( -\frac{1}{2 \sigma ^2} \sum _{i \in V} \psi _i \left( x_i \right) - \alpha \sum _{ij \in E} \phi \left( x_i - x_j \right) \right) , \end{aligned}$$
(7)

respectively.
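As an illustration, the data term in Eq. (6) can be precomputed for all pixels and all L candidate values in one pass over the noisy images. The following is a minimal NumPy sketch (all names are illustrative; the experiments in Sect. 6 were implemented in C++):

```python
import numpy as np

def data_term(Y, L):
    """psi_i(x_i) of Eq. (6), returned as an (N, L) table.

    Y: array of shape (K, N) holding the K raster-scanned noisy images.
    """
    labels = np.arange(L)                          # candidate values x_i = 0, ..., L-1
    # broadcast (K, N, 1) against (L,) -> (K, N, L), then sum over k
    return ((labels - Y[:, :, None]) ** 2).sum(axis=0)
```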

Fig. 2 Periodic boundary condition for the \(h \times w\) lattice graph E, when \(h=4\) and \(w=4\)

3 Inference Algorithm Based on Belief Propagation

Image denoising is achieved by finding the image \(\widehat{ \varvec{x} }\) that maximizes the posterior probability distribution in Eq. (5):

$$\begin{aligned} \widehat{ \varvec{x} } = \mathop {\mathrm{argmax}}\limits _{ \varvec{x} } P \left( \varvec{x} | Y; \alpha , \sigma ^2 \right) . \end{aligned}$$
(8)

However, the problem of determining such an image is intractable, because this maximization problem belongs to the NP-hard class. Therefore, we need an approximate inference method to find \(\widehat{ \varvec{x} }\). In this section, we explain an effective approximate inference method called belief propagation for inferring the denoised image \(\widehat{ \varvec{x} }\).

Belief propagation is a method of computing the approximate marginal distributions \(b_i \left( x_i \right)\) and \(b_{ij} \left( x_i, x_j \right)\) for each \(i \in V\) and \(ij \in E\). In the belief propagation framework, the approximate marginal distributions \(b_i \left( x_i \right)\) and \(b_{ij} \left( x_i, x_j \right)\) are given by

$$\begin{aligned} b_i \left( x_i \right) =&\frac{1}{Z_i} \exp \left( -\frac{1}{2 \sigma ^2} \psi _i \left( x_i \right) \right) \prod _{k \in \partial i} M_{k \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(9)
$$\begin{aligned} Z_i =&\sum _{x_i} \exp \left( -\frac{1}{2 \sigma ^2} \psi _i \left( x_i \right) \right) \prod _{k \in \partial i} M_{k \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(10)

and

$$\begin{aligned} b_{ij} \left( x_i, x_j \right) =&\frac{1}{Z_{ij}} \exp \left( - \alpha \phi \left( x_i - x_j \right) \right) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(11)
$$\begin{aligned} Z_{ij} =&\sum _{x_i} \sum _{ x_j } \exp \left( - \alpha \phi \left( x_i - x_j \right) \right) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(12)

respectively, where \(\partial i = \left\{ k \in V | ik \in E \right\}\) is the set of all neighboring pixels of pixel i. \(M_{k \rightarrow i}^{\mathrm{post}} \left( x_i \right)\) in Eqs. (9) and (10) is the message from pixel k to pixel i and is obtained as the convergence point of the message update rule

$$\begin{aligned} M_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right) =&\frac{1}{Z_{j \rightarrow i}} \sum _{x_j} \exp \left( - \alpha \phi \left( x_i - x_j \right) \right) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(13)
$$\begin{aligned} m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) =&\exp \left( -\frac{1}{2 \sigma ^2} \psi _i \left( x_i \right) \right) \prod _{k \in \partial i \backslash \left\{ j \right\} } M_{k \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(14)

where \(Z_{j \rightarrow i}\) is a normalization constant to ensure algorithmic stability. The estimation of the denoised image \(\widehat{ \varvec{x} } = \left\{ \widehat{x}_i | i \in V \right\}\) is achieved by finding

$$\begin{aligned} \widehat{x}_i = \mathop {\mathrm{argmax}}\limits _{x_i} b_i \left( x_i \right) , \end{aligned}$$
(15)

for \(i \in V\) in the belief propagation framework.
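As a concrete illustration of Eqs. (9) and (13)–(15), the following NumPy sketch performs one naive message update and the final per-pixel decision. It is a minimal sketch under stated assumptions: the variance sigma2, the kernel table f_vals, and the neighbor structure are assumed given, and all names are illustrative.

```python
import numpy as np

def update_message(M, psi, f_vals, sigma2, j, i, neighbors):
    """Naive O(L^2) update of M_{j->i}(x_i), Eqs. (13)-(14).

    M[(k, i)]: length-L message from pixel k to pixel i;
    psi[i]: length-L data term of Eq. (6);
    f_vals[d + L - 1] = exp(-alpha * phi(d)) for d = -(L-1), ..., L-1;
    neighbors[j]: list of the pixels adjacent to pixel j on the lattice.
    """
    L = psi.shape[1]
    m_cav = np.exp(-psi[j] / (2.0 * sigma2))       # cavity message, Eq. (14)
    for k in neighbors[j]:
        if k != i:
            m_cav = m_cav * M[(k, j)]
    xj = np.arange(L)
    new = np.array([np.dot(f_vals[xi - xj + L - 1], m_cav)   # sum over x_j, Eq. (13)
                    for xi in range(L)])
    return new / new.sum()                         # normalization Z_{j->i}

def decide_pixel(M, psi, sigma2, i, neighbors):
    """x_hat_i = argmax of the belief b_i(x_i), Eqs. (9) and (15)."""
    b = np.exp(-psi[i] / (2.0 * sigma2))
    for k in neighbors[i]:
        b = b * M[(k, i)]
    return int(np.argmax(b))
```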

4 Parameter Estimation Using Expectation–Maximization Algorithm

In the preceding section, we explained the method for inferring the denoised image \(\widehat{ \varvec{x} }\) based on belief propagation. In our framework, the denoised image is inferred from the posterior probability distribution in Eq. (5), which has two parameters, \(\alpha\) and \(\sigma ^2\). It is obvious that the inferred denoised image \(\widehat{ \varvec{x} }\) depends on these parameters. In this section, we explain the method for determining these parameters from degraded images Y based on the expectation–maximization (EM) algorithm [12].

The EM algorithm is a statistical method for computing the maximum likelihood estimates

$$\begin{aligned} \widehat{ \alpha }, \widehat{ \sigma }^2 = \mathop {\mathrm{argmax}}\limits _{ \alpha , \sigma ^2 } \sum _{ \varvec{x} } P \left( \varvec{x}, Y; \alpha , \sigma ^2 \right) , \end{aligned}$$
(16)

by iteration. In the EM framework, the parameters \(\alpha\) and \(\sigma ^2\) are estimated by iterative maximization of the Q function defined as

$$\begin{aligned} Q \left( \alpha , \sigma ^2; \alpha _t, \sigma ^2_t \right) =&\sum _{ \varvec{x} } P \left( \varvec{x} | Y; \alpha _t, \sigma ^2_t \right) \log P \left( \varvec{x}, Y; \alpha , \sigma ^2 \right) \nonumber \\ =&-\frac{1}{2 \sigma ^2} \sum _{i \in V} \left\langle \psi _i \left( x_i \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} - \frac{NK}{2} \log \sigma ^2 \nonumber \\&- \alpha \sum _{ij \in E} \left\langle \phi \left( x_i - x_j \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} - \log Z_{\mathrm{prior}} \left( \alpha \right) \nonumber \\&+ \hbox {Const}., \end{aligned}$$
(17)

where \(\alpha _t\) and \(\sigma ^2_t\) are estimates of the parameters at the t-th iteration and

$$\begin{aligned} \left\langle f \left( \varvec{x} \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} = \sum _{ \varvec{x} } f \left( \varvec{x} \right) P \left( \varvec{x} | Y; \alpha _t, \sigma ^2_t \right) . \end{aligned}$$
(18)

Using belief propagation, we can approximate the expectations in Eq. (17) as

$$\begin{aligned} \left\langle \psi _i \left( x_i \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} =&\sum _{x_i} \psi _i \left( x_i \right) b_i^{(t)} \left( x_i \right) , \end{aligned}$$
(19)

and

$$\begin{aligned} \left\langle \phi \left( x_i - x_j \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} =&\sum _{ x_i } \sum _{ x_j } \phi \left( x_i - x_j \right) b_{ij}^{(t)} \left( x_i, x_j \right) , \end{aligned}$$
(20)

respectively, where \(b_i^{(t)} \left( x_i \right)\) and \(b_{ij}^{(t)} \left( x_i, x_j \right)\) are the approximate marginal distributions of the posterior probability distribution \(P \left( \varvec{x} | Y ; \alpha _t, \sigma ^2_t \right)\) computed using Eqs. (9) and (11). The parameter update at iteration t is given by

$$\begin{aligned} \alpha _{t+1}, \sigma ^2_{t+1} = \mathop {\mathrm{argmax}}\limits _{ \alpha , \sigma ^2 } Q \left( \alpha , \sigma ^2; \alpha _t, \sigma ^2_t \right) , \end{aligned}$$
(21)

and the maximum likelihood estimates in Eq. (16) are obtained as the convergence point of this iterative estimation. By differentiating the Q function with respect to \(\alpha\) and \(\sigma ^2\) and setting the derivatives to zero, the updated parameter \(\alpha _{t+1}\) in Eq. (21) is found as the solution of the equation

$$\begin{aligned} \sum _{ij \in E} \left\langle \phi \left( x_i - x_j \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}} = \sum _{ij \in E} \left\langle \phi \left( x_i - x_j \right) ; \alpha \right\rangle _{\mathrm{prior}}, \end{aligned}$$
(22)

and the updated parameter \(\sigma ^2_{t+1}\) is calculated as

$$\begin{aligned} \sigma ^2_{t+1} = \frac{1}{NK} \sum _{i \in V } \left\langle \psi _i \left( x_i \right) ; \alpha _t, \sigma ^2_t \right\rangle _{\mathrm{post}}, \end{aligned}$$
(23)

where \(\left\langle \phi \left( x_i - x_j \right) ; \alpha \right\rangle _{\mathrm{prior}} = \sum _{\varvec{x}} \phi \left( x_i - x_j \right) P \left( \varvec{x}; \alpha \right)\); this expectation can also be computed approximately using belief propagation, similarly to Eq. (20). Using the bisection method, we can easily find the \(\alpha _{t+1}\) that satisfies Eq. (22).
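For concreteness, one EM iteration can be sketched as follows. The sketch assumes the posterior expectations of Eqs. (19) and (20) have already been computed by belief propagation, and that a routine prior_expectation(alpha) evaluating the right-hand side of Eq. (22) on the prior is given (all names are illustrative; the bisection exploits the fact that \(\langle \phi \rangle_{\mathrm{prior}}\) decreases as \(\alpha\) grows):

```python
import numpy as np

def em_step(exp_psi_post, exp_phi_post, prior_expectation, alpha_t, N, K):
    """One parameter update, Eqs. (21)-(23).

    exp_psi_post: length-N array of <psi_i(x_i)>_post, Eq. (19);
    exp_phi_post: scalar sum over edges of <phi(x_i - x_j)>_post, Eq. (20);
    prior_expectation(alpha): sum over edges of <phi>_prior, Eq. (22).
    """
    sigma2_next = exp_psi_post.sum() / (N * K)     # Eq. (23)
    # bisection on [0, 2*alpha_t] for Eq. (22), as in Sect. 6
    lo, hi = 0.0, 2.0 * alpha_t
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if prior_expectation(mid) > exp_phi_post:
            lo = mid                               # alpha still too small
        else:
            hi = mid
    return 0.5 * (lo + hi), sigma2_next
```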

5 Proposed Algorithm: Fast Implementation Based on the Fast Fourier Transform

The image denoising algorithm based on our probabilistic model in Eq. (5), described in Sects. 2–4, is summarized in Algorithm 1, together with the difference in computation time between the naive implementation and the proposed method, which is explained in this section. The worst-case computation time of the naive implementation of this algorithm is \(O \left( T_{{\mathrm{EM}}} T_{{\mathrm{BP}}} N L^2 \right)\), where \(T_{{\mathrm{EM}}}\) and \(T_{{\mathrm{BP}}}\) are the maximum numbers of iterations for the parameter update in Eq. (21) and the message update in Eq. (13), respectively. In Algorithm 1, we terminate the parameter and message updates after \(T_{{\mathrm{EM}}}\) and \(T_{{\mathrm{BP}}}\) iterations, respectively. Because we assume the periodic boundary condition on the graph structure E, the number of edges is \(\left| E \right| = 2N\); hence, the total number of messages is O(N). The computation time of the naive message update from pixel j to pixel i is \(O \left( L^2 \right)\), because Eq. (13) must be evaluated for each \(x_i = 0, 1, \dots , L-1\) to update a message \(M_{j \rightarrow i}^{\mathrm{post}} ( x_i )\).

Algorithm 1

It should be noted that the computation time of the message update can be reduced to \(O \left( L \log L \right)\) using the FFT [13]. It has been confirmed that this FFT-based approach indeed accelerates the message computation for the probabilistic image denoising model in Eq. (5) in the case \(K=1\) [14]. However, Algorithm 1 contains additional \(O ( L^2 )\) computations: Steps 11 and 12. In this section, we show that these \(O ( L^2 )\) computations can also be performed in \(O ( L \log L )\) time using the FFT. Consequently, the worst-case computation time of Algorithm 1 is reduced to \(O \left( T_{{\mathrm{EM}}} T_{{\mathrm{BP}}} N L \log L \right)\).

The key idea for accelerating the message updates in Eq. (13) is to view the update rule as a convolution. If we define the function \(f(x; \alpha )\) as

$$\begin{aligned} f( x; \alpha ) = \exp \left( - \alpha \phi ( x ) \right) , \end{aligned}$$
(24)

we can reformulate the message update rules as

$$\begin{aligned} m_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right) =&\sum _{x_j} f(x_i - x_j; \alpha ) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(25)
$$\begin{aligned} M_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right) =&\left. m_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right) \bigg / \sum _{l=0}^{L-1} m_{j \rightarrow i}^{\mathrm{post}} \left( l\right) \right. . \end{aligned}$$
(26)

The calculation of \(m_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right)\) in Eq. (25) is a convolution. Therefore, we can calculate \(m_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right)\) for \(x_i = 0, 1, \dots , L-1\) in \(O \left( L \log L \right)\) computation time using the FFT, and the computation of \(M_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right)\) in Eq. (26) is linear in L. Hence, a message \(M_{j \rightarrow i}^{\mathrm{post}} \left( x_i \right)\) can be updated in \(O \left( L \log L \right)\) computation time.
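To make the convolution explicit, the following NumPy sketch implements Eqs. (25)–(26) with an FFT. Because \(x_i - x_j\) ranges over \(-(L-1), \dots , L-1\), the sum is a linear (not circular) convolution, so the arrays are zero-padded to length \(3L-2\) before transforming (all names are illustrative):

```python
import numpy as np

def fft_message_update(m_cav, f_vals, L):
    """Eqs. (25)-(26) in O(L log L) time.

    m_cav: length-L cavity message of Eq. (14);
    f_vals[d + L - 1] = exp(-alpha * phi(d)) for d = -(L-1), ..., L-1.
    """
    n = 3 * L - 2                                  # pad so no circular wrap occurs
    full = np.fft.irfft(np.fft.rfft(f_vals, n) * np.fft.rfft(m_cav, n), n)
    m_new = full[L - 1 : 2 * L - 1]                # entries for x_i = 0, ..., L-1
    return m_new / m_new.sum()                     # normalization, Eq. (26)
```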

Now, we show that the expectation in Eq. (20) can be calculated in \(O \left( L \log L \right)\) computation time using the FFT. By substituting Eq. (11) into Eq. (20), the expectation can be expressed as

$$\begin{aligned}&\left\langle \phi \left( x_i - x_j \right) ; \alpha , \sigma ^2 \right\rangle _{\mathrm{post}} = \frac{G}{H}, \end{aligned}$$
(27)
$$\begin{aligned}&G = \sum _{x_i} \sum _{x_j} \phi \left( x_i - x_j \right) f( x_i - x_j; \alpha ) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(28)
$$\begin{aligned}&H = Z_{ij} = \sum _{x_i} \sum _{x_j} f( x_i - x_j; \alpha ) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) . \end{aligned}$$
(29)

If we define functions \(g(x_i)\) and \(h(x_i)\) as

$$\begin{aligned} g ( x_i ) =&\sum _{x_j} \phi \left( x_i - x_j \right) f( x_i - x_j; \alpha ) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(30)
$$\begin{aligned} h ( x_i ) =&\sum _{x_j} f( x_i - x_j; \alpha ) m_{\backslash i \rightarrow j}^{\mathrm{post}} \left( x_j \right) , \end{aligned}$$
(31)

respectively, we can reformulate the expectation calculation as

$$\begin{aligned} G = \sum _{x_i} g ( x_i ) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(32)

and

$$\begin{aligned} H = \sum _{x_i} h ( x_i ) m_{\backslash j \rightarrow i}^{\mathrm{post}} \left( x_i \right) , \end{aligned}$$
(33)

respectively. Therefore, the computation time of the expectation in Eq. (27) is \(O \left( L \log L \right)\): the convolutions \(g(x_i)\) and \(h(x_i)\) in Eqs. (30) and (31) can be computed for all \(x_i = 0,1,\dots ,L-1\) in \(O \left( L \log L \right)\) total time, and Eqs. (32) and (33) can then be computed in \(O \left( L \right)\) time.
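The corresponding sketch for Eqs. (27)–(33) reuses the same zero-padded FFT, once with the kernel \(\phi \cdot f\) for g and once with f for h (names are again illustrative):

```python
import numpy as np

def edge_expectation(m_ci_to_j, m_cj_to_i, phi_vals, f_vals, L):
    """<phi(x_i - x_j)>_post = G / H, Eqs. (27)-(33), in O(L log L) time.

    m_ci_to_j, m_cj_to_i: the two length-L cavity messages on edge ij;
    phi_vals[d + L - 1] = phi(d) and f_vals[d + L - 1] = exp(-alpha*phi(d))
    for offsets d = -(L-1), ..., L-1.
    """
    n = 3 * L - 2
    Mj = np.fft.rfft(m_ci_to_j, n)
    g = np.fft.irfft(np.fft.rfft(phi_vals * f_vals, n) * Mj, n)[L - 1 : 2 * L - 1]
    h = np.fft.irfft(np.fft.rfft(f_vals, n) * Mj, n)[L - 1 : 2 * L - 1]
    return np.dot(g, m_cj_to_i) / np.dot(h, m_cj_to_i)    # G / H, Eq. (27)
```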

In Eq. (22), to find \(\alpha _{t+1}\) using the bisection method, we need to calculate the messages and the expectation of the prior probability distribution \(P \left( \varvec{x}; \alpha _t \right)\). It should be noted that, because we assume the periodic boundary condition on the graph structure E, these quantities can be calculated faster for the prior than for the posterior by exploiting translational symmetry. Under both the periodic boundary condition and translational symmetry, the messages and the expectation of the prior probability distribution do not depend on the position of the edge \(ij \in E\). Therefore, the message update rule and the expectation calculation for the prior probability distribution are expressed as

$$\begin{aligned} M^{\mathrm{prior}} \left( x_i \right) =&\frac{1}{Z^{\mathrm{prior}}} \sum _{x_j} f( x_i - x_j; \alpha ) \left( M^{\mathrm{prior}} \left( x_j \right) \right) ^3, \end{aligned}$$
(34)
$$\begin{aligned} Z^{\mathrm{prior}} =&\sum _{x_i} \sum _{x_j} \exp \left( - \alpha \phi \left( x_i - x_j \right) \right) \left( M^{\mathrm{prior}} \left( x_j \right) \right) ^3, \end{aligned}$$
(35)

and

$$\begin{aligned}&\sum _{ij \in E} \left\langle \phi \left( x_i - x_j \right) ; \alpha \right\rangle _{\mathrm{prior}} = 2N \frac{G^{\mathrm{prior}}}{H^{\mathrm{prior}}}, \end{aligned}$$
(36)
$$\begin{aligned}&G^{\mathrm{prior}} = \sum _{x_i} \sum _{x_j} \phi \left( x_i - x_j \right) f( x_i - x_j; \alpha ) \left( M^{\mathrm{prior}} \left( x_i \right) \right) ^3 \left( M^{\mathrm{prior}} \left( x_j \right) \right) ^3, \end{aligned}$$
(37)
$$\begin{aligned}&H^{\mathrm{prior}} = \sum _{x_i} \sum _{x_j} f( x_i - x_j; \alpha ) \left( M^{\mathrm{prior}} \left( x_i \right) \right) ^3 \left( M^{\mathrm{prior}} \left( x_j \right) \right) ^3, \end{aligned}$$
(38)

respectively, where \(M^{\mathrm{prior}} \left( x_i \right)\) is the message of the prior probability distribution. The message and the expectation of the prior probability distribution in Eqs. (34) and (36) can be computed in \(O \left( L \log L \right)\) computation time by the same methods as in Eqs. (25)–(26) and Eqs. (27)–(33), respectively.
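Under translational symmetry, a single message suffices for the entire prior, so the fixed point of Eq. (34) can be found by iterating one FFT-based update. A minimal sketch (illustrative names; the convergence constants are our own choices):

```python
import numpy as np

def prior_message(f_vals, L, max_iter=1000, tol=1e-4):
    """Fixed point of Eq. (34): the single message M^prior shared by all
    edges under the periodic boundary condition and translational symmetry.

    f_vals[d + L - 1] = exp(-alpha * phi(d)) for d = -(L-1), ..., L-1.
    """
    n = 3 * L - 2
    F = np.fft.rfft(f_vals, n)
    M = np.full(L, 1.0 / L)                    # uniform initialization
    for _ in range(max_iter):
        cav = M ** 3                           # cavity: three remaining neighbors
        new = np.fft.irfft(F * np.fft.rfft(cav, n), n)[L - 1 : 2 * L - 1]
        new /= new.sum()
        converged = np.abs(new - M).mean() < tol
        M = new
        if converged:
            break
    return M
```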

6 Numerical Experiments

In this section, we describe the numerical verification of the proposed method. We used the standard images in Fig. 3, which are widely used in the image processing research field. The pixel values of these images take \(L=256\) different values. All the experiments were implemented using C++ and were run single-threaded on an Ubuntu 18.04.1 LTS (64 bit) machine with an Intel Core i7-6850K CPU running at 3.60 GHz and 128 GB RAM. In this experiment, we defined the function \(\phi (x)\) as

$$\begin{aligned} \phi ( x ) = x^2, \end{aligned}$$
(39)

and set the parameters of the proposed method as follows. The initial parameter \(\alpha _0\) was set to 0.005, and \(\sigma _0^2\) was set to the sample variance calculated from Y. The maximum numbers of iterations \(T_{{\mathrm{EM}}}\) and \(T_{{\mathrm{BP}}}\) were set to 100 and 1000, respectively. We considered the messages to have converged when the absolute value of the average change in the messages was smaller than \(10^{-4}\); the same criterion was applied to the parameters \(\alpha\) and \(\sigma ^2\). We set the search interval of the bisection method for computing \(\alpha _{t+1}\) to \([0, 2\alpha _t]\).
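For concreteness, the K noisy observations in each trial can be generated as in the following sketch (the helper name and seed are our own illustrative choices):

```python
import numpy as np

def make_noisy_images(x, K, sigma, seed=0):
    """Generate K observations by adding AWGN of standard deviation sigma
    to the raster-scanned original image x (values in 0, ..., L-1)."""
    rng = np.random.default_rng(seed)
    return np.stack([x + rng.normal(0.0, sigma, size=x.shape)
                     for _ in range(K)])
```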

Fig. 3 Grayscale standard images used in the experiments (\(L=256\))

First, we compared the computation time of the proposed method with those of two previous methods: belief propagation with the FFT (BP-FFT) and a naive implementation (Naive). Naive is the naive implementation of Algorithm 1, whose worst-case computation time is \(O \left( T_{{\mathrm{EM}}} T_{{\mathrm{BP}}} N L^2 \right)\). BP-FFT is the method presented in [14], in which only the message updates in Eqs. (13) and (14) are sped up using the FFT. Tables 1 and 2 show the average computation times over 10 trials for each method, where the K noisy images Y were generated by adding AWGN with \(\sigma = 15\) to the original noiseless images. According to the results, the proposed method was faster than the other methods. It should be noted that the three methods differ only in the extent to which Algorithm 1 is implemented using the FFT; therefore, the image denoising results of these methods are identical.

Table 1 Average computation time over 10 trials with \(\sigma = 15\), \(K=1\), and image size \(128\times 128\)
Table 2 Average computation time over 10 trials with \(\sigma = 15\), \(K=5\), and image size \(128\times 128\)
Fig. 4 Average computation time and peak signal-to-noise ratio over 10 trials versus image size (\(128 \times 128\), \(256 \times 256\), and \(512 \times 512\)) for \(K=1\) and \(K=5\), with noise level \(\sigma = 15\). a Fig. 3a. b Fig. 3b. c Fig. 3c. d Fig. 3d

Fig. 5 Average peak signal-to-noise ratio over 10 trials against K for \(\sigma =15\) and \(\sigma =30\). a Fig. 3a. b Fig. 3b. c Fig. 3c. d Fig. 3d

Fig. 6 Examples of the image denoising results for Fig. 3a. a Example of a noisy image when \(\sigma =30\) (peak signal-to-noise ratio (PSNR): 18.61). b Denoised image obtained by the proposed method for \(K=1\) (PSNR: 25.49). c Denoised image obtained by the proposed method for \(K=5\) (PSNR: 30.33)

Figure 4 shows the average computation time versus image size for \(K=1\) and \(K=5\) over 10 trials, where the noise level of the AWGN was \(\sigma =15\). The computation time of our denoising method grows approximately linearly with the image size (it does not grow strictly linearly, because we break off the message and parameter updates according to the convergence criterion).

Figure 5 shows the denoising performance of the proposed method for various values of K under two levels of AWGN (\(\sigma =15\) and \(\sigma =30\)). We evaluated the performance according to the average peak signal-to-noise ratio (PSNR) over 10 trials. The PSNR is defined as

$$\begin{aligned} \hbox {PSNR} = 10 \log _{10} \frac{255^2}{\hbox {MSE}}, \end{aligned}$$
(40)

where MSE is the mean squared error between the original noiseless image and the inferred denoised image \(\widehat{\varvec{x}}\). Figure 5 confirms that the image denoising results improve as K increases. Examples of the image denoising results for Fig. 3a are shown in Fig. 6 for \(K=1\) and \(K=5\).
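Equation (40) corresponds to the following short computation (the helper name is illustrative):

```python
import numpy as np

def psnr(x_true, x_hat):
    """Peak signal-to-noise ratio of Eq. (40) for L = 256 (8-bit) images."""
    mse = np.mean((np.asarray(x_true, float) - np.asarray(x_hat, float)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```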

7 Concluding Remarks

In this paper, we defined a discrete MRF model for the Bayesian image denoising problem with multiple noisy images. We proposed a fast denoising algorithm that infers a denoised image in \(O \left( T_{{\mathrm{EM}}} T_{{\mathrm{BP}}} N L \log L \right)\) time using belief propagation and an EM algorithm based on our MRF model and the FFT. We numerically verified the proposed denoising method using standard images. The results show that the proposed algorithm inferred the denoised image faster than previous implementations that use belief propagation.

We believe that the proposed method is the fastest implementation of an image denoising algorithm based on a discrete MRF model that uses belief propagation and an EM algorithm. However, the method is not yet suitable for real-time processing. Therefore, we need to seek even more effective fast approximation methods that preserve the restoration quality of the discrete MRF model. In our experiments, we adopted the quadratic function as the form of \(\phi (x)\). However, the proposed method is not restricted to the quadratic function: we can apply it to other forms of \(\phi (x)\), such as the Huber prior [15] and the generalized sparse prior [16]. Moreover, because the method can be used for any discrete MRF with pairwise potentials of the form \(\phi ( x_i - x_j )\) for \(ij \in E\), we expect it to be applicable not only to image denoising but also to other inference problems such as sparse modeling [17, 18]. We intend to develop the method in these directions.