1 Introduction

The aim of this paper is to deal with noise in certain types of images arising in electron microscopy. A typical example from our applications of interest is depicted in Fig. 1. The displayed test image shows gold-sputtered polytetrafluoroethylene (PTFE) particles that were recorded using the secondary electron detector of a ZEISS DSM 962 scanning electron microscope.

Fig. 1. Noisy image from an electron microscope as an example of a typical application in our lab; image size is \(M \times N =2048 \times 1664\). For image acquisition, we used an accelerating voltage of 20 kV, a working distance of 6 mm and a magnification of 5000 with a ZEISS DSM 962 scanning electron microscope. The marked image region is employed later on for demonstrating filtering results.

Let us briefly discuss the properties of the images we deal with here, as exemplified by Fig. 1. Beginning with image content, we observe many relatively large smooth (and round) regions with strong edges at particle outlines and particle overlaps, but also relatively thin, elongated structures. Turning to the image disturbances, the image is characterised by random electron discharges appearing as bright, horizontally striped artefacts. To prevent sample destruction by electron impact during image acquisition, we had to choose a high scanning speed, which additionally resulted in a grainy texture.

Our Modelling Approach. Variational methods have been very successful for image denoising and restoration tasks [4]. Their main conceptual advantages are their flexibility with respect to the possible model components and their inherent robustness due to the use of suitable regularisers. Moreover, the components of a variational method can often be easily interpreted and fine-tuned. As the images in our applications may show various types of structures of potential interest, we opt for a variational approach because of this flexibility. As a further benefit, we may devise a model for enhancing specific image structures.

Let us now turn to the variational model. The task is to design the terms of a corresponding energy functional E(u) that is minimised to find the optimal solution \(u^*\). The usual design incorporates terms for data fidelity and regularisation, respectively, so that on a coarse level of abstraction the variational model reads:

$$\begin{aligned} E(u) :=R(u) + \tau S(u;u^p). \end{aligned}$$
(1)

Thereby, \(u^p\) denotes the given degraded input image, and \(\tau >0\) is a parameter defining the balance between data term \(S(u;u^p)\) and regulariser R(u).

Let us first consider the data term \(S(u;u^p)\). We may safely assume that Poisson noise is an adequate model for the noise in electron microscopy images \(u^p\) [18]. For the regulariser R(u) we need to take into account the types of prominent image structures. We choose an anisotropic regulariser that is able to preserve the strong edges and the thin, elongated structures prominent in our input images. Furthermore, exploiting the flexibility of the variational approach, we also add a total variation (TV) regulariser in order to enhance edges and achieve visually more plateau-like structures corresponding to individual particles in our application.

Related Work. As the subject of this paper touches various aspects of image processing, we can only attempt to cover here some of the most important works that influenced the considerations for the construction of our approach.

Several methods and approaches for Poisson denoising (PD) have been proposed in the literature. Often a variance stabilizing transformation (VST) is applied to transform Poisson noisy data into data with Gaussian noise. One such transformation was proposed by Anscombe in [1]. Methods to optimise the denoising with this transformation were presented by Mäkitalo and Foi in [13] and by Azzari and Foi in [2]. An extension of the Anscombe transformation and its application to PD was proposed by Zhang et al. [22]. Methods from the field of machine learning have also been considered, see e.g. Giryes and Elad [9] and Salmon et al. [17]. Poisson denoising methods based on multiscale models were studied by Lefkimmiatis et al. [11].

There have also been some variational methods for PD. Rudin et al. proposed a model for TV-based image denoising in [16]. This basic denoising model was customised for PD by Le et al. in [10]. As this model is technically relevant for our approach, let us give some more details. Since the data term naturally includes the formulation for Poisson noise, the data term in [10] resembles the one we will formulate, as is also the case in [7, 8]. However, we will introduce a slight technical modification that enables a better algorithmic treatment. In contrast to our work, Le et al. consider only TV regularisation. As another major difference, we make use of a modern, non-trivial numerical optimisation approach for the numerical realisation, whereas Le et al. solve the Euler-Lagrange equation of their functional with a simple finite difference method. Figueiredo et al. [8] employ, besides TV, also frame-based models for the regulariser; the arising optimisation problems are tackled with the alternating direction method of multipliers. The recent variational model of Feng et al. [7] follows the Fields of Experts methodology for devising the regulariser.

In summary, we introduce several alternatives in modelling and numerics compared to the works [7, 8, 10]. In particular, we consider a novel anisotropic model for the problem and make use of efficient numerical tools.

Turning to the use of anisotropic diffusion, Weickert presented an extensive framework in [20]. The anisotropic regulariser we employ here has been proposed before in the context of diffusion [20]. In [21], accurate numerics are given in the form of a family of specific finite difference schemes. We take these recent, dedicated approximations into account within our numerical framework.

2 Poisson Noise

Suppose we have a Poisson noisy image \(u^p\in \mathbb {N}_0^{MN}\) with \(\mathbb {N}_0:=\mathbb {N}\cup \left\{ 0\right\} \), which is the result of a random experiment with the probability distribution

$$\begin{aligned} \mathbf P (U=u^p|u^0):=\prod _{(i,j)\in I} P(U_{i,j}=u^p_{i,j}|u^0_{i,j})\,, \end{aligned}$$
(2)

where for every \((i,j)\in I:=\lbrace 1,\dots ,M\rbrace \times \left\{ 1,\dots ,N\right\} \) the parameter \(u_{i,j}^0\ge 0\) is supplied by the original image \(u^0\) and P is the discrete Poisson distribution

$$\begin{aligned} P(X=x|y):=\left\{ \begin{array}{ll} \dfrac{y^{x}}{x!}\exp \left( -y\right) , &\qquad y>0,\\ \delta _{\left\{ 0\right\} }(x), &\qquad y=0. \end{array} \right. \end{aligned}$$
(3)

Here, \(\delta _\mathscr {S}(z)\) is the Kronecker delta function that equals 1 if \(z\in \mathscr {S}\) and 0 otherwise.

For any \(u\in \mathbb {R}^{MN}\) we define the index sets \(I^+(u) :=\left\{ (i,j)\in I:u_{i,j}>0\right\} \), \(I^0(u) :=\left\{ (i,j)\in I:u_{i,j}=0\right\} \) and \(I^-(u) :=\left\{ (i,j)\in I:u_{i,j}<0\right\} \). While \(\mathbf P \) in (2) is defined only for nonnegative \(u^0\), the likelihood that any \(u\in \mathbb {R}^{MN}\) is the image that led to \(u^p\) can be computed by

$$\begin{aligned} \mathscr {P}(u|u^p):=\delta _{\left\{ \emptyset \right\} }(I^-(u))\prod _{(i,j)\in I\setminus I^-(u)}P(U_{i,j}=u^p_{i,j}|u_{i,j}). \end{aligned}$$
(4)

With the probability distribution (3) we immediately see that \(\mathscr {P}(u|u^p)>0\) if and only if \(I^-(u)=\emptyset \) and \(u_{i,j}>0\) for all \((i,j)\in I^+(u^p)\). Because of \(\lim _{x\rightarrow 0}x\log x=0\), the log-likelihood is given by

$$\begin{aligned} \begin{aligned} \mathscr {L}(u|u^p):=\log \left( \mathscr {P}(u|u^p)\right) =&\sum _{(i,j)\in I^+(u)}\left( u_{i,j}^p\log u_{i,j}-\log u_{i,j}^p ! - u_{i,j}\right) \\&-\gamma _{\left\{ \emptyset \right\} }(I^-(u))-\sum _{(i,j)\in I^0(u)}\gamma _{\lbrace 0\rbrace }(u_{i,j}^p), \end{aligned} \end{aligned}$$
(5)

where \(\gamma _\mathscr {S}(z)\) is the indicator function that equals 0 if \(z\in \mathscr {S}\) and \(\infty \) otherwise. For a given \(u^p\), maximizing \(\mathscr {L}(u|u^p)\) is equivalent to minimizing

$$\begin{aligned} \sum _{(i,j)\in I^+(u)}\left( u_{i,j}-u_{i,j}^p\log u_{i,j} \right) +\gamma _{\left\{ \emptyset \right\} }(I^-(u))+\sum _{(i,j)\in I^0(u)}\gamma _{\lbrace 0\rbrace }(u_{i,j}^p). \end{aligned}$$
(6)

Minimizing a data fidelity term of the form (6) is an approach to PD that was used, e.g., in [7, 8, 10].
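To make the role of the index sets concrete, the following sketch evaluates the data fidelity (6). It is our own illustration in Python/NumPy (the paper itself ships no code); infinite indicator values are returned as np.inf.

```python
import numpy as np

def poisson_data_term(u, u_p):
    """Evaluate the Poisson data fidelity (6).

    u   : candidate image (real-valued array)
    u_p : Poisson-noisy input image (non-negative integers)
    Returns np.inf whenever the likelihood (4) vanishes.
    """
    if np.any(u < 0):                  # I^-(u) non-empty: gamma term is infinite
        return np.inf
    if np.any(u_p[u == 0] > 0):        # gamma_{0}(u^p_ij) infinite on I^0(u)
        return np.inf
    pos = u > 0                        # I^+(u)
    return float(np.sum(u[pos] - u_p[pos] * np.log(u[pos])))
```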

For the noise model, sometimes a mixed Poisson distribution is also used, e.g. in [12]. We assume that the image \(u^p\) is affected by Poisson noise only; for a detailed overview of noise sources in scanning electron microscopy see [18].

3 Anisotropic Diffusion

As we rely in this paper on the variational formulation of anisotropic diffusion in the sense of Weickert [20], let us briefly recall some details.

The anisotropic diffusion flow within the image plane can be described over time by the partial differential equation (PDE) \(\partial _t u = \mathrm {div}\left( D\nabla u\right) \). Thereby, D denotes the diffusion tensor with \(D=v_1v_1^\top +\lambda _2v_2v_2^\top \), where the vectors \(v_1\parallel \nabla u^\sigma \) and \(v_2\perp \nabla u^\sigma \) are of unit length and \(\lambda _2 = g( \left\| \nabla u^\sigma \right\| ^2)\), see also [20]. The function \(g(\cdot )\) denotes the diffusivity. By \(u^\sigma \) we denote the convolution of u with a Gaussian of standard deviation \(\sigma \). For the spatial derivatives in \(\nabla u^\sigma \) we use central differences.
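As a minimal sketch of how the diffusion tensor can be assembled in practice, consider the following Python/NumPy fragment (our own illustration; it assumes SciPy for the Gaussian convolution and uses periodic boundaries via np.roll, a detail the text above does not specify):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def diffusion_tensor(u, sigma, lam, g):
    """Assemble the entries a, b, c of the diffusion tensor (8),
    following the convention above: eigenvalue 1 along v1 || grad(u_sigma),
    eigenvalue lambda_2 = g(|grad u_sigma|^2) along v2 perp grad(u_sigma)."""
    u_s = gaussian_filter(u, sigma)
    # central differences for the spatial derivatives
    ux = 0.5 * (np.roll(u_s, -1, axis=1) - np.roll(u_s, 1, axis=1))
    uy = 0.5 * (np.roll(u_s, -1, axis=0) - np.roll(u_s, 1, axis=0))
    mag2 = ux ** 2 + uy ** 2
    lam2 = g(mag2, lam)
    norm = np.sqrt(mag2)
    flat = norm < 1e-12
    n1 = np.where(flat, 0.0, ux / np.where(flat, 1.0, norm))
    n2 = np.where(flat, 0.0, uy / np.where(flat, 1.0, norm))
    a = n1 ** 2 + lam2 * n2 ** 2        # D = v1 v1^T + lam2 * v2 v2^T
    b = (1.0 - lam2) * n1 * n2
    c = n2 ** 2 + lam2 * n1 ** 2
    # where the gradient vanishes the eigendirections are arbitrary;
    # g(0) = 1 for all diffusivities of Sect. 4, so D is the identity there
    a, c = np.where(flat, 1.0, a), np.where(flat, 1.0, c)
    return a, b, c
```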

In [21] a discretisation of the term \(\mathrm {div}\left( D\nabla u\right) \) is obtained by considering the energy

$$\begin{aligned} R(u):=\frac{1}{2}\int _{\varOmega }\nabla ^\top u D \nabla u{\text {d}} x{\text {d}} y\,, \end{aligned}$$
(7)

where D is in this notation the time-invariant diffusion tensor

$$\begin{aligned} D= \begin{pmatrix} a(x,y) & b(x,y) \\ b(x,y) & c(x,y) \end{pmatrix}. \end{aligned}$$
(8)

Following [21] we remark that (7) can be discretised and written as

$$\begin{aligned} R(u):=-\frac{1}{2}u^\top A u\qquad \text {and}\qquad \nabla R(u)=-Au \end{aligned}$$
(9)

with some negative semidefinite matrix A. An explicit time discretisation can be computed by iterating \(u^{(l+1)}=u^{(l)}+\alpha A(u^{(l)})u^{(l)}\), where \(\alpha \) is a suitable step size and \(A(u^{(l)})\) is the matrix A that stems from the diffusion tensor evaluated at the iterate \(u^{(l)}\), see [21] for more details. Similarly, we employ a lagged diffusivity approach in our optimisation method and always evaluate A at the given iterate.
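For illustration, a matrix-free application of A and the explicit step might look as follows; note that this simple central-difference stencil is merely a stand-in for the dedicated scheme family of [21] that we actually use:

```python
import numpy as np

def div_D_grad(u, a, b, c):
    """Naive central-difference discretisation of div(D grad u) with
    periodic boundaries; a stand-in for the stencils of [21]."""
    dx = lambda v: 0.5 * (np.roll(v, -1, axis=1) - np.roll(v, 1, axis=1))
    dy = lambda v: 0.5 * (np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0))
    ux, uy = dx(u), dy(u)
    j1 = a * ux + b * uy               # flux in x-direction
    j2 = b * ux + c * uy               # flux in y-direction
    return dx(j1) + dy(j2)             # matrix-free application of A u

def explicit_step(u, a, b, c, alpha):
    """One explicit time step u^(l+1) = u^(l) + alpha * A(u^(l)) u^(l)."""
    return u + alpha * div_D_grad(u, a, b, c)
```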

4 Our Variational Approach

As indicated, we aim to find a local minimiser \(u^*\in \mathbb {R}^{MN}\) of

$$\begin{aligned} E(u):=R(u)+\tau S(u;u^p), \end{aligned}$$
(10)

see (1). With E we want to combine anisotropic diffusion with a data fidelity \(S(\;\cdot \;;u^p)\) customised for a certain noise model. Among other models that we also test here, we especially consider the log-likelihood data term developed in (5). In total, we use for \(S:=S(u;u^p)\) the following functions:

$$\begin{aligned} S_2&:=\frac{1}{2} \left\| u-u^p \right\| ^2_2\,,\qquad S_1:= \left\| u-u^p \right\| _1, \end{aligned}$$
(11)
$$\begin{aligned} S_\ell &:=\sum _{(i,j)\in I^+(u)}\left( u_{i,j}-u_{i,j}^p\log u_{i,j} \right) +\gamma _{\left\{ \emptyset \right\} }(I^-(u))+\sum _{(i,j)\in I^0(u)}\gamma _{\lbrace 0\rbrace }(u_{i,j}^p). \end{aligned}$$
(12)

For the diffusivity \(g:=g(s^2)\), which contributes to R through the eigenvalue \(\lambda _2\), we consider the following possibilities:

$$\begin{aligned} g_{\text {PM}}&:=\frac{1}{1+s^2/\lambda ^2},\qquad&g_{\text {ePM}}&:=\exp \left( -s^2/(2\lambda ^2)\right) , \end{aligned}$$
(13)
$$\begin{aligned} g_{\text {Ch}}&:=\frac{1}{\sqrt{1+s^2/\lambda ^2}},\qquad&g_{\text {W}}&:=\left\{ \begin{array}{ll} 1, &\qquad s^2=0,\\ 1-\exp \left( \dfrac{-3.31488}{(s/\lambda )^8}\right) , &\qquad s^2>0. \end{array} \right. \end{aligned}$$
(14)

For the Perona-Malik diffusivity \(g_{\text {PM}}\) and the exponential Perona-Malik diffusivity \(g_{\text {ePM}}\) see [15], for the Charbonnier diffusivity \(g_{\text {Ch}}\) see [5], and for the Weickert diffusivity \(g_{\text {W}}\) see [20]. Each diffusivity depends on the contrast parameter \(\lambda \). While the impact of \(\lambda \) on the diffusion process differs between the diffusivities, generally speaking, edges with a contrast below \(\lambda \) are smoothed out more strongly than those with a contrast above \(\lambda \).
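In code, the four diffusivities (13)-(14) translate directly; this Python transcription expects the squared gradient magnitude s2 as an array:

```python
import numpy as np

def g_pm(s2, lam):        # Perona-Malik [15]
    return 1.0 / (1.0 + s2 / lam ** 2)

def g_epm(s2, lam):       # exponential Perona-Malik [15]
    return np.exp(-s2 / (2.0 * lam ** 2))

def g_ch(s2, lam):        # Charbonnier [5]
    return 1.0 / np.sqrt(1.0 + s2 / lam ** 2)

def g_w(s2, lam):         # Weickert [20]; note (s/lam)^8 = (s2/lam^2)^4
    out = np.ones_like(s2, dtype=float)
    nz = s2 > 0
    out[nz] = 1.0 - np.exp(-3.31488 / (s2[nz] / lam ** 2) ** 4)
    return out
```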

5 A Numerical Solution Strategy

Our variational model \(E(u) :=R(u) + \tau S\left( u; u^{p}\right) \) requires the minimisation of a non-convex and non-smooth cost function. A common choice to handle the non-convexity is to embed our method into a lagged diffusivity fixed-point iteration, thereby fixing the diffusivities at the previous iterate and thus considering

$$\begin{aligned} R\left( u; \hat{u}^{(l)} \right) :=-\frac{1}{2} u^{\top } A\left( \hat{u}^{(l)} \right) u. \end{aligned}$$
(15)

Now, the minimisation of \(R\left( \;\cdot \;\right) + \tau S\left( \;\cdot \;;\; u^{p}\right) \) comes down to a series of convex optimisation tasks. \(R(\;\cdot \;;\; \hat{u}^{(l)})\) is smooth and convex, although neither strictly nor strongly convex. The data term \(S_{2}\left( \;\cdot \;; u^{p}\right) \) is convex and smooth, whereas \(S_{1}\left( \;\cdot \;; u^{p}\right) \) and \(S_{\ell }\left( \;\cdot \;; u^{p}\right) \) are convex but at best continuous. Our convex energies \(R\left( \;\cdot \;;\; \hat{u}^{(l)}\right) + \tau S\left( \;\cdot \;;\; u^{p}\right) \) also suggest a natural splitting into a sum of two terms, namely \(R\left( \;\cdot \;;\; \hat{u}^{(l)} \right) \) and \(\tau S\left( \;\cdot \;;\; u^{p} \right) \). One of our goals is to present a single concise but modular framework that is flexible enough to handle all presented setups efficiently. To this end we propose to use the inertial proximal algorithm for nonconvex optimization (iPiano) by Ochs et al. [14]. In its most generic form it considers the minimisation of cost functions \(f + g\), where g is a convex (possibly non-smooth) and f a smooth (possibly non-convex) function. Thus, it suits our requirements well. The algorithm, inspired by the Heavy-ball method of Polyak, combines a forward-backward splitting scheme with an additional inertial force to improve the convergence properties. In its generic form it iterates

$$\begin{aligned} x^{(k+1)} = {{\mathrm{prox}}}_{\alpha _k g} \left( x^{(k)} - \alpha _{k} \nabla f\left( x^{(k)} \right) + \beta _{k} \left( x^{(k)} - x^{(k-1)} \right) \right) , \end{aligned}$$
(16)

where \(\alpha _{k}\) and \(\beta _{k}\) are parameters that stem from the numerical scheme. Here, the function \({{\mathrm{prox}}}_{\alpha g}\) denotes the proximal operator given by

$$\begin{aligned} {{\mathrm{prox}}}_{\alpha g}\left( y \right) :=\mathop {{{\mathrm{arg\,min}}}}\limits _{x} \left( \frac{1}{2} \left\| x-y \right\| ^{2} + \alpha g\left( x \right) \right) . \end{aligned}$$
(17)

Applying the iPiano formalism to our setup in (10) yields the following iterative strategy:

$$\begin{aligned} u^{(k+1)} = {{\mathrm{prox}}}_{\alpha _{k}\tau S\left( \;\cdot \;; u^{p} \right) } \left( u^{(k)} - \alpha _{k} \nabla R\left( u^{(k)};\hat{u}^{(l)} \right) + \beta _{k} \left( u^{(k)} - u^{(k-1)} \right) \right) \end{aligned}$$
(18)

with initial values \(u^{(0)} = u^{(1)}\) and certain parameters \(\alpha _k\) and \(\beta _k\).

Whether the iPiano algorithm is fast hinges on an efficient evaluation of the proximal mapping. For our choices of \(S_{1}\), \(S_{2}\), and \(S_{\ell }\) the corresponding proximal mappings are well known and can be expressed in closed form. The proximal mapping of \(S_{1}\) is given by the soft shrinkage operation, while it corresponds to a simple averaging for \(S_{2}\). Finally, the proximal mapping for \(S_{\ell }\) is easily found by setting the first derivative to 0. We also remark that it suffices in each case to consider the 1D formulation, since the computation of the proximal mapping decouples pixelwise for all our choices of S.
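The three mappings can be written down directly; the following NumPy sketch (the naming is ours) evaluates \({{\mathrm{prox}}}_{w S}\) elementwise, where in (18) the weight is \(w = \alpha _k \tau \):

```python
import numpy as np

def prox_S1(y, u_p, w):
    """Soft shrinkage: prox of w * ||u - u^p||_1."""
    d = y - u_p
    return u_p + np.sign(d) * np.maximum(np.abs(d) - w, 0.0)

def prox_S2(y, u_p, w):
    """Averaging: prox of (w/2) * ||u - u^p||_2^2."""
    return (y + w * u_p) / (1.0 + w)

def prox_Sl(y, u_p, w):
    """Prox of w * S_l: setting the derivative of the 1D objective
    (1/2)(x - y)^2 + w (x - u^p log x) to zero yields the quadratic
    x^2 - (y - w) x - w u^p = 0; we take its non-negative root."""
    t = y - w
    return 0.5 * (t + np.sqrt(t * t + 4.0 * w * u_p))
```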

Besides the computation of the proximal mapping we must also specify an update strategy for the free parameters \(\alpha _{k}\) and \(\beta _{k}\) in (18) (resp. (17)). Here, we follow the recommendations in [14] and set \(\beta _{k} = \frac{4}{5}\) and \(\alpha _{k} = \frac{2\left( 1-\beta _{k}\right) }{L+c}\), where \(c = \frac{1}{1000}\) and L is an upper bound on the spectral norm of \(A(\hat{u}^{(l)})\), that is, the Lipschitz constant of \(\nabla R(\;\cdot \;;\; \hat{u}^{(l)})\). For performance reasons we opt to estimate L by means of the Geršgorin circle theorem. A detailed listing of our approach is given in Algorithm 1.

Algorithm 1.
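As a rough reconstruction of Algorithm 1 in Python (our own sketch: loop bounds are illustrative, the stopping criteria (19)-(20) are omitted for brevity, and L is estimated by power iteration to stay matrix-free, whereas the paper uses the Geršgorin bound on the assembled matrix):

```python
import numpy as np

def algorithm1(u_p, g, lam, tau, sigma, prox_S, n_outer=50, n_inner=200):
    """Sketch of Algorithm 1: lagged diffusivity fixed point around iPiano."""
    u_hat = u_p.astype(float)
    beta = 0.8                                    # beta_k = 4/5
    for l in range(n_outer):
        a, b, c = diffusion_tensor(u_hat, sigma, lam, g)
        # estimate L >= ||A(u_hat)||_2 via power iteration (matrix-free
        # substitute for the Gershgorin estimate used in the paper)
        v = np.random.rand(*u_p.shape)
        for _ in range(20):
            v = div_D_grad(v, a, b, c)
            v /= np.linalg.norm(v) + 1e-12
        L = np.linalg.norm(div_D_grad(v, a, b, c))
        alpha = 2.0 * (1.0 - beta) / (L + 1e-3)   # c = 1/1000
        u_prev = u = u_hat                        # u^(0) = u^(1)
        for k in range(n_inner):                  # iPiano updates (18);
            # grad R = -A u, hence the forward step adds +alpha * A u
            y = u + alpha * div_D_grad(u, a, b, c) + beta * (u - u_prev)
            u_prev, u = u, prox_S(y, u_p, alpha * tau)
        u_hat = u                                 # lagged diffusivity update
    return u_hat
```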

We also use the following stopping criteria in our implementation. The outer iteration stops when \(\hat{u}^{(l+1)}\) fulfils either

$$\begin{aligned} \frac{ |{E_{l+1}(\hat{u}^{(l+1)})-E_l(\hat{u}^{(l)})}|}{ |{E_{l+1}(\hat{u}^{(l+1)})+E_l(\hat{u}^{(l)})}|} < 5\cdot 10^{-7} \end{aligned}$$
(19)

or

$$\begin{aligned} \frac{1}{MN} \left\| \hat{u}^{(l+1)}-\hat{u}^{(l)} \right\| _1 < 10^{-6} \left\| u^p \right\| _\infty , \end{aligned}$$
(20)

where \(E_l\) denotes the function E with the diffusion tensor fixed at the iterate \(\hat{u}^{(l)}\). For the iPiano algorithm we use nearly the same stopping criteria; we substitute \(\hat{u}^{(l+1)}\) with \(u^{(k+1)}\), \(\hat{u}^{(l)}\) with \(u^{(k)}\), and \(E_{l+1}\) with \(E_l\).
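For concreteness, the two criteria read as follows in code (our own helper; E_new and E_old denote the energies \(E_{l+1}(\hat{u}^{(l+1)})\) and \(E_l(\hat{u}^{(l)})\)):

```python
import numpy as np

def outer_converged(E_new, E_old, u_new, u_old, u_p):
    """Stopping criteria (19) and (20) for the outer iteration."""
    rel_energy = abs(E_new - E_old) / abs(E_new + E_old)   # (19)
    mean_change = np.abs(u_new - u_old).mean()             # (1/MN)||.||_1 as in (20)
    return rel_energy < 5e-7 or mean_change < 1e-6 * np.abs(u_p).max()
```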

6 Additional Model Improvements

Even though our approach already delivers convincing results, we investigate further means to improve the quality of our findings. In the ideal case, our real-world data has hard edges with almost constant regions in between. Such results can, for example, be obtained by adding a second regulariser in the form of a TV term. Thus, we must minimise the following energy:

$$\begin{aligned} F(u):=R(u)+\tau S(u\;;\; u^p) + \kappa \mathrm {TV}(u) \end{aligned}$$
(21)
Algorithm 2.

Here, \(\mathrm {TV}(u)\) denotes the discretised form of the total variation,

$$\begin{aligned} \mathrm {TV}(u) :=\sum _{(i,j)\in I} \left\| \begin{pmatrix} d_{x} u_{i,j} \\ d_{y} u_{i,j} \end{pmatrix} \right\| _{2}, \end{aligned}$$
(22)

where \(d_{x}\) and \(d_{y}\) denote standard forward differences. To find a minimiser \(u^{*}\) we proceed as before: a lagged diffusivity fixed-point iteration overcomes the nonlinearities in the regulariser R and again yields a series of convex problems. The difference lies in the fact that we must now minimise a sum of three terms. Even though one could apply iPiano, we opt for a better adapted algorithmic approach and use the three-operator splitting scheme presented in [6] to handle the additional TV term. It requires us to evaluate the proximal mapping of the TV term

$$\begin{aligned} {{\mathrm{prox}}}_{\alpha \mathrm {TV}}\left( \tilde{u}\right) = \mathop {{{\mathrm{arg\,min}}}}\limits _{u} \left( \frac{ \left\| u-\tilde{u} \right\| _2^2}{2} + \alpha \sum _{(i,j)\in I} \left\| \begin{pmatrix} d_{x} u_{i,j} \\ d_{y} u_{i,j} \end{pmatrix} \right\| _{2} \right) . \end{aligned}$$
(23)

In this work, we use the algorithm of Chambolle and Pock [3]. It is appealing due to its simple form and decent efficiency. The complete algorithm, including the numerical strategy with the three-operator splitting, is listed in Algorithm 2.
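A compact sketch of one lagged-diffusivity stage of Algorithm 2 is given below (our own reconstruction in Python; it reuses div_D_grad and the proximal mappings from the earlier sketches, assumes a step size \(\alpha < 2/L\), and realises \({{\mathrm{prox}}}_{\alpha \mathrm {TV}}\) by a fixed number of Chambolle-Pock iterations instead of the adaptive criterion (24) below):

```python
import numpy as np

def prox_tv(u_tilde, w, n_iter=100):
    """prox of w * TV, cf. (23), via the primal-dual algorithm of
    Chambolle and Pock [3]; forward differences, Neumann boundaries."""
    def K(v):                                    # discrete gradient
        gx = np.zeros_like(v); gx[:, :-1] = v[:, 1:] - v[:, :-1]
        gy = np.zeros_like(v); gy[:-1, :] = v[1:, :] - v[:-1, :]
        return gx, gy
    def Kt(gx, gy):                              # adjoint of K (= -div)
        d = gx.copy(); d[:, 1:] -= gx[:, :-1]
        e = gy.copy(); e[1:, :] -= gy[:-1, :]
        return -(d + e)
    st = 1.0 / np.sqrt(8.0)                      # primal/dual steps, st^2*||K||^2 <= 1
    u = u_tilde.copy(); u_bar = u.copy()
    px = np.zeros_like(u); py = np.zeros_like(u)
    for _ in range(n_iter):
        gx, gy = K(u_bar)
        px, py = px + st * gx, py + st * gy
        mag = np.maximum(1.0, np.sqrt(px**2 + py**2) / w)  # project |p_ij| <= w
        px, py = px / mag, py / mag
        u_old = u
        u = (u - st * Kt(px, py) + st * u_tilde) / (1.0 + st)
        u_bar = 2.0 * u - u_old
    return u

def algorithm2_stage(u_p, a, b, c, tau, kappa, prox_S, alpha, n_iter=200):
    """Davis-Yin three-operator splitting [6] for R + tau*S + kappa*TV,
    with the diffusion tensor entries (a, b, c) held fixed (lagged)."""
    z = u_p.astype(float)
    for _ in range(n_iter):
        x = prox_S(z, u_p, alpha * tau)          # backward step on tau*S
        grad_R = -div_D_grad(x, a, b, c)         # forward step on smooth R
        y = prox_tv(2.0 * x - z - alpha * grad_R, alpha * kappa)
        z = z + y - x                            # three-operator update
    return x
```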

Here, we use the same stopping criteria as for Algorithm 1, see (19) and (20). In addition, we abort the iterative algorithm for computing \({{\mathrm{prox}}}_{\alpha \mathrm {TV}}(\tilde{u})\) when the following condition is met:

$$\begin{aligned} \frac{1}{MN} \left\| u^{(m+1)}-u^{(m)} \right\| _1 < 10^{-6} \left( \frac{9}{10} \right) ^l \left\| u^p \right\| _\infty \end{aligned}$$
(24)

Here, l corresponds to the loop counter for the lagged diffusivity scheme. By tightening the convergence requirements while progressing through the outer iteration of Algorithm 2, we aim to shorten computing time at the start and get a better final result. It seems plausible that the first few iterates can be rather rough, whereas accurate estimates are much more important towards the end.

Fig. 2. (a–c) Noise-free original versions of our considered test images. The numbers in brackets indicate the size of the square images. (d) Exemplary noisy version of ramp. The lowercase index states the peak value to which the original image was scaled before computing the Poisson noisy image. The noisy image has a range going from 0 to 19.

7 Numerical Results

We use three different synthetic test images for a qualitative analysis of our methods. These images are chosen such that they contain features commonly occurring in microscopic images. The considered samples are depicted in Fig. 2. For our evaluation we corrupt each image with Poisson noise of three different strengths, yielding a total of nine noisy images. One of them is shown in Fig. 2 in an exemplary manner. In order to simulate different noise levels, we rescale the images such that their peak value equals 255, 10, and 1, respectively, before computing the sample of the probability distribution (2). In the last case the noise is fairly strong and the features of the original image are barely visible. We use the mean squared error (MSE) and the structural similarity (SSIM) to measure the quality of our reconstructions (see [19] for a reference on the SSIM). In addition, we consider the methods proposed in [2] and [9] for comparison. The authors of these works provide reference implementations of their algorithms, allowing an objective evaluation.
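The noisy test data can be generated as described above; a minimal NumPy sketch (the function name is ours), where e.g. make_noisy(ramp, 10) would produce an instance of \(\texttt {ramp}_{10}\):

```python
import numpy as np

def make_noisy(u0, peak, seed=0):
    """Rescale the clean image u0 so that its peak value equals `peak`,
    then draw one Poisson sample per pixel, cf. (2)."""
    rng = np.random.default_rng(seed)
    scaled = u0.astype(float) * (peak / u0.max())
    return rng.poisson(scaled)
```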

Fig. 3. (a) Denoised image obtained with the method from [2] for \(\texttt {ramp}_{10}\) (SSIM\(\,=\,0.7845\)); (b) denoised image obtained with the method from [9] for \(\texttt {ramp}_{10}\) (SSIM\(\,=\,0.7437\)); (c) result of Algorithm 1 optimised for maximal SSIM with \(S_\ell \), \(g_{\text {ePM}}\), \(\lambda \approx 0.2446\), \(\tau \approx 0.2443\) for \(\texttt {ramp}_{10}\) (SSIM\(\,=\,0.7817\)); (d) result of Algorithm 2 optimised for maximal SSIM with \(S_1\), \(g_{\text {W}}\), \(\lambda \approx 10.39\), \(\tau \approx 96.74\), \(\kappa \approx 226.2\) for \(\texttt {ramp}_{10}\) (SSIM\(\,=\,0.6552\))

Table 1. MSE and SSIM for all tested images. Numbers in bold show the best result of all considered methods. The index after the image name denotes the peak value to which the image was scaled for the computation of the noisy image. Our approach outperforms the reference algorithms in terms of MSE for bars and ramp. Also, we obtain a higher SSIM for bars and circles.
Fig. 4. (a) Original noisy image, cut out from Fig. 1; (b) result of the VST method from [2]; (c) result of Algorithm 1 with \(S_\ell \), \(g_{\text {W}}\), \(\lambda =1.25\) and \(\tau =2.5\); (d) result of Algorithm 2 with \(S_\ell \), \(g_{\text {W}}\), \(\lambda =1.25\), \(\tau =2.5\) and \(\kappa =0.5\)

Table 1 presents our findings for all considered images. Figure 3 shows one denoised image for every considered algorithm. The results of Algorithms 1 and 2 were achieved by thoroughly tuning all model parameters to minimise the MSE or maximise the SSIM, respectively. The findings in Table 1 indicate that we achieve competitive results in most cases. If we focus solely on the SSIM, our results are notably better (0.8226 vs. 0.6563 and 0.6912, respectively, for \(\texttt {circles}_{10}\)).

Finally, Fig. 4 presents the results of our algorithms on a real-world example. The images depict gold-sputtered PTFE particles recorded by the secondary electron detector of a ZEISS DSM 962 scanning electron microscope at an accelerating voltage of 20 kV. We used a working distance of 6 mm and a magnification of 5000. Since the ground truth is unknown, the parameters have been tuned to obtain the most visually appealing result. Clearly, we are able to remove noise and other artefacts while preserving the contrast and sharp edges.

8 Conclusion

We have shown that modern optimisation methods and discretisation schemes can be applied with benefit in electron microscopy. Although our basic anisotropic model is relatively simple, it already yields visually convincing results in our test application. The flexible variational framework enables many possible extensions for specific applications. Future work may especially include a thorough study of optimisation methods for the optimal balancing of data term and regularisers.