Convergent numerical approximation of the stochastic total variation flow

We study the stochastic total variation flow (STVF) equation with linear multiplicative noise. By considering a limit of a sequence of regularized stochastic gradient flows with respect to a regularization parameter $\varepsilon$ we obtain the existence of a unique variational solution of the STVF equation which satisfies a stochastic variational inequality. We propose an energy preserving fully discrete finite element approximation for the regularized gradient flow equation and show that the numerical solution converges to the solution of the unregularized STVF equation. We perform numerical experiments to demonstrate the practicability of the proposed numerical approximation. This paper contains a mistake: in the proof of Lemma 4.4 the last inequality is not valid. Meanwhile, this mistake has been fixed in [6] for a slightly modified numerical approximation in spatial dimension $d=1$. For $d\geq 1$ the validity of the estimate in Lemma 4.4 is still open but the convergence of the numerical approximation can be shown by a different approach, see [5].


Introduction
We study numerical approximation of the stochastic total variation flow (STVF) dX = div ∇X |∇X| dt − λ(X − g) dt + X dW, in (0, T ) × O, where O ⊂ R d , d ≥ 1 is a bounded, convex domain with a piecewise C 2 -smooth boundary ∂O, and λ ≥ 0, T > 0 are constants. We assume that x 0 , g ∈ L 2 and consider a one dimensional real-valued Wiener process W , for simplicity; generalization for a sufficiently regular trace-class noise is straightforward. Equation (1) can be interpreted as a stochastically perturbed gradient flow of the penalized total variation energy functional The minimization of above functional, so-called ROF-method, is a prototypical approach for image denoising, cf. [15]; in this context the function g represents a given noisy image and λ serves as a penalization parameter. Further applications of the functional include, for instance, elastoplasticity and the modeling of damage and fracture, for more details see for instance [4] and the references therein.
The use of stochastically perturbed gradient flows has proven useful in image processing. Stochastic numerical methods for models with nonconvex energy functionals are able to avoid local energy minima and thus achieve faster convergence and/or more accurate results than their deterministic counterparts; see [12] which applies stochastic level-set method in image segmentation, and [16] which uses stochastic gradient flow of a modified (non-convex) total variation energy functional for binary tomography.
Due to the singular character of total variation flow (1), it is convenient to perform numerical simulations using a regularized problem with a regularization parameter ε > 0. In the deterministic setting (W ≡ 0) equation (3) corresponds to the gradient flow of the regularized energy functional It is well-known that the minimizers of the above regularized energy functional converge to the minimizers of (2) for ε → 0, cf. [9] and the references therein.
Owing to the singular character of the diffusion term in (1) the classical variational approach for the analysis of stochastic partial differential equations (SPDEs), see e.g. [13], [14], is not applicable to this problem. To study well-posedeness of singular gradient flow problems it is convenient to apply the solution framework developed in [3] which characterizes the solutions of (1) as stochastic variational inequalities (SVIs). In this paper, we show the well posedness of SVI solutions using the practically relevant regularization procedure (3) which, in the regularization limit, yields a SVI solution in the sense of [3]. Throughout the paper, we will refer to the solutions which satisfy a stochastic variational inequality as SVI solutions, and to the classical SPDE solutions as variational solutions. Convergence of numerical approximation of (3) in the deterministic setting (W ≡ 0) has been shown in [9]. Analogically to the deterministic setting, we construct an implementable finite element approximation of the problem (1) via the numerical discretization of the regularized problem (3). The scheme is implicit in time and preserves the gradient structure of the problem, i.e., it satisfies a discrete energy inequality. The deterministic variational inequality framework used in the the numerical analysis of [9] is not directly transferable to the stochastic setting. Instead, we show the convergence of the proposed numerical approximation of (3) to the SVI solution of (1) via an additional regularization step on the discrete level. The convergence analysis of the discrete approximation is inspired by the analytical approach of [10] where the SVI solution concept was applied to the stochastic p-Laplace equation. As far as we are aware, the present work is the first to show convergence of implementable numerical approximation for singular stochastic gradient flows in the framework of stochastic variational inequalities.
The paper is organized as follows. In Section 2 we introduce the notation and state some auxiliary results. The existence of a unique SVI solution of the regularized stochastic TV flow (3) and its convergence towards a unique SVI solution of (1) for ε → 0 is shown in Section 3. In Section 4 we introduce a fully discrete finite element scheme for the regularizared problem (3) and show its convergence to the SVI solution of (1). Numerical experiments are presented in Section 5.

Notation and preliminaries
Throughout the paper we denote by C a generic positive constant that may change from line to line. For 1 ≤ p ≤ ∞, we denote by (L p , · L p ) the standard spaces of p-th order integrable functions on O, and use · := · L 2 and (·, ·) := (·, ·) L 2 for the L 2 -inner product. For k ∈ N we denote the usual Sobolev space on O as (H k , · H k ), and (H 1 0 , · H 1 0 ) stands for the H 1 space with zero trace on ∂O with its dual space (H −1 , · H −1 ). Furthermore, we set ·, · := ·, · H −1 ×H 1 0 , where ·, · H −1 ×H 1 0 is the duality pairing between H 1 0 and H −1 . The functional (4) with λ = 0 will be denoted as J ε := J ε,0 . We say that a mapping For the convenience of the reader we state some basic definitions below. Furthermore we define the Yosida approximation of A as Definition 2.2. The mapping P m : defines the L 2 -orthogonal projection onto V m .
is finite. The space of functions of bounded variations is denoted by BV (O).
The following proposition plays an important role in the analysis below; the proposition holds for convex domains with piecewise smooth boundary, which includes the case of practically relevant polygonal domains, cf. [3, Proposition 8.2 and Remark 8.1].
be a bounded domain with a piecewise C 2 -smooth and convex boundary. Let g : [0, ∞) → [0, ∞) be a continuous and convex function of at most quadratic growth such that g(0) = 0, then it holds

Well posedness of STVF
In this section we show existence and uniques of the SVI solution of (1) (see below for a precise definition) via a two-level regularization procedure. To be able to treat problems with L 2 -regular data, i.e., We introduce a regularization of (3) as dX ε,δ n =δ∆X ε,δ n dt + div where δ > 0 is an additional regularization parameter. We define the operator A ε,δ : and note that (7) is equivalent to X ε,δ n (0) = x n 0 . The operator A ε,δ is coercive, demicontinuos and satisfies (cf. [14,Remark 4 The following monotonicity property, which follows from the convexity of the function | · | 2 + ε 2 , will be used frequently in the subsequent arguments The existence and uniqueness of a variational solution of (7) is established in the next lemma; we note that the result only requires L 2 -regularity of data.
In next step, we show a priori estimate for the solution of (7) in stronger norms; the estimate requires H 1 0 -regularity of the data. Lemma 3.2. Let x n 0 ∈ L 2 (Ω, F 0 ; H 1 0 ), g n ∈ H 1 0 . There exists a constant C ≡ C(T ) such that for any ε, δ > 0 the corresponding variational solution X ε,δ n of (7) satisfies E sup be an orthonormal basis of eigenfunctions of the Dirichlet Laplacian −∆ on L 2 and V m := span{e 0 , . . . , e m }. Let P m : For fixed ε, δ, n the Galerkin approximation X ε,δ n,m ∈ V m of X ε,δ n satisfies dX ε,δ n,m =δ∆X ε,δ n,m dt + P m div X ε,δ n,m (0) =P m x n 0 .
For fixed n 1 , n 2 , ε we get from (21) using (22) by the lower-semicontinuity of norms that In the next step, we show that the limiting process X ε is a SVI solution of (3). We subtract the process The Itô formula implies We rewrite the second term on the right-hand side in above inequality as The convexity of J ε along with the Cauchy-Schwarz and Young's inequalities imply that By combining two inequalities above with (25) we get Since X ε,δ n ∈ H 1 0 and Z ∈ H 1 0 it holds that J ε,λ (X ε,δ n ) =J ε,λ (X ε,δ n ) and J ε,λ (Z) =J ε,λ (Z). The lower-semicontinuity ofJ ε,λ in BV (O) with respect to convergence in L 1 , cf. [1], and (22), (24) and the strong convergence g n → g in L 2 imply that for δ → 0 and n → ∞ the limiting process X ε ∈ L 2 (Ω; C([0, T ]; L 2 ) satisfies (18).
To conclude that X ε is a SVI solution of (3) it remains to show that X ε ∈ L 1 (Ω; (17)) yields On noting that (cf. Definition 2.3 or [9, proof of Theorem 1.3]) Hence, by the Tonelli and Gronwall lemmas it follows that In the next step we show the uniqueness of the SVI solution. Let X ε 1 , X ε 2 be two SVI solutions to (3) for a fixed ε ∈ (0, 1] with initial values The term III is estimated using Young's inequality as We have to show the following estimate We consider an approximating sequence , the second boundary integral vanishes. We estimate the first boundary integral as in Ω × (0, T ). We further assert that the trace of each approximating function Next, we obtain The convergences (22), (24) imply the convergence X ε,δ 2,n → X ε 2 in L 2 (Ω; C([0, T ]; L 2 )) for δ → 0, n → ∞. We note that for δ → 0 the fourth term on the right-hand side of (30) vanishes due to Lemma 3.2. Hence, by taking the limits for δ → 0, n → ∞ in (30), using the strong convergence g 2,n → g 2 in L 2 for n → ∞ , the lower-semicontinuity of norms and (22), (24) we obtain for all t ∈ [0, T ]. After applying the Tonelli and Gronwall lemmas we obtain (20).
Our second main theorem establishes existence and uniqueness of a SVI solution to (1) in the sense of Definition 3.1. The solution is obtained as a limit of solutions of the regularized gradient flow (3) for ε → 0.
Furthermore, the following estimate holds where X 1 and X 2 are SVI solutions of (1) with x 0 ≡ x 1 0 , g ≡ g 1 and x 0 ≡ x 2 0 , g ≡ g 2 , respectively.

Proof of Theorem 3.2.
We consider L 2 -approximating sequences {x n 0 } n∈N ⊂ L 2 (Ω, F 0 ; H 1 0 ) and {g n } n∈N ⊂ H 1 0 of the initial condition x 0 ∈ L 2 (Ω, F 0 ; L 2 ) and g ∈ L 2 , respectively. For n ∈ N, δ > 0 we denote by X ε 1 ,δ n , X ε 2 ,δ n the variational solutions of (7) with ε ≡ ε 1 , ε ≡ ε 2 , respectively. By Itô's formula the difference satisfies We estimate the second term on the right-hand side of (33) using the convexity (12) Next, we observe that Using the inequality above, we get Substituting (34) along with the last inequality into (33) yields After using the Burkholder-Davis-Gundy inequality for p = 1, the Tonelli and Gronwall lemmas we obtain that We take the limit for δ → 0 in (36) for fixed n and ε 1 , ε 2 , and obtain using (22) by the lower-semicontinuity of norms that Hence, by (24) and the lower-semicontinuity of norms, after taking the limit n → ∞ in (37) for fixed ε 1 , ε 2 we get The above inequality implies that {X ε } ε>0 is a Cauchy Sequence in ε. Consequently there exists a unique {F t }-adapted process X ∈ L 2 (Ω; C([0, T ]; L 2 )) with X(0) = x 0 such that This concludes the proof of (31).

Numerical Approximation
We construct a fully-discrete approximation of the STVF equation (1) via an implicit time-discretization of the regularized STVF equation (3). For N ∈ N we consider the timestep τ := T /N , set t i := iτ for i = 0, . . . , N and denote the discrete Wiener increments as ∆ i W := W (t i ) − W (t i−1 ). We combine the discretization in time with a the standard H 1 0 -conforming finite element method, see, e.g., [7], [9], [4]. Given a family of quasi-uniform triangulations T h h>0 of O into open simplices with mesh size h = max K∈T h {diam(K)} we consider the associated space of piecewise linear, globally continuous functions 0 and set L ≡ dimV h for the rest of the paper. We set X h 0 := P h x 0 , g h := P h g, where P h is the L 2 -projection onto V h . The implicit fully-discrete approximation of (3) is defined as follows: To show convergence of the solution of the numerical scheme (41) we need to consider a discretization of the regularized problem (7). Given x 0 ∈ L 2 (Ω, F 0 ; L 2 ), g ∈ L 2 and n ∈ N we choose x n 0 := P n x 0 ∈ V n , g n := P n g ∈ V n in (7). Since V n ⊂ H 1 0 the sequences {x n 0 } n∈N ⊂ L 2 (Ω, F 0 ; H 1 0 ), {g n } n∈N ∈ H 1 0 constitute H 1 0 -approximating sequences of x 0 ∈ L 2 (Ω, F 0 ; L 2 ), g ∈ L 2 , respectively. We set x h,n 0 := P h x n 0 , g h,n := P h g n , where P h is the L 2 -projection onto V h . The fully-discrete Galerkin approximation of (7) for fixed n ∈ N is then defined as follows: fix N ∈ N, h > 0 set X 0 ε,δ,n,h = x h,n 0 and determine X i ε,δ,n,h ∈ V h , i = 1, . . . , N as the solution of The next lemma, cf. [17, Lemma II.1.4] is used to show P-a.s. existence of discrete we make use of the following lemma, cf. [11,8].
Lemma 4.2. Let (S, Σ) be a measure space. Let f : S × R L → R L be a function that is Σ-measurable in its first argument for every x ∈ R L , that is continuous in its second argument for every α ∈ S and moreover such that for every α ∈ S the equation f (α, x) = 0 has an unique solution x = g(α). Then g : S → R L is Σ-measurable.

We define the discrete Laplacian
To obtain the required the stability properties of the numerical approximation (42) we need the following lemma.
Lemma 4.4. Let ∆ h be the discrete Laplacian defined by (43). Then for any v h ∈ V h , ε, h > 0 the following inequality holds: where we denote (v, w) K : We rewrite (45) with the mass matrix M := {M } i,k := (ϕ i , ϕ k ) and the stiffness matrix Since V h consists of functions, which are piecewise linear on the triangles K ∈ T h , (|∇v h | 2 + ε 2 ) − 1 2 is constant on every triangle T . We note, that the matrices M and M −1 are positive definite. We get using the Young's inequality In the next lemma we state the stability properties of the numerical solution of the scheme (42) which are discrete analogues of estimates in Lemma 3.1 and Lemma 3.2.
Lemma 4.5. Let x 0 ∈ L 2 (Ω, F 0 ; L 2 ) and g ∈ L 2 be given. Then there exists a constant (48) Proof of Lemma 4.5. We set v h = X i ε,δ,n,h in (42), use the identity 2(a − b)a = a 2 − b 2 + (a − b) 2 and get for i = 1, . . . , N We take expected value in (49) and on noting the properties of Wiener increments E [∆ i W ] = 0, E [|∆ i W | 2 ] = τ and the independence of ∆ i W and X i−1 ε,δ,n,h we estimate the stochastic term as We neglect the positive term and get from (49) that We sum up the above inequality for k = 1, . . . , i and obtain By the discrete Gronwall lemma it follows from (50) that We substitute the above estimate into the right-hand side of (50) to conclude (47). To show the estimate (48) we set v h = ∆ h X i ε,δ,n,h in (42) use integration by parts and proceed analogically to the first part of the proof. We note that by Lemma 4.4 it holds that Hence we may neglect the positive term and get that and obtain (48) after an application of the discrete Gronwall lemma.
The following result shows that the limit Y ≡ X ε,δ n , i.e., that the numerical solution of scheme (42) converges to the unique variational solution of (7) for τ, h → 0. Owing to the properties (10), (11) the convergence proof follows standard arguments for the convergence of numerical approximations of monotone equations, see for instance [11], [8], and is therefore omitted. We note that the convergence of the whole sequence {X ε,δ,n τ,h } τ,h>0 follows by the uniqueness of the variational solution.
We substitute the equality above into (63) and obtain for κ ≥ 1 that We observe that T 0 e −κs |R τ (s)| ds → 0 for τ . Hence, by the lower-semicontinuity of norms using the convergence properties from Lemma 4.6 and the monotonicity property (10) we get for τ, h → 0 that It is not difficult to see that (61) for Y ≡ X ε,δ n implies We subtract the equality (65) from (64) and obtain for κ ≥ 1 Hence, we conclude that X ε,δ,n τ,h → X ε,δ n in L 2 (Ω; L 2 ((0, T ); L 2 ). Remark 4.1. It is obvious from the proof of Lemma 4.7 that the strong convergence in L 2 (Ω × (0, T ); L 2 ) remains valid for λ = 0 due to (10) by the Poincaré inequality.
Next lemma guarantees the convergence of the numerical solution of scheme (42) to the numerical solution of scheme (41) for δ → 0.
We note that the n-dependent constant C n in the estimate above is due to the a priori estimate (48), for H 1 0 -regular data x 0 , g it holds that C n ≡ C(E[ x 0 H 1 0 ], g H 1 0 ) by the stability of the discrete L 2 -projection P h : Proof of Lemma 4.8. We define Z i ε,h := X i ε,h − X i ε,δ,n,h . From (41) and (42) we get and by the Cauchy-Schwarz and Young's inequalities From the convexity (12) it follows that Hence, we obtain that We estimate the last term on the right-hand side above as and substitute the above identity into (66) Next, we sum up the above inequality up to i ≤ N and obtain After taking expectation in the above and using the independence properties of Wiener increments and the estimate (48) we arrive at with C n ≡ C( x n 0 H 1 0 , g n H 1 0 ). Finally, the Discrete Gronwall lemma yields for i = 1, . . . , N that which concludes the proof .
We define piecewise constant time-interpolant of the discrete solution {X i ε,h } N i=0 of (41) for t ∈ [0, T ) as We are now ready to state the second main result of this paper which is the convergence of the numerical approximation (41) to the unique SVI solution of the total variation flow (1) (cf. Definition 3.1).
Proof of Theorem 4.1. For x 0 ∈ L 2 (Ω, F 0 ; L 2 ) and g ∈ L 2 we define the H 1 0 -approximating We consider the solutions X ε , X ε,δ n of (3), (7), respectively, and denote by X ε n the SVI solution of (3) for x 0 ≡ x n 0 , g ≡ g n . Furthermore, we recall that the interpolant X ε,δ,n τ,h of the numerical solution of (42) was defined in (52). We split the numerical error as Finally, we consecutively take τ, h → 0, δ → 0, n → ∞ and ε → 0 in (70) and use the above convergence of I − V to obtain (69).

Remark 4.2.
We note that the convergence analysis simplifies in the case that the problem data have higher regularity. For x 0 , g ∈ H 1 0 it is possible to show that the problem (3) admits a unique variational solution (which is also a SVI solution of (3) by uniqueness) by a slight modification of standard monotonicity arguments. This is due to the fact that the operator (8) retains all its properties for δ = 0 except for the coercivity. The coercivity is only required to guarantee H 1 0 -stability of the solution, nevertheless the stability can also be obtained directly by the Itô formula on the continuous level, cf. Lemma 3.2, or analogically to Lemma 4.5 on the discrete level, even for δ = 0. Consequently, for H 1 0 -data the convergence of the numerical solution X ε τ,h can be shown as in Theorem 4.1 without the additional δ-regularization step.
We conclude this section by showing unconditional stability of scheme (41), i.e., we show that the numerical solution satisfies a discrete energy law which is an analogue of the energy estimate (27).
Proof of Lemma 4.9. We set v h ≡ X i ε,h in (41) and obtain Using the the convexity of J ε along with the inequality we get from (72) that After taking the expectation and summing up over i in (73), and noting that J ε (0) = ε|O| we obtain Hence (71) follows after an application of the discrete Gronwall lemma.

Numerical experiments
We perform numerical experiments using a generalization of the fully discrete finite element scheme (41) on the unit square O = (0, 1) 2 . The scheme for i = 1, . . . , N then reads as X 0 ε,h =x h 0 , where g h , x h 0 ∈ V h are suitable approximations of g, x 0 (e.g., the orthogonal projections onto V h ), respectively, and µ > 0 is a constant. The multiplicative space-time noise σ(X i−1 ε,h )∆ i W h is constructed as follows. The term W h is taken to be a V h -valued spacetime noise of the form where β , = 1, . . . , L are independent scalar-valued Wiener processes and {ϕ } L =1 is the standard 'nodal' finite element basis of V h . In the simulations below we employ three practically relevant choices of σ: a tracking-type noise σ(X) ≡ σ 1 (X) = |X −g h |, a gradient type noise σ(X) ≡ σ 2 (X) = |∇X| and the additive noise σ(X) ≡ σ 3 = 1; in the first case the noise is small when the solution is close to the 'noisy image' g h , in the gradient noise case the noise is localized along the edges of the image. We note that the fully discrete finite element scheme (74) corresponds to an approximation of the regularized equation (3) with a slightly more general space-time noise term of the form µσ(X ε )dW .
In all experiments we set T = 0.05, λ = 200, x 0 ≡ x h 0 ≡ 0. If not mentioned otherwise we use the time step τ = 10 −5 , the mesh size h = 2 −5 and set ε = h = 2 −5 , µ = 1. We define g ∈ V h as a piecewise linear interpolation of the characteristic function of a circle with radius 0.25 on the finite element mesh, see Figure 1 (left), and set ϕ (x)ξ , x ∈ O where ξ , = 1, . . . , L are realizations of independent U(−1, 1)-distributed random variables. If not indicated otherwise we use ν = 0.1; the corresponding realization of ξ h is displayed in Figure 1 (right).
We choose ε = h = 2 −5 , µ = 1, σ ≡ σ 1 as parameters for the 'baseline' experiment; the individual parameters are then varied in order to demonstrate their influence on the evolution. The time-evolution of the discrete energy functional J ε,λ (X i ε,h ), i = 1, . . . , N for a typical realization of the space-time noise W h is displayed in Figure 2; in the legend of the graph we state parameters which differ from the parameters of the baseline experiment, e.g., the legend 'sigma 2 , mu = 0.125' corresponds to the parameters σ ≡ σ 2 , µ = 0.125 Figure 1. The function g (left), the noise ξ h (middle) and the noisy image (right). and the remaining parameters are left unchanged, i.e., ε = h = 2 −5 . For all considered parameter setups, except for the case of noisier image ν = 0.2, the evolution remained close to the discrete energy of the deterministic problem (i.e., (74) with µ = 0). The energy decreases over time until the solution is close to the (discrete) minimum of J ε,λ ; to highlight the differences we display a zoom at the graphs. We observe that in the early stages (not displayed) the energy of stochastic evolutions with sufficiently small noise typically remained below the energy of the deterministic problems and the situation reversed as the solution approached the stationary state.  Figure 2. Evolution of the discrete energy: σ ≡ σ 1 , h = 2 −5 , ε = h, h 2 , µ = 1, 2 (left); σ ≡ σ 1 , σ 2 , σ 3 , σ ≡ σ 1 , h = 2 −5 , ε = 2h, σ = σ 1 , ε = h = 2 −6 and ν = 0.2 (middle and right).
In Figure 3 we display the solution at the final time computed with σ ≡ σ 1 , ε = h for h = 2 −5 , 2 −6 , respectively, and σ ≡ σ 2 , ε = h = 2 −5 ; graphically the results of the remaining simulations did not significantly differ from the first case. The displayed results may indicate that the noise σ 2 yields worse results than the noise σ 1 and σ 2 ; however, for sufficiently small value of µ the results would remain close to the deterministic simulation as well. We have magnified noise intensity µ to highlight the differences to the other noise types (i.e., the noise is concentrated along the edges of the image). We note that the gradient type noise σ 2 might be a preferred choice for practical computations, cf. [16].