A duality based approach to the minimizing total variation flow in the space $H^{-s}$

We consider the gradient flow of the total variation in the negative Sobolev space $H^{-s}$ $(0\leq s \leq 1)$ under periodic boundary conditions. For $s=0$, the flow is the classical total variation flow; for $s=1$, it is the fourth order total variation flow. We consider a convex variational problem which gives an implicit-time discrete scheme for the flow. By a duality based method, we derive a simple numerical scheme for this minimization problem and discuss convergence of a forward-backward splitting scheme. Several numerical experiments are given.


Introduction
During the last two decades, total variation models have become very popular in image processing and analysis. This interest was initiated by the seminal paper of Rudin, Osher and Fatemi [30], where the authors proposed to minimize the functional
$$E(u) = \frac{1}{2\tau}\int_\Omega |u - f|^2\,dx + \int_\Omega |Du| \tag{1.1}$$
to solve the problem of image denoising. Here, $\Omega \subset \mathbb{R}^2$ denotes the image domain, the function $f : \Omega \to \mathbb{R}$ represents a given gray-scale image and $\tau > 0$ is a parameter. The first term in (1.1) is called the fidelity term; it forces a minimizer $u$ to be close to the given image $f$ in the sense of the $L^2$-norm. The last term in (1.1) is the total variation of $u$ and plays the role of regularization. The total variation model (1.1) was investigated in detail by Meyer [25] in the context of the image decomposition problem. In this problem, it is assumed that a given image $f$ is a sum of two components, $u$ and $v$, where $u$ is a cartoon representation of $f$ and $v$ is an oscillatory component, composed of noise and texture. Meyer observed that when $f$ is a characteristic function which is small with respect to a certain norm, the model (1.1) does not provide the expected decomposition of $f$, since it treats such a function as oscillation. To overcome this inaccuracy, he proposed a new model, replacing the $L^2$-norm in the fidelity term by a weaker norm which is more appropriate for textures and oscillatory patterns (see [25] for details). This change, however, introduced a practical difficulty in the numerical computation of a minimizer, since the associated Euler-Lagrange equation cannot be expressed explicitly. One of the first attempts to overcome this difficulty was made by Osher, Solé and Vese [29]. They proposed to approximate Meyer's model by the simplified version
$$E(u) = \frac{1}{2\tau}\|u - f\|_{H^{-1}}^2 + \int_\Omega |Du|. \tag{1.2}$$
Here $H^{-1}(\Omega)$ is the dual of the function space $H^1_0(\Omega)$. Later, this model was studied in the context of the decomposition problem by Elliott and Smitheman [12], [13].
It has also been considered for other image processing tasks, like inpainting, by Schönlieb [32] and Burger et al. [7].
The total variation models (1.1) and (1.2) and their interesting and successful applications in image processing have also motivated many authors to perform rigorous analysis of properties of solutions to the corresponding total variation flows. In the case of the model (1.1), the corresponding total variation flow is formally given by the second order partial differential equation
$$u_t = \operatorname{div}\left(\frac{\nabla u}{|\nabla u|}\right), \tag{1.3}$$
whereas the total variation flow corresponding to the model (1.2) is provided by the fourth order partial differential equation
$$u_t = -\Delta\,\operatorname{div}\left(\frac{\nabla u}{|\nabla u|}\right). \tag{1.4}$$
We intend to give a few references related to (1.3) and (1.4); the list is not at all exhaustive. Although the pointwise meaning of solutions for (1.3), (1.4) is unclear, each equation can be understood as the gradient flow of a lower semi-continuous convex functional in a Hilbert space. The well-posedness of (1.3), (1.4) has been established by the theory of maximal monotone operators initiated by Komura [21] and developed by Brezis [6]. The behavior of solutions to (1.3) is rigorously studied, for example, in [2], [5]. The book [2] contains several types of well-posedness results in function spaces other than the Hilbert space $L^2$. An anisotropic version was studied by Mucha, Muszkieta and Rybka [27], and Lasica, Moll and Mucha [24], with the intention of proposing a numerical scheme for image denoising. Meanwhile, a viscosity approach to (1.3) and its non-divergence generalization was established by M.-H. Giga, Y. Giga and N. Pozar [15]. There is less literature on (1.4) based on rigorous analysis. A characterization of the speed is given in Kashima [18] (see [19] for a general dimension) and the extinction time estimate is given in [16], [17]. It is also noted that a solution to (1.4) may become discontinuous even if it is initially Lipschitz continuous [14], which is different from (1.3). This type of equation is very important for modeling relaxation phenomena in materials science below the roughening temperature; see a recent paper by J.-G. Liu, J. Lu, D. Margetis and J. L. Marzuola [23] and the papers cited therein. A numerical analysis for (1.4) is given by Kohn and Versieux [20].
In this paper, we generalize equations (1.3) and (1.4) and consider the total variation flow in the space $H^{-s}$ with $s \in [0, 1]$. More precisely, we consider the $(2s + 2)$-order parabolic differential equation
$$u_t = -(-\Delta)^{s}\,\operatorname{div}\left(\frac{\nabla u}{|\nabla u|}\right) \tag{1.5}$$
with periodic boundary conditions and initial data in $H^{-s}$. Our aim is to present a consistent approach to the construction of the minimizing total variation sequence in the space $H^{-s}$ and to discuss the problem of its convergence in an infinite dimensional space setting. Applying the derived scheme, we perform numerical experiments to observe the evolution of solutions to the equation (1.5) and their characteristic features for different values of the index $s$. Such a close look at this problem may be the basis for further study of the considered evolution equation and its applications. For example, our numerical experiments suggest that the solution may become discontinuous instantaneously for $s = 1/2$ and $s = 1$ even for Lipschitz initial data. The case $s = 1$ has been rigorously proved in [14]. We conjecture that such phenomena occur for all $s \in (0, 1]$, excluding $s = 0$ of course. The well-posedness problem for (1.5) is similar to (1.3) and (1.4). Apparently, the characterization of the speed of (1.5) has not been derived before, so we give its characterization. The proof given here is very close to the one for (1.3) given in [2]. We consider a semi-discretization of this equation with respect to the time variable. The scheme we consider here is an implicit-time discrete scheme. Such a formula leads to the recursive minimization of the functional
$$u \mapsto \frac{1}{2\tau}\|u - f\|_{H^{-s}}^2 + \Phi(u),$$
where $f$ is the value from the previous time step and $\Phi$ denotes the total variation. Although the problem is convex, the non-differentiability of the total variation term causes a difficulty. One way to handle it is Bregman's splitting method, presented by Oberman, Osher, Takei and Tsai [28]. Here, we consider a dual formulation of this problem, which is new at least for $s \in (0, 1]$. We generate the minimizing dual sequence by the well-known splitting scheme and prove its convergence in $H^{-s}$.
The proof in this setting seems to be new. We use the characterization of the subdifferential of the total variation in $H^{-s}$ to derive an explicit form of this sequence. It turns out that its form is very simple and convenient for computing an approximate minimizer. However, the proof of its convergence in $L^2$ can be carried out only in a finite dimensional space. Due to this limitation, we investigate the problem of convergence in the ergodic sense. This paper is organized as follows. We recall the space $H^{-s}$ and the total variation in Section 2. We give a characterization of the subdifferential of the total variation in $H^{-s}$ in Section 3. In Section 4, we give a way of semi-discretization, which leads to a recursive minimization problem for a convex but non-differentiable functional. In Section 5, we formulate its dual problem. In Section 6, we discuss convergence of a forward-backward splitting scheme. In Section 7, we derive its explicit form. In Section 8, we discuss its ergodic convergence. In Section 9, we explain how to discretize the introduced scheme and provide an exact condition for a convergent problem in a finite dimensional space. In Section 10, we present results of numerical experiments illustrating the evolution of solutions to the considered total variation flows and their characteristic features for different values of the index $s \in [0, 1]$.

Preliminaries
To give a rigorous interpretation of the equation (1.5) with periodic boundary conditions, we first need to introduce preliminary definitions and notation.
Let $\mathbb{T}^d$ be the $d$-dimensional torus
$$\mathbb{T}^d := (\mathbb{R}/\mathbb{Z})^d.$$
Here $\hat u$ denotes the Fourier transform of the function $u$, i.e.,
$$\hat u(\xi) := \int_{\mathbb{T}^d} u(x)\, e^{-2\pi i x\cdot\xi}\, dx, \qquad \xi \in \mathbb{Z}^d.$$
We define the operator $(-\Delta_{\mathrm{av}})^{s}$ by
$$(-\Delta_{\mathrm{av}})^{s} u(x) := \sum_{\xi \in \mathbb{Z}^d \setminus \{0\}} (2\pi|\xi|)^{2s}\, \hat u(\xi)\, e^{2\pi i x\cdot\xi}.$$
Then the inner product in $H^s_{\mathrm{av}}$ is defined by
$$(u, v)_{H^s_{\mathrm{av}}} := \big((-\Delta_{\mathrm{av}})^{s/2} u, (-\Delta_{\mathrm{av}})^{s/2} v\big)_{L^2} = \sum_{\xi \in \mathbb{Z}^d \setminus \{0\}} (2\pi|\xi|)^{2s}\, \hat u(\xi)\, \overline{\hat v(\xi)}.$$
The last equality in the above definition follows from Parseval's theorem.
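On a uniform grid, a multiplier operator of this type can be realized with the discrete Fourier transform. The following sketch is our own illustration (the helper name `frac_laplacian_av` is ours), assuming a one-dimensional grid of unit period and the multiplier form $(2\pi|\xi|)^{2s}$ with the zero mode removed, matching the average-free convention.

```python
import numpy as np

def frac_laplacian_av(u, s):
    # Sketch of (-Delta_av)^s on a uniform 1-D periodic grid of unit
    # period, computed via the discrete Fourier transform. The zero
    # mode is set to zero, which enforces the average-free convention.
    n = u.size
    k = np.fft.fftfreq(n) * n                  # integer frequencies xi
    mult = (2.0 * np.pi * np.abs(k)) ** (2.0 * s)
    mult[0] = 0.0                              # drop the mean
    return np.fft.ifft(mult * np.fft.fft(u)).real
```

For a single Fourier mode such as $u(x) = \sin(2\pi x)$, this returns $(2\pi)^{2s} u$, as expected from the definition.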
Since the operator $(-\Delta_{\mathrm{av}})^{s}$ is an isometry from $H^{s}_{\mathrm{av}}$ onto $H^{-s}_{\mathrm{av}}$, the inner product in $H^{-s}_{\mathrm{av}}$ can be expressed analogously. Now we introduce the definition of the total variation of a function $u$. It is given by
$$TV(u) := \sup\left\{ \int_{\mathbb{T}^d} u \,\operatorname{div} z \, dx \;:\; z \in C^1(\mathbb{T}^d; \mathbb{R}^d),\ \|z\|_\infty \le 1 \right\},$$
where for a vector field $z(x) = (z_1(x), \dots, z_d(x))$ the norm $\|\cdot\|_\infty$ is defined by $\|z\|_\infty := \sup_x |z(x)|$ and $|\cdot|$ denotes the standard Euclidean norm. The subspace of periodic functions $u \in L^1(\mathbb{T}^d)$ such that the total variation of $u$ is finite is denoted by $BV(\mathbb{T}^d)$.

A characterization of the subdifferential
We define the functional $\Phi$ on $L^2(\mathbb{T}^d)$ by
$$\Phi(u) := \begin{cases} TV(u) & \text{if } u \in BV(\mathbb{T}^d), \\ +\infty & \text{otherwise.} \end{cases} \tag{3.1}$$
We also need to define the associated dual functional. Further, we need to introduce some standard results. We assume that $H$ is a normed space and $H^*$ is its dual space.
Proof. See the proof of [3, Proposition 1.6].

Definition 1. Let $\Phi$ be a proper, lower semi-continuous, convex functional on $H$, endowed with the inner product $(\cdot, \cdot)_H$. The subdifferential of $\Phi$ at $u \in H$ is the set
$$\partial_H \Phi(u) := \{ v \in H : \Phi(w) - \Phi(u) \ge (v, w - u)_H \ \text{for all } w \in H \}.$$

Lemma 3. Suppose $\Phi$ is convex, lower semi-continuous, nonnegative, and positively homogeneous of degree one.
In the next lemma, we will show that $\overline{\Phi}(u) = \Psi(u)$, where $\Psi$ is given by (3.4). This will enable us to characterize the subdifferential of $\Phi$. In the proof, we follow the results presented in [17, Lemma 8.5, Proposition 8.6] and [16].
Lemma 4. Let $\Psi$ be the functional defined by (3.4).

Proof. Let $w \in H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$. First, we show $\overline{\Phi}(w) \le \Psi(w)$. If $\Psi(w)$ is infinite, then the inequality is obvious. Thus, assume that $\Psi(w) < \infty$; then there exists $z \in X(\mathbb{T}^d)$ realizing the infimum in (3.4). Using this, we obtain $\overline{\Phi}(w) \le \Psi(w)$, which is the desired inequality. Now we prove that $\overline{\Phi}(w) \ge \Psi(w)$. To do so, we first note that if we prove $\Phi(w) \le \overline{\Psi}(w)$, then by Lemma 1 we get that $\overline{\Phi}(w) \ge \overline{\Psi}(w)$. Furthermore, since $\Psi$ is convex, lower semi-continuous, and positively homogeneous of degree one, we get that $\Psi = \overline{\Psi}$. This will imply the desired inequality. Therefore, we need to prove that $\Phi(w) \le \overline{\Psi}(w)$. By the definition of $\Psi$, we get (3.6). Moreover, $\Psi = \overline{\Phi}$ holds by Lemma 4. These conditions are equivalent to (3.7).

Theorem 1 gives a rigorous interpretation of the equation (1.5) as the abstract gradient flow
$$u_t \in -\partial_{H^{-s}_{\mathrm{av}}} \Phi(u), \qquad u(0) = u_0. \tag{3.8}$$
The result concerning the existence and uniqueness of a solution to the system (3.8) is given in the theorem below; in particular, (1) for all $t > 0$ we have that $u(t) \in D(A)$.

Proof. The proof of this theorem can be found in Brezis [6].

A semi-discretization
In order to construct a scheme to solve the equation (3.8), we introduce its semi-discretization with respect to the time variable $t$. That is, we consider the finite set of $n + 1$ equidistant points $t_i = i\tau$, $i = 0, 1, \dots, n$, with time step $\tau > 0$, and the implicit scheme
$$\frac{u_\tau(t_i) - u_\tau(t_{i-1})}{\tau} \in -\partial_{H^{-s}_{\mathrm{av}}} \Phi\big(u_\tau(t_i)\big). \tag{4.1}$$
The convergence of this discretization has been justified in a very general setting by Crandall-Liggett [10]:
$$S(t) u_0 = \lim_{n \to \infty} \left( I + \frac{t}{n} A \right)^{-n} u_0,$$
where $S(t)$ is the semigroup generated by $-A$. The convergence is uniform on compact intervals of $[0, \infty)$.
The original paper by Crandall-Liggett also contains an error estimate, which is not optimal. The optimal one is of order $O(\tau)$ and it has been derived by Rulla [31, Theorem 4].
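The exponential formula can be checked on the simplest possible example, a linear scalar operator, where the resolvent iteration reduces to powers of a scalar. This is our own illustrative sketch, not the total variation setting:

```python
import numpy as np

def crandall_liggett(u0, a, t, n):
    # n-fold resolvent iteration (I + (t/n) A)^{-n} u0 for the linear
    # scalar operator A u = a u; the exact semigroup is exp(-t a) u0.
    return u0 / (1.0 + t * a / n) ** n

u0, a, t = 1.0, 2.0, 0.5
exact = u0 * np.exp(-t * a)
errors = [abs(crandall_liggett(u0, a, t, n) - exact) for n in (1, 10, 100)]
# the error shrinks as n grows, in line with the convergence statement
```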
Hence, to solve (3.8), we need to know how to solve one iteration of the scheme (4.1). First, we observe that the equation (4.1) for $u_\tau(t_i)$ is the optimality condition for the minimization problem
$$\min_{u} \; \frac{1}{2\tau} \| u - f \|_{H^{-s}_{\mathrm{av}}}^2 + \Phi(u), \tag{4.2}$$
where $f = u_\tau(t_{i-1})$. The main difficulty in the construction of a minimizing sequence for this kind of problem is caused by the lack of differentiability of the total variation term. The commonly used approach to overcome this difficulty consists in considering the dual formulation of (4.2). The first work related to image processing applications where this approach was used, and where a proof of convergence was provided, was the paper of Chambolle [8]. Nowadays, there is a variety of efficient schemes that one can apply to solve the dual formulation of (4.2). Among others, we mention here the papers of Chambolle and Pock [9] and Beck and Teboulle [4]. We also refer to the papers of Aujol [1] and Weiss et al. [33], where the authors adapt the existing methods for the purpose of total variation minimization. In this work, we rather aim to present a consistent approach to the construction of the minimizing sequence and to discuss its convergence in an infinite dimensional space. For this purpose, we consider the well-known splitting scheme introduced by Lions and Mercier [22], which in fact is a variant of the classical Douglas-Rachford algorithm.

A dual problem
Let $C \subset H$ be a non-empty, convex and closed set. We define the indicator function of the set $C$ by
$$\chi_C(u) := \begin{cases} 0 & \text{if } u \in C, \\ +\infty & \text{otherwise.} \end{cases}$$
To derive the dual problem we will need the following result.

Lemma 5. Consider $\Phi$ given by (3.1) as a functional defined on $H^{-s}_{\mathrm{av}}$. Then the convex conjugate of $\Phi$ is given by $\Phi^* = \chi_K$, where $K$ is the closure of the set $\big\{ -(-\Delta_{\mathrm{av}})^{s} \operatorname{div} z : z \in C^1(\mathbb{T}^d; \mathbb{R}^d),\ \|z\|_\infty \le 1 \big\}$ with respect to the strong $H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$-topology.
Proof. Let $u \in BV(\mathbb{T}^d) \cap H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$; then the convex conjugate of the function $\chi_K$ is given by (5.2). Since the set $K$ is convex, the function $\chi_K$ is convex. Moreover, since $\chi_K$ is lower semi-continuous, the claim follows from [11, Proposition I.4].

Proposition 1. Let $f \in H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$ be a given function; then the problem (5.3) admits the dual problem (5.4). Moreover, the solution $u$ of (5.3) is associated with the solution $v$ of (5.4) by the relation (5.5).

Proof. Using a standard result of convex analysis (see, e.g., [11, Corollary I.5.2]), we get that the Euler-Lagrange equation associated with the problem (5.3) is equivalent to (5.5), and we note that this is the Euler-Lagrange equation of the functional in (5.4). Using Lemma 5, we conclude that the dual problem to (5.3) is given by (5.4) and the relation (5.5) holds. Here $Z$ is the closure of the set $\{ z \in C^1(\mathbb{T}^d; \mathbb{R}^d) : \|z\|_\infty \le 1 \}$ with respect to the strong $L^2$-topology.

Convergence of a forward-backward splitting scheme
Let us define the functional $J$ on $H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$ by
$$J(v) := \frac{1}{2\tau} \| f - \tau v \|_{H^{-s}_{\mathrm{av}}}^2.$$
Then the dual problem (5.4) can be equivalently written as
$$\min_{v} \; J(v) + \Phi^*(v).$$
In order to construct a minimizing sequence $\{v^k\}$ for the above problem, we consider the forward-backward splitting scheme introduced by Lions and Mercier [22], given by
$$\begin{aligned} w^k &= v^k - \lambda\, \nabla J(v^k), \\ v^{k+1} &= \big( I + \lambda\, \partial_{H^{-s}_{\mathrm{av}}} \Phi^* \big)^{-1} w^k. \end{aligned} \tag{6.1}$$
Remark 2. Application of the scheme (6.1) requires that $\partial(J + \Phi^*) = \partial J + \partial \Phi^*$. This is ensured by [11, Proposition I.5.6], because $\mathrm{int}\,(D(J)) \cap D(\Phi^*) \neq \emptyset$.
In this way we obtain that the scheme (6.1) is equivalent to an iteration involving $H_{1/\lambda}$, where $H_{1/\lambda}$ denotes the Yosida approximation of the operator $A = \partial_{H^{-s}_{\mathrm{av}}} \Phi$. It is well known that $H_{1/\lambda}$ converges, as $\lambda \to \infty$, to the minimal selection $A^0$ of $A$.
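For intuition, the Yosida approximation can be written in closed form for the simplest nonsmooth example, $\Phi(u) = |u|$ on $\mathbb{R}$, whose resolvent is soft-thresholding. This is a toy illustration of the convergence to the minimal selection, not the operator $\partial_{H^{-s}_{\mathrm{av}}} \Phi$ itself:

```python
def yosida_abs(x, eps):
    # Yosida approximation A_eps of A = subdifferential of |.| on R:
    # A_eps(x) = (x - J_eps(x)) / eps, where the resolvent J_eps is
    # the soft-thresholding map J_eps(x) = sign(x) * max(|x| - eps, 0).
    soft = max(abs(x) - eps, 0.0) * (1.0 if x >= 0 else -1.0)
    return (x - soft) / eps

# As eps -> 0 (i.e. lambda = 1/eps -> infinity), A_eps(x) -> sign(x)
# for x != 0, the minimal selection, while A_eps(0) = 0 for every eps.
```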
In the proof of convergence of the sequence $\{v^k\}$ we will need the lemma below.

Proof. This follows from the definition of the subdifferential.

Proof. Assume that $v^k, v^{k+1} \in K$. After simple calculations, we obtain (6.2). From Lemma 6 it follows that (6.3) holds for all $w \in K$. In particular, taking $w = v^k$ and recalling that $u^k = f - \tau v^k$, we get (6.4). Taking this fact into account in (6.2) and choosing $\lambda$ so that $0 < \lambda\tau < 2$, we get $J(v^{k+1}) < J(v^k)$ as long as $v^{k+1} \neq v^k$. This implies that the sequence $\{J(v^k)\}$ is convergent. Let $m = \lim_{k\to\infty} J(v^k)$. Then obviously $\lim_{k\to\infty} J(v^{k+1}) = m$ as well. Since $\lambda\tau < 2$ by the assumption, passing to the limit as $k \to \infty$ we obtain (6.5). We notice that the boundedness of the sequence $\{J(v^k)\}$ and the definition of $J$ imply that the sequence $\{v^k\}$ is bounded in $H^{-s}_{\mathrm{av}}$. Moreover, since $u^k = f - \tau v^k$, the sequence $\{u^k\}$ is also bounded in $H^{-s}_{\mathrm{av}}$. Let us denote by $v^*$ a limit of a weakly convergent subsequence $\{v^{k_j}\}$ of $\{v^k\}$ and by $v^{**}$ a limit of the corresponding subsequence $\{v^{k_j+1}\}$. The fact (6.5) implies that $v^* = v^{**}$. It remains to identify the limit. The monotonicity of the operator $\partial_{H^{-s}_{\mathrm{av}}} \Phi^*$ gives an inequality which, using the second line of the scheme (6.1), we can rewrite as (6.6). The first term in (6.6) goes to zero as $k \to \infty$, because it can be estimated by a product whose first factor goes to zero while the second one is bounded.
We rewrite the second term in (6.6) as follows. We notice that, due to (6.5), this term also tends to zero. Combining these pieces of information and passing to the limit in (6.6) as $k \to \infty$, we obtain the limit inequality. Since the functional $\Phi^*$ is convex, proper and lower semi-continuous, its subdifferential $\partial_{H^{-s}_{\mathrm{av}}} \Phi^*$ is a maximal monotone operator. This implies that $v^*$ solves the dual problem (5.4).
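The abstract iteration can be illustrated on a scalar model problem. The sketch below (our own, unrelated to the $H^{-s}$ dual problem) minimizes $\tfrac12(v - f)^2 + |v|$, where the smooth part plays the role of $J$ and $|\cdot|$ plays the role of $\Phi^*$; the proximal map of $\lambda|\cdot|$ is soft-thresholding:

```python
def soft(x, lam):
    # proximal map of lam * |.| : soft-thresholding
    return max(abs(x) - lam, 0.0) * (1.0 if x >= 0 else -1.0)

def forward_backward(f, lam=0.5, iters=200):
    # forward (gradient) step on the smooth part 0.5*(v - f)^2,
    # then backward (proximal) step on the nonsmooth part |v|
    v = 0.0
    for _ in range(iters):
        v = soft(v - lam * (v - f), lam)
    return v

# the exact minimizer of 0.5*(v - f)^2 + |v| is soft(f, 1)
```

The step size plays the same role as the condition $0 < \lambda\tau < 2$ above: for too large a forward step the iteration loses the descent property.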

An explicit form of a scheme
Since the nonlinear projection given in Corollary 1 is difficult to compute in practice, we use the characterization of the subdifferential $\partial_{H^{-s}_{\mathrm{av}}} \Phi$ provided in Theorem 1 to construct an explicit scheme for the minimizing sequence of the dual problem (5.9). More precisely, we apply the same forward-backward scheme as in the previous section to generate the sequence $\{z^k\}$ minimizing (5.9). This sequence is related to $\{v^k\}$ by the formula $v^k = -\tau(-\Delta_{\mathrm{av}})^{s} \operatorname{div} z^k$.
In order to proceed, we need to introduce the functionals $F$ and $G$ on $L^2(\mathbb{T}^d, \mathbb{R}^d)$. They are given by
$$F(z) := \frac{1}{2\tau} \big\| f + \tau (-\Delta_{\mathrm{av}})^{s} \operatorname{div} z \big\|_{H^{-s}_{\mathrm{av}}}^2$$
and
$$G(z) := \chi_Z(z) = \begin{cases} 0 & \text{if } z \in Z, \\ +\infty & \text{otherwise.} \end{cases}$$
Then we notice that the dual problem (5.9) is equivalent to
$$\min_{z \in L^2(\mathbb{T}^d, \mathbb{R}^d)} F(z) + G(z).$$
In order to find $z^*$ such that $0 \in \partial_{L^2}\big(F(z^*) + G(z^*)\big)$, we consider the splitting scheme (7.4). For practical implementation of this scheme, it remains to find its explicit formula.
By direct calculation, we obtain the identity for any $\eta \in L^2(\mathbb{T}^d)$ and $\epsilon > 0$. Thus, using the definition of the inner product in $H^{-s}_{\mathrm{av}}$ and the integration by parts formula, we obtain the desired result.
Lemma 8. Let $G$ be the functional on $L^2(\mathbb{T}^d, \mathbb{R}^d)$ defined by (7.2), and let (7.6) hold for some $y \in L^2(\mathbb{T}^d, \mathbb{R}^d)$ and $\lambda > 0$. Then we have that $z = P^{L^2}_Z(y)$, where $P^{L^2}_Z(y)$ denotes the orthogonal projection of $y$ onto the set $Z$ with respect to the inner product in $L^2$.
Proof. The formula (7.6) implies that $z$ is a solution to the equation which is the optimality condition for the problem $\inf_{z \in L^2} \frac{1}{2}\|z - y\|_{L^2}^2 + G(z)$. This implies that $z = P^{L^2}_Z(y)$.
Remark 4. Due to the form of the set $Z$, the projection operator $P^{L^2}_Z$ is given by the explicit pointwise formula
$$P^{L^2}_Z(y)(x) = \frac{y(x)}{1 \vee |y(x)|},$$
where $|\cdot|$ denotes the Euclidean norm and $a \vee b := \max(a, b)$. Using this fact, Lemma 7 and Lemma 8, we obtain an explicit form of the scheme (7.4). It is given by (7.8).
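The pointwise projection formula of Remark 4 is straightforward to vectorize. A sketch (our own, assuming the field is stored as a `(d, N)` array of $d$ components on $N$ grid points):

```python
import numpy as np

def project_Z(y):
    # pointwise projection onto {|z(x)| <= 1}: z(x) = y(x) / (1 v |y(x)|)
    mag = np.sqrt((y ** 2).sum(axis=0))   # Euclidean norm |y(x)| per point
    return y / np.maximum(1.0, mag)
```

Points where $|y(x)| \le 1$ are left unchanged; all others are radially rescaled onto the unit sphere.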

Ergodic convergence
In this section, we investigate the ergodic convergence of the sequences $\{v^k\}$ and $\{z^k\}$ defined by the schemes (6.1) and (7.8), respectively. First, we recall from Proposition 2 that if $0 < \lambda\tau < 2$, then the sequence $\{v^k\}$ converges weakly in $H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$ to $v^* \in K$, where $v^*$ is the unique solution of the dual problem (5.4). Then Mazur's lemma implies the existence of a sequence
$$\bar v^{\,n} := \sum_{k=0}^{n} \alpha_k v^k, \qquad \sum_{k=0}^{n} \alpha_k = 1, \tag{8.1}$$
which converges strongly in $H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$ to $v^*$ as $n \to \infty$. Our aim in this section is to construct a sequence $\{\alpha_k\}$ explicitly such that $\bar v^{\,n} \to v^*$ as $n \to \infty$, and next to use this result in order to prove that the sequence
$$\bar z^{\,n} := \sum_{k=0}^{n} \alpha_k z^k \tag{8.2}$$
converges weakly in $QL^2(\mathbb{T}^d)$ to $z^* \in Z$ as $n \to \infty$, where $Q$ is the orthogonal projection onto the space of gradient fields; it is based on the Helmholtz-Weyl decomposition. Note that for this proof we will not require weak convergence of $z^k$ to $z^* \in Z$ in $L^2(\mathbb{T}^d)$; we are not able to prove this in an infinite dimensional space setting (see Lemma 9 and Proposition 4). All these results are given in the following proposition:

Proposition 3. Let $\{v^k\}$ be the weakly convergent sequence in $K$ generated by the scheme (6.1) and let $\{\beta_k\}$ be a sequence of positive real numbers such that $\{\beta_k\} \in l^2 \setminus l^1$. Then, for $\alpha_k = \beta_k / \sum_{j=0}^{n} \beta_j$, the sequence $\{\bar v^{\,n}\}$ given by (8.1) converges strongly in $H^{-s}_{\mathrm{av}}(\mathbb{T}^d)$ to $v^* \in K$ as $n \to \infty$. Moreover, the sequence $\{\bar z^{\,n}\}$ given by (8.2), where $\{z^k\}$ is generated by the scheme (7.8), converges weakly in $QL^2(\mathbb{T}^d)$ to $z^* \in Z$ as $n \to \infty$.
Proof. Let $\alpha_k = \beta_k / \sum_{j=0}^{n} \beta_j$, where the sequence $\{\beta_k\} \in l^2 \setminus l^1$. Then, by the triangle inequality and the Cauchy-Schwarz inequality, we have (8.3). Since $\{\beta_k\} \in l^2 \setminus l^1$, we have $\sum_{k=0}^{n} \beta_k \to \infty$ as $n \to \infty$, and therefore it remains to show that the remaining factor is bounded. Using the scheme (6.1) and the fact that the resolvent $(I + \lambda\, \partial_{H^{-s}_{\mathrm{av}}} \Phi^*)^{-1}$ is a contraction, we obtain an estimate for $k = 0, 1, \dots, n$. By the assumption $0 < \lambda\tau < 2$, we have $|1 - \lambda\tau| < 1$; therefore, we get the desired bound. Applying this fact to (8.3), passing to the limit and using the monotone convergence theorem, we obtain $\lim_{n\to\infty} \|\bar v^{\,n} - v^*\|_{H^{-s}_{\mathrm{av}}} = 0$. Due to the weak closedness of the set $K$, there exists $z^* \in Z$ such that $v^* = -(-\Delta_{\mathrm{av}})^{s} \operatorname{div} z^*$. We also have the analogous relation between $\bar v^{\,n}$ and $\bar z^{\,n}$.
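The effect of the weights $\alpha_k = \beta_k / \sum_j \beta_j$ with $\{\beta_k\} \in l^2 \setminus l^1$ can be seen on a toy sequence that oscillates around its limit. This is our own illustration, unrelated to the actual iterates $v^k$:

```python
import numpy as np

def ergodic_average(v, beta):
    # weighted mean sum(beta_k * v_k) / sum(beta_k), as in (8.1)
    beta = np.asarray(beta, dtype=float)
    return (beta * np.asarray(v, dtype=float)).sum() / beta.sum()

k = np.arange(100000)
v = 1.0 + (-1.0) ** k          # oscillates around v* = 1, never converges
beta = 1.0 / (k + 1.0)         # in l^2 but not in l^1
# the weighted averages approach 1, slowly (at a logarithmic rate),
# because sum(beta_k) diverges while the oscillating part stays bounded
```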

Implementation
In this section, we explain how to discretize the system (7.8) and how to solve numerically one iteration of the implicit scheme (4.1). Recursive application of this procedure leads to a numerical solution of the nonlinear evolution equation (3.8). For simplicity of presentation, we construct the method for the one-dimensional problem, but it can be easily extended to higher dimensions. We also present results concerning the convergence of the introduced scheme in a finite dimensional space. We denote by $X$ the Euclidean space $\mathbb{R}^N$. The scalar product of two elements $u, v \in X$ is defined by $\langle u, v \rangle := \sum_{i=1}^{N} u_i v_i$ and the norm by $\|u\| := \sqrt{\langle u, u \rangle}$. The discrete gradient operator satisfying periodic boundary conditions is defined by
$$(\nabla u)_i := \frac{u_{i+1} - u_i}{h}, \qquad u_{N+1} := u_1,$$
where $h > 0$ is the space discretization step. With this notation at hand, the discrete divergence and Laplace operators are given by $\operatorname{div} := -\nabla^T$ and $\Delta := \operatorname{div} \nabla$, respectively. Note that $-\Delta$ is a symmetric, positive semi-definite (positive definite on the subspace of average-free vectors), circulant matrix of size $N \times N$; it is well known that in this case its eigenvectors are the discrete Fourier modes. Moreover, $C = \mu_{\max}^{s+1}$, where $\mu_{\max}$ denotes the largest eigenvalue of the discrete operator $-\Delta$.
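The discrete operators can be assembled explicitly as circulant matrices, and the eigenvalues of $-\Delta$ checked against the classical formula $\mu_j = (4/h^2)\sin^2(\pi j / N)$. The sketch below uses our own variable names and the sign convention $\operatorname{div} = -\nabla^T$, so that $-\Delta = \nabla^T \nabla$ is positive semi-definite:

```python
import numpy as np

N, h = 8, 0.5
I = np.eye(N)
grad = (np.roll(I, -1, axis=0) - I) / h   # forward difference, periodic
div = -grad.T                              # discrete divergence
lap = div @ grad                           # Delta = div grad (neg. semi-def.)

mu = np.sort(np.linalg.eigvalsh(-lap))     # eigenvalues of -Delta
expected = np.sort(4.0 / h**2 * np.sin(np.pi * np.arange(N) / N) ** 2)
```

The smallest eigenvalue is $0$ (constant vectors), which is why the continuous construction works in the average-free spaces.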
Proof. From the definitions (9.1), (9.2) and Parseval's theorem, we have the first identity; then, using this fact in (9.4) and applying the Plancherel identity, we get the claim.

We denote $Z := \{z \in X : \|z\|_\infty \le 1\}$, where $\|z\|_\infty := \max_j |z_j|$. For $f \in X$ we define the functionals $F$ and $G$ on $X$ by
$$F(z) := \frac{1}{2\tau} \big\| f + \tau (-\Delta)^{s} \operatorname{div} z \big\|_{-s}^2 \quad\text{and}\quad G(z) := \begin{cases} 0 & \text{if } z \in Z, \\ +\infty & \text{otherwise,} \end{cases}$$
where $\|\cdot\|_{-s}$ denotes the discrete $H^{-s}$-norm, and the resulting finite dimensional dual problem can be solved by the iterative scheme (9.7).

Proposition 4. Assume that $C\lambda\tau < 2$, where the constant $C > 0$ is as in Lemma 9. Then the sequence $\{z^k\}$ generated by the scheme (9.7) converges.

Proof. In the proof, we essentially follow the steps of the proof of Proposition 2. We assume that $z^k, z^{k+1} \in Z$. After simple calculations, we obtain (9.8). From the second line of the scheme (9.7) we have that $w^k = -\nabla(f + \tau (-\Delta)^{s} \operatorname{div} z^k)$. Using the definition of the subdifferential $\partial_X G$ and the fact that $G$ is the indicator function of the set $Z$, we get an inequality valid for any $p \in Z$. Taking, in particular, $p = z^k$ and using that $w^k = -\nabla(f + \tau (-\Delta)^{s} \operatorname{div} z^k)$, we obtain an estimate; taking this fact and Lemma 9 into account in (9.8), we obtain (9.9). Therefore, choosing $\lambda$ so that $C\lambda\tau < 2$, we get $F(z^{k+1}) < F(z^k)$ as long as $z^{k+1} \neq z^k$, which implies that the sequence $\{F(z^k)\}$ is convergent. To complete the proof, we proceed in a similar way as in the proof of Proposition 2, using the fact that in a finite dimensional space linear operators are bounded and all norms are equivalent.
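Putting the pieces together, one implicit time step can be sketched end to end: a gradient step on the discrete dual functional $F$ followed by the pointwise projection onto $Z$. The helper name, the matrix assembly and the eigendecomposition realization of $(-\Delta)^s$ are our own choices, and this is an illustrative sketch of the scheme rather than the authors' implementation:

```python
import numpy as np

def tv_flow_step(f, s, tau, lam, h, iters):
    # Approximately solve one implicit step: minimize
    #   (1/(2 tau)) ||u - f||_{H^-s}^2 + TV(u)
    # by iterating: forward (gradient) step on F, then projection onto Z.
    N = f.size
    I = np.eye(N)
    G = (np.roll(I, -1, axis=0) - I) / h        # periodic forward difference
    D = -G.T                                     # discrete divergence
    w_eig, V = np.linalg.eigh(-(D @ G))          # -Delta is symmetric PSD
    Ls = V @ np.diag(np.clip(w_eig, 0.0, None) ** s) @ V.T   # (-Delta)^s
    z = np.zeros(N)
    for _ in range(iters):
        r = f + tau * (Ls @ (D @ z))             # current primal iterate u
        z = z + lam * (G @ r)                    # gradient step on F
        z = z / np.maximum(1.0, np.abs(z))       # projection onto Z
    return f + tau * (Ls @ (D @ z))              # u = f + tau (-Delta)^s div z
```

Note that the mean of $u$ is preserved exactly, since $\operatorname{div} z$ is average-free; for $s = 0$ this reduces to a Chambolle-type projection algorithm for the ROF problem.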
We note that from Proposition 4 and Lemma 9 it follows that the sequence $\{z^k\}$ converges to $z^*$ in $X$ if and only if
$$\mu_{\max}^{s+1}\, \tau \lambda < 2,$$
where the constant $C = \mu_{\max}^{s+1}$ and $\mu_{\max} := \max_j \mu_j$ is the largest eigenvalue of the discrete operator $-\Delta$. For the finite difference discretization considered here, we have
$$\mu_{\max} \le \frac{4}{h^2},$$
where $h$ denotes the space discretization step. In other words, to achieve convergence one has to select the values of the parameters $\tau$, $\lambda$ and $h$ so that $\mu_{\max}^{s+1} \tau \lambda < 2$. Taking, for instance, $\tau = O(h^{2s+3})$, $\lambda = O(1/h)$ and passing to the limit $h \to 0$, we get the desired result.
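With the bound $\mu_{\max} \le 4/h^2$, the admissible step size $\lambda$ can be computed directly. The sketch below (the helper name is ours) reproduces the orders of magnitude of the values of $\lambda$ used in the experiments of Section 10, where $h = \tau = 0.1$:

```python
def max_lambda(s, tau, h):
    # largest lambda allowed by mu_max^{s+1} * tau * lambda < 2,
    # using the 1-D finite-difference bound mu_max <= 4 / h^2
    mu_max = 4.0 / h ** 2
    return 2.0 / (mu_max ** (s + 1.0) * tau)

# h = tau = 0.1 gives the bounds 5e-2, 2.5e-3 and 1.25e-4 for
# s = 0, 0.5, 1, consistent with the experimental choices
# lambda = 4e-2, 2e-3 and 1e-4 in Section 10
```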

Numerical results
In this section, we present results of numerical experiments obtained by applying the schemes introduced earlier to solve the evolution equation (3.8). These experiments have been performed on one-dimensional data for a more accurate presentation of the differences in solutions with respect to different values of the index $s$ and their characteristic features.
For the experiments, we considered initial data $f, g : [-10, 10] \to \mathbb{R}$, given by explicit formulas.

Figure 1: Graphs of functions $f$ and $g$ considered in experiments as initial data.

In all calculations, we fixed the values of the parameters $h = 0.1$ and $\tau = 0.1$. In each case, the value of $\lambda$ was selected so that the convergence criterion given in Proposition 4 is satisfied, i.e., we set $\lambda = 4 \cdot 10^{-2}$, $\lambda = 2 \cdot 10^{-3}$ and $\lambda = 10^{-4}$ for $s = 0$, $s = 0.5$ and $s = 1$, respectively. In Figure 2, we present the evolution of solutions to the $H^{-s}$ total variation flow for initial data $f$ and $g$, and for different values of the index $s$ ($s = 0, 0.5, 1$). From our computations, we conjecture that a solution may become instantaneously discontinuous for $s \in (0, 1]$ even for Lipschitz initial data. This has been rigorously proved for $s = 1$ in [14]. We also see from these computations that the motion becomes slower as $s$ becomes larger.