# On the Taut String Interpretation and Other Properties of the Rudin–Osher–Fatemi Model in One Dimension


## Abstract

We study the one-dimensional version of the Rudin–Osher–Fatemi (ROF) denoising model and some related TV-minimization problems. A new proof of the equivalence between the ROF model and the so-called taut string algorithm is presented, and a fundamental estimate on the denoised signal in terms of the corrupted signal is derived. Based on duality and the projection theorem in Hilbert space, the proof of the taut string interpretation is strictly elementary with the existence and uniqueness of solutions (in the continuous setting) to both models following as by-products. The standard convergence properties of the denoised signal, as the regularizing parameter tends to zero, are recalled and efficient proofs provided. The taut string interpretation plays an essential role in the proof of the fundamental estimate. This estimate implies, among other things, the strong convergence (in the space of functions of bounded variation) of the denoised signal to the corrupted signal as the regularization parameter vanishes. It can also be used to prove semi-group properties of the denoising model. Finally, it is indicated how the methods developed can be applied to related problems such as the fused lasso model, isotonic regression and signal restoration with higher-order total variation regularization.

## Keywords

Total variation minimization · Taut string · Regression splines · Lewy–Stampacchia inequality · Denoising semi-group · Fused lasso · Isotonic regression · Higher-order total variation

## 1 Introduction

*u*. (The set of functions for which the total variation is finite is denoted \(\mathrm{BV}({\varOmega })\).) The linear constraint models that the noise has zero mean, and the quadratic constraint, that it has variance \(\sigma ^2\). In practice, one studies the Lagrange formulation of this problem and minimizes the functional

*taut string algorithm*, which is an alternative method for denoising of signals with applications in statistics, nonparametric estimation, real-time communication systems and stochastic analysis. In the continuous setting, for analogue signals, the taut string algorithm can be stated in the following manner (cf. Fig. 1):

The taut string algorithm has been extensively studied in the discrete setting by Mammen and van de Geer and by Davies and Kovac [20, 29] as well as by Dümbgen and Kovac [21]. Recently, using the highly developed methods of real interpolation theory (Peetre’s *K*-functional, the notion of invariant *K*-minimal sets, etc.), Niyobuhungiro [31] has investigated the ROF model in the discrete case and Setterqvist [40] has probed the limits to which taut string methods may be extended.

The taut string \(W_\lambda \) minimizes the graph length *L*(*W*) in (2) among all functions *W* whose graphs are curves through the points (*a*, *F*(*a*)) and (*b*, *F*(*b*)) and which lie within the tube \(T_\lambda \). The name of the algorithm derives from this shortest path problem. It turns out that one may just as well minimize the energy associated with an elastic rubber band satisfying the same boundary conditions and the same constraints:
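For readers who want to experiment, the rubber-band formulation discretizes readily. The following sketch is ours, not part of the paper's analysis: it uses unit grid spacing, projected gradient descent with in-place updates, and a four-sample test signal, all of which are illustrative choices. It minimizes the discrete elastic energy over the tube and returns the difference sequence of the resulting string:

```python
# Minimal discrete "rubber band" version of the taut string problem:
# minimize 0.5 * sum((W[i+1]-W[i])**2) subject to |W[i]-F[i]| <= lam
# at interior nodes and W[0] = W[n] = 0 (f is assumed to have zero mean).

def taut_string(f, lam, iters=20000, step=0.25):
    """Denoise f via a discrete taut string; returns u = diff(W)."""
    n = len(f)
    F = [0.0]                     # cumulative signal, F[0] = 0
    for v in f:
        F.append(F[-1] + v)
    W = F[:]                      # start inside the tube
    W[0], W[n] = 0.0, 0.0
    for _ in range(iters):
        for i in range(1, n):
            grad = 2.0 * W[i] - W[i - 1] - W[i + 1]  # d(energy)/dW_i
            Wi = W[i] - step * grad
            # project back onto the tube [F_i - lam, F_i + lam]
            W[i] = min(max(Wi, F[i] - lam), F[i] + lam)
    return [W[i + 1] - W[i] for i in range(n)]

u = taut_string([1.0, 1.0, -1.0, -1.0], lam=1.0)
```

For this zero-mean signal and \(\lambda =1\), the string is forced to touch the upper tube wall at the middle node, and the returned signal is approximately \([0.5, 0.5, -0.5, -0.5]\).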

### Lemma 1

If *H* is strictly convex, then \(W_*\) is the unique minimizer in \(T_\lambda \) of \(L_H\).

If we take \(H(s)=(1+s^2)^{1/2}\), it follows that (2) and (3) have precisely the same minimizer in \(T_\lambda \), namely \(W_\lambda \). While this statement seems intuitively clear from our everyday experience with rubber bands and strings, the mathematical assertion is not equally self-evident. A proof is consequently offered in “Appendix A”.

The paper, which is a considerably modified and enlarged version of the author’s preprint [32] and subsequent conference paper [33], has two main purposes. The first is to present a new, elementary proof of the following remarkable result:

### Theorem 1

The taut string algorithm and the ROF model yield the same solution; \(f_\lambda = u_\lambda \).
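In the discrete setting, Theorem 1 is easy to check numerically. The sketch below is our illustration, not the paper's method: it solves the discrete ROF problem by projected gradient iteration on the dual variable (in the spirit of the duality used throughout the paper), where the step-size bound \(1/(4\lambda )\) comes from the elementary estimate \(\Vert DD^{T}\Vert \le 4\) for the forward-difference matrix *D*.

```python
def rof_denoise(f, lam, iters=20000):
    """Discrete 1-D ROF: min_u 0.5*||u-f||^2 + lam*sum|u[i+1]-u[i]|,
    solved via projected gradient on the dual p in [-1,1]^(n-1),
    with u = f - lam * D^T p  (D = forward difference)."""
    n = len(f)
    p = [0.0] * (n - 1)
    step = 1.0 / (4.0 * lam)      # safe step: ||D D^T|| <= 4
    for _ in range(iters):
        u = list(f)               # u = f - lam * D^T p
        for i in range(n - 1):
            u[i] += lam * p[i]
            u[i + 1] -= lam * p[i]
        for i in range(n - 1):    # gradient step on p, then clip to [-1,1]
            p[i] = min(max(p[i] + step * (u[i + 1] - u[i]), -1.0), 1.0)
    u = list(f)
    for i in range(n - 1):
        u[i] += lam * p[i]
        u[i + 1] -= lam * p[i]
    return u

u = rof_denoise([1.0, 1.0, -1.0, -1.0], lam=1.0)
```

On \(f=[1,1,-1,-1]\) with \(\lambda =1\), this returns approximately \([0.5, 0.5, -0.5, -0.5]\), the same signal one obtains from the discrete taut string construction, in agreement with the theorem.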

The second main purpose of the paper is to prove the following “fundamental” estimate on the denoised signal (see Sect. 8):

### Theorem 2

If *f* belongs to \(\mathrm{BV}(I)\), then for any \(\lambda > 0\), the denoised signal \(u_\lambda \) satisfies the inequality

Here, \(f'\), as well as the derivative \(u_\lambda '\), is computed in the distributional sense and is, in general, a signed measure. Recall that \((f')^+\) and \((f')^-\) are finite positive measures satisfying \(f'=(f')^+ - (f')^-\), see, e.g. [38, Sect. 6.6]. As an example, the reader may compute the derivatives of *f* and \(u_\lambda =f_\lambda \) shown in Fig. 1d. Theorem 2 immediately implies the “edge-preserving” property of the one-dimensional ROF model: the solution \(u_\lambda \) can only have a “jump” at points where the data *f* has a jump, a qualitative result which holds even in the multidimensional case, as proved by Caselles et al. [16]. In the one-dimensional case, we obtain, in addition, a quantification of these jumps; they have the same sign as and are dominated by the jumps in the data. This assertion is embodied in the estimate (4) which does not carry over to higher dimensions. The proof of Theorem 2 is based on (an extension to bilateral obstacle problems of) the classic Lewy–Stampacchia inequality [28] and uses the taut string interpretation (Theorem 1) in an essential way. The proof of Theorem 2 and its implications are given in Sect. 8.
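A discrete analogue of the estimate (4) is easy to inspect: split the difference sequence of a signal into its positive and negative parts and compare term by term. In the sketch below (ours), the minimizer for \(\lambda =1\) on a simple two-level signal was computed by hand; the assertions check the discrete counterpart of the domination of \((u_\lambda ')^\pm \) by \((f')^\pm \).

```python
def jordan_parts(v):
    """Positive/negative parts of the discrete derivative of a signal:
    d[i] = v[i+1] - v[i], split as d = d_plus - d_minus."""
    d = [v[i + 1] - v[i] for i in range(len(v) - 1)]
    d_plus = [max(x, 0.0) for x in d]
    d_minus = [max(-x, 0.0) for x in d]
    return d_plus, d_minus

f = [1.0, 1.0, -1.0, -1.0]
u = [0.5, 0.5, -0.5, -0.5]   # ROF minimizer for lam = 1 (hand computation)

fp, fm = jordan_parts(f)
up, um = jordan_parts(u)
# discrete analogue of (4): each part of u' is dominated by that of f'
assert all(a <= b for a, b in zip(up, fp))
assert all(a <= b for a, b in zip(um, fm))
```

Here the single downward jump of *f* (of size 2) survives in \(u_\lambda \) with the same sign and a smaller size (1), illustrating the edge-preserving property.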

Our Theorem 2 turns out to be a special case of an estimate proved in Briani et al. [14, Lemmas 3, 4] and is related to a result (Lemmas 2, 3) in Bonforte and Figalli [11, p. 4459]. Both papers study the gradient flows associated with certain one-homogeneous functionals. In [14], these are functionals of the form \(\int _{{\varOmega }} | {\text {div}}\,{\varvec{w}}|\,\mathrm{d}x\) for vector fields \({\varvec{w}}\) defined on \({\varOmega }\), and the paper was not directly concerned with the ROF model. The relevance of these papers was first pointed out to the author after the publication of [33]. Moreover, our method of proof differs from those in [11, 14], and we use Theorem 2 to derive a number of fundamental properties of the one-dimensional ROF model.

Perhaps the most significant consequence of Theorem 2 is that for any in-signal *f* belonging to \(\mathrm{BV}(I)\), we get \(u_\lambda \rightarrow f\) strongly in \(\mathrm{BV}(I)\) as \(\lambda \rightarrow 0+\); in particular, we show that \(\int _I | f' - u_\lambda '|\,\mathrm{d}x \rightarrow 0\) as \(\lambda \rightarrow 0+\), see Proposition 8. The usual Moreau–Yosida approximation result, see, for example, [2, Ch. 17], only contains the weaker assertions that \(u_\lambda \rightarrow f\) in \(L^2(I)\) and \(\int _I |u_\lambda '|\,\mathrm{d}x\rightarrow \int _I|f'|\,\mathrm{d}x\) as \(\lambda \) tends to zero.

The literature on the ROF model is extensive, and many results about the one-dimensional case are scattered throughout research articles and monographs as examples illustrating the more general multidimensional theory that is their real focus. This sometimes makes these results hard to find and, once found, the examples may be hard to follow as they rest on the general theoretical framework already developed up to that point in the text. Newcomers to the field, as well as more application-oriented researchers, would probably welcome an introduction which, on the one hand, gives an overview of the theory of the one-dimensional ROF model and, on the other, introduces them to some of the ideas which are used in the analysis of the general case.

The present paper may be seen as such an overview. Here, the theory is developed from scratch, and known properties of the ROF model are collected in one place and given efficient proofs within a unified framework. The style is expository, and the text is intended to be accessible to anyone who wants to learn about total variation techniques—a little measure theory, basic functional analysis and Sobolev spaces in one dimension are the only prerequisites needed to follow it. In fact, once the total variation of a function has been properly defined, it turns out that the theory of the ROF model in one dimension hinges on little more than the projection theorem (onto closed convex sets) and completing the square. As we shall see, this elementary setting allows us to introduce and highlight, in a concrete way, some of the interesting phenomena which occur in the analysis of more general convex variational problems.

Some of the known results, apart from Theorems 1 and 2, for which new proofs are supplied are: (i) Proposition 4, where some basic properties of the ROF model are re-derived, and (ii) Propositions 5 and 6, where some precise results on the rate of convergence of \(u_\lambda \rightarrow f\), and of the value function \(E_\lambda (u_\lambda )\), as \(\lambda \) tends to zero are collected in one place. (iii) Moreover, a new and slick proof of the fact that \(u_\lambda \) forms a semi-group with respect to \(\lambda \) is given (Proposition 10), and we also derive the infinitesimal generator of this semi-group. (iv) Finally, we indicate how our method can be modified to prove the “lower convex envelope” interpretation of the solution to the *isotonic regression* problem, Sect. 11.

Some entirely new results have also emerged: Proposition 8, which asserts that \(u_\lambda \rightarrow f\) in \(\mathrm{BV}(I)\) as \(\lambda \rightarrow 0+\) whenever \(f\in \mathrm{BV}(I)\), is new. The author also believes that the statement in part 2 of Proposition 9 is new, as is its consequence in Corollary 2. The explicit solution to the ROF model given in Example 5 also seems to appear here for the first time. And, at least in the context of the ROF model, the improved convergence rate proved in Proposition 6 seems to have gone unnoticed until now. Moreover, Proposition 3 gives an improvement to a known “gap” estimate found in [44]. Finally, our treatment of the *fused lasso model* in the continuous setting in Sect. 10 seems to be the first of its kind.

## 2 Our Analysis Toolbox

Throughout this paper, *I* denotes an open, bounded interval (*a*, *b*), where \(a<b\) are real numbers and \({\bar{I}}=[a,b]\) is the corresponding closed interval.

\(C_0^1(I)\) denotes the space of continuously differentiable (test) functions \(\xi :I\rightarrow {\mathbf {R}}\) with compact support in *I*, and \(C({\bar{I}})\) is the space of continuous functions on the closure of *I*.

For \(1\le p \le \infty \), \(L^p(I)\) denotes the Lebesgue space of measurable functions \(f:I\rightarrow {\mathbf {R}}\) with finite *p*-norm; \(\Vert f\Vert _p:= \big (\int _a^b |f(x)|^p\,\mathrm{d}x\big )^{1/p} < \infty \), when *p* is finite, and \(\Vert f\Vert _\infty = {\text {ess sup}}_{x\in I}|f(x)| <\infty \) when \(p=\infty \). The space \(L^2(I)\) is a Hilbert space with the inner product \( \langle f,g\rangle =\langle f,g\rangle _{L^2(I)} := \int _a^b f(x)g(x)\,\mathrm{d}x\) and the corresponding norm \(\Vert f\Vert :=(\langle f,f\rangle _{L^2(I)})^{1/2}=\Vert f\Vert _2\).

\(H^1(I)\) denotes the Sobolev space of functions \(u\in L^2(I)\) whose distributional derivative \(u'\) again belongs to \(L^2(I)\). This is a Hilbert space when equipped with the inner product \(\langle u,v\rangle _{H^1}:= \langle u,v\rangle + \langle u',v'\rangle \) and the corresponding norm \(\Vert u\Vert _{H^1} =( \Vert u'\Vert _2^2 + \Vert u\Vert _2^2 )^{1/2}\). Any \(u\in H^1(I)\) can, after correction on a set of measure zero, be identified with a unique function in \(C({\bar{I}})\). In particular, a unique value *u*(*x*) can be assigned to *u* for every \(x\in {\bar{I}}\).

Finally, let *H* be a (general) real Hilbert space with inner product between \(u,v\in H\) denoted by \(\langle u,v \rangle \) and the corresponding norm \(\Vert u\Vert =\sqrt{\langle u,u \rangle }\). The following result is standard [13, Théorème V.2]:

### Proposition 1

For every \(z\in H\), there exists a unique point \(x_*\in K\) closest to *z*. It is called the projection of *z* onto *K* and is denoted \(x_*={Pr}_K(z)\). The projection is characterized by the variational inequality (5), valid for all \(v\in K\), and \({Pr}_K\) is a non-expansive mapping:

To prove the non-expansiveness, apply (5) to the projection of *y* taking \(v={Pr}_K(x)\), and then apply (5) to the projection of *x* with \(v={Pr}_K(y)\). If the resulting inequalities are added, the above inequality follows. This proves (9).

### Example 1

Suppose the cumulative signal *F* satisfies \(F(a)=F(b)=0\). Then, \(T_\lambda \) is a closed convex subset of \(H_0^1(I)\)—non-empty because \(F\in T_\lambda \). Since \(\Vert u\Vert _{H_0^1(I)} = \Vert u'\Vert _{L^2(I)}\), we immediately see that (3) is equivalent to

### Example 2

It is closely related to the *soft threshold map*, or *shrinkage map*, defined by \({Sr}_{[-\lambda ,\lambda ]}(t) = t - {Pr}_{[-\lambda ,\lambda ]}(t)\). We shall meet both functions again in the sequel.
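In the discrete world, both maps are one-liners; here is a sketch (the helper names `pr` and `sr` are ours):

```python
def pr(t, lam):
    """Projection of t onto the closed interval [-lam, lam]."""
    return min(max(t, -lam), lam)

def sr(t, lam):
    """Soft threshold (shrinkage) map: Sr(t) = t - Pr(t)."""
    return t - pr(t, lam)

# By construction, Pr and Sr decompose the identity: t = Pr(t) + Sr(t).
```

For \(\lambda =1\), the shrinkage map sends 3 to 2 and leaves nothing of any *t* with \(|t|\le 1\).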

## 3 Precise Definition of the ROF Model

*u* is said to be a *function of bounded variation* on *I*, and *J*(*u*) is called the *total variation* of *u* (using the same notation as in [17, 19]). The set of all integrable functions on *I* of bounded variation is denoted \(\mathrm{BV}(I)\), that is, \(\mathrm{BV}(I) = \big \{u\in L^1(I)\, :\, J(u) <\infty \big \}\). This becomes a Banach space when equipped with the norm \(\Vert u\Vert _{BV}:=J(u) + \Vert u\Vert _{L^1}\). Notice that, as already mentioned, if \(u\in H^1(I)\), then \(J(u)=\int _I |u'|\,\mathrm{d}x < \infty \), so \(u\in \mathrm{BV}(I)\).
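For a discrete (sampled) signal, the total variation reduces to the sum of the absolute jumps, which the following one-line sketch (ours) computes:

```python
def total_variation(u):
    """Discrete total variation: J(u) = sum of absolute differences."""
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))
```

Note that for a monotone sequence the total variation is simply the range of the values, mirroring the continuous fact used repeatedly below.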

Let us illustrate how the definition works for a function with a jump discontinuity:

### Example 3

The following lemma shows that the definition of the total variation *J* and the space \(\mathrm{BV}(I)\) can be moved to a Hilbert space setting involving \(L^2\) and \(H_0^1\).

### Lemma 2

### Proof

If \(u\in \mathrm{BV}(I)\), then Sobolev’s lemma for functions of bounded variation, see [2, p. 152], ensures that \(u\in L^\infty (I)\). This in turn implies \(u\in L^2(I)\) because *I* is bounded. The (ordinary) Sobolev’s lemma asserts that \(H_0^1(I)\) is continuously embedded in \(L^\infty (I)\). Since *K* is the inverse image under the embedding map of the unit ball in \(L^\infty (I)\), which is both closed and convex, we draw the conclusion that *K* is closed and convex in \(H_0^1\).

The total variation *J*(*u*) cannot exceed the right-hand side because the set \(\{\, \xi \in C_0^1(I)\,:\, \Vert \xi \Vert _\infty \le 1\,\}\) is contained in *K*. To verify that equality holds, it is enough to prove that the supremum over *K* is bounded by *J*(*u*). To do this, we first notice that the corresponding inequality holds for all \(\zeta \in C_0^1(I)\); this follows by applying homogeneity to the definition of *J*(*u*). Second, if \(\xi \in H_0^1(I)\), we can use that \(C_0^1(I)\) is dense in \(H_0^1(I)\) and find functions \(\zeta _n\in C_0^1(I)\) such that \(\zeta _n\rightarrow \xi \) in \(H_0^1(I)\) (and in \(L^\infty (I)\) by the continuous embedding). It follows that

*I*, and that we may write

To emphasize the dependence on the data *f*, we sometimes write \(E_\lambda (f;u)\) instead of \(E_\lambda (u)\) and denote the corresponding ROF minimizer \(u_\lambda \) by the more elaborate \(u_\lambda (f)\). Well-posedness of the ROF model is demonstrated in Sect. 5 after some simple observations about the symmetry properties of \(E_\lambda (f;\cdot )\) have been presented in the next section.

## 4 Simple Symmetries of the ROF Functional

The total variation *J* satisfies \(J(u+c)=J(u)\) for all \(c\in {\mathbf {R}}\) and \(J(cu) = cJ(u)\) whenever \(c>0\). This implies the following formulas for the ROF functional:

For a given signal *f*, let \(f_I=|I|^{-1}\int _I f\,\mathrm{d}x\) denote the mean value of *f* on the interval *I*. The formula (15) with \(c=-f_I\) becomes

(For any *f*, we have \((u_\lambda )_I = f_I\), which is seen by taking mean values in (17).) Consequently, if \(f_I=0\), it is enough to minimize \(E_\lambda \) over functions *u* with zero mean.

## 5 Existence Theory for the ROF Model

The following theorem contains the key result for developing the properties of the ROF model. It proves existence and uniqueness of the ROF minimizer \(u_\lambda \) for a general signal *f* in \(L^2\) and gives a necessary and sufficient characterization of this minimizer in terms of itself and a dual variable (\(\xi _\lambda \) in the theorem). Throughout the analysis, we assume that *f* has mean value zero in \(I=(a,b)\). This assumption, which is strictly speaking not needed in order for the result to be true, implies that the cumulative signal *F*(*x*) satisfies \(F(a)=F(b)=0\), and hence, \(F\in H_0^1(I)\), which will simplify the exposition.

### Theorem 3

This result is an instance of the Fenchel–Rockafellar theorem, see, for example, Brezis [13, p. 11]. It is tailored with our specific needs in mind and will be proved with our bare hands using the projection theorem. (The general version was used by Hintermüller and Kunisch [26] in their analysis of the multidimensional ROF model with the “Manhattan metric”.) In one of the first theoretical analyses of the ROF model, Chambolle and Lions [19] proved the existence of a minimizer (for a more general case) using the standard argument where a minimizing sequence is shown to converge weakly to a function which can be shown to be the desired solution. The equality (18) has played an important role in the development of numerical algorithms for total variation minimization, either directly, as in Zhu et al. [44], or, more indirectly, as in Chambolle [17].

Recall that if *M* and *N* are arbitrary non-empty sets and \({\varPhi }:M\times N\rightarrow {\mathbf {R}}\) is any real-valued function, then the inequality

### Proof

Minimization over *u* yields:

The equivalence of the two denoising models can now be established:

### Proof of Theorem 1

*W* satisfies \(F(x)-\lambda \le W(x)\le F(x)+\lambda \) on *I*. Therefore, (23) is equivalent to

Our proof of Theorem 1 is essentially a change of variables and, as such, becomes almost a “derivation” of the taut string interpretation. We also get the existence and uniqueness of solutions to both models in one stroke. The proof given in [24] first shows that \(u_\lambda \) and \(W_\lambda '\) satisfy the same set of three necessary conditions, and that these conditions admit at most one solution. Then, it proceeds to drive home the point by establishing existence separately for both models. The argument assumes \(f\in L^\infty \) and involves a fair amount of measure-theoretic considerations. The proof of equivalence given in [39] is based on a thorough functional analytic study of Meyer’s *G*-norm and is not elementary.

The last two proofs contain the following useful observations:

### Corollary 1

*K*,

Denoising according to the ROF model is a mapping \(f\mapsto u_\lambda (f)\) which is contractive and continuous with respect to \(\lambda >0\):

### Proposition 2

The first assertion of the proposition is a well-known property of the Moreau–Yosida approximation (or of the proximal map), see [6, Theorem 17.2.1]. Both assertions are easy consequences of the corollary.

### Proof

Let *F* and \({\bar{F}}\) be the cumulative signals corresponding to *f* and \({\bar{f}}\), and let \(W_\lambda \) and \({\bar{W}}_\lambda \) be the associated taut strings. The non-expansiveness of the shrinkage map (8) yields

It is an interesting observation that Theorem 3 associates a *unique* test function (or *dual variable*), \(\xi _\lambda \in K\), with the solution \(u_\lambda \) of the ROF model, namely the one which satisfies \(J(u_\lambda )=\langle u_\lambda , \xi _\lambda '\rangle _{L^2}\). This is noteworthy because, as demonstrated in Example 3, there are functions *u* for which the supremum in the definition of \(J(u)\) is not attained. An explicit example of a ROF minimizer looks as follows:

### Example 4

This example was considered in Strong and Chan [42], one of the very first papers treating explicit solutions to the ROF model. At the time, they had access neither to the duality formulation of the ROF model nor to its taut string interpretation, and their solution consequently runs over several pages.

In this example, the minimizer is proportional to *f*, namely \(u_\lambda =(1-\lambda )f\) for \(0<\lambda <1\). Therefore, *f* is an eigenfunction of the nonlinear operator \(u\mapsto (\mathrm {d}/\mathrm{d}x)(u'/|u'|)\). Eigenfunctions of this sort have been extensively studied in the two-dimensional setting in Bellettini et al. [9]. There it was shown that if *f* is any such eigenfunction, then \(u_\lambda =(1-\lambda )_+f\) minimizes the two-dimensional ROF functional with regularization weight \(\lambda \) (see also [1, 8]).

## 6 Certifying the Quality of Approximate Solutions

How can one certify that an approximate solution *u* is close to \(u_\lambda \) and \(\xi \) close to \(\xi _\lambda \) in their respective function spaces?

### Proposition 3

If the approximate solution *u* and the dual variable \(\xi \) satisfy the relation \(u = f - \lambda \xi '\), as is the case in many numerical algorithms for TV-minimization, then we get the following estimate as a corollary of the above proposition:

### Proof

## 7 Consequences of Theorem 3 and the Taut String Interpretation

We now prove some known, and some new, properties of the ROF model.

The taut string algorithm suggests that \(W_\lambda =0\), and therefore, \(u_\lambda =0\), when \(\lambda \) is sufficiently large, and that \(W_\lambda \) must touch the sides \(F\pm \lambda \) of the tube \(T_\lambda \) when \(\lambda \) is small. These assertions can be made precise:

### Proposition 4

- (a)
The denoised signal \(u_\lambda = 0\) if and only if \(\lambda \ge \Vert F\Vert _\infty \), and

- (b)
if \(0< \lambda < \Vert F\Vert _\infty \) then \(\Vert F - W_\lambda \Vert _\infty = \lambda \).

- (c)
\(\Vert W_\lambda \Vert _\infty = \max (0, \Vert F\Vert _\infty - \lambda )\).

The results (a) and (b) are well known, and proofs, valid in the multidimensional case, can be found in Meyer’s treatise [30]. The natural estimate in (c) seems to be stated here for the first time. Notice that the maximum norm \(\Vert F\Vert _\infty \) of the cumulative signal *F* coincides, in one dimension, with Meyer’s *G*-norm \(\Vert f\Vert _{*}\) of the signal *f*. Theorem 3 and the taut string interpretation of the ROF model allow us to give very short and direct proofs of all three properties.
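In the discrete setting, \(\Vert F\Vert _\infty \) is simply the maximum absolute partial sum of the samples, so the extinction threshold of part (a) is immediate to compute. A sketch (ours; the helper name `g_norm` is our choice):

```python
from itertools import accumulate

def g_norm(f):
    """Discrete analogue of Meyer's G-norm: max norm of the cumulative
    signal F (the signal f is assumed to have zero mean)."""
    return max(abs(F) for F in accumulate(f))

f = [1.0, 1.0, -1.0, -1.0]
lam_star = g_norm(f)   # by part (a), u_lam = 0 exactly when lam >= lam_star
```

For this signal the partial sums are \(1, 2, 1, 0\), so denoising returns the zero signal precisely when \(\lambda \ge 2\).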

### Proof

(a) By Theorem 1, the denoised signal \(u_\lambda \) is zero if and only if the taut string \(W_\lambda \) is zero. We know that \(W_\lambda =F-\lambda \xi _\lambda \) where, as seen from (22), \(\xi _\lambda \) is the projection in \(H_0^1(I)\) of \(\lambda ^{-1}F\) onto the closed convex set *K*. Therefore, \(u_\lambda = 0\) if and only if \(\lambda ^{-1}F\in K\), that is, if and only if \(\Vert F\Vert _\infty \le \lambda \), as claimed.

(b) If \(0< \lambda < \Vert F\Vert _\infty \), then \(u_\lambda \ne 0\) hence \(\Vert \xi _\lambda \Vert _\infty = 1\), by Theorem 3. The assertion now follows by taking norms in the identity \(\lambda \xi _\lambda = F-W_\lambda \).

(c) The equality clearly holds when \(\lambda \ge \Vert F\Vert _\infty \) because \(W_\lambda =0\) by (a). When \(c:=\Vert F\Vert _\infty -\lambda >0\), we use a truncation argument: If *W* belongs to \(T_\lambda \), then so does \({\hat{W}}:=\min (c,W)\), in particular \(c>0\) ensures that \({\hat{W}}(a)={\hat{W}}(b)=0\). Since \(E({\hat{W}})\le E(W)\), and \(W_\lambda \) is the (unique) minimizer of *E* over \(T_\lambda \), we conclude that \(\max _I W_\lambda \le c\). A similar argument gives \(-\min _I W_\lambda \le c\). Thus, \(\Vert W_\lambda \Vert _\infty \le \max (0, \Vert F\Vert _\infty - \lambda )\). The reverse inequality follows from (b). \(\square \)

As an application of the above result and of Theorem 3, we consider the following exactly solvable example, which confirms the result of several numerical simulations and which would most likely be out of reach with the methods developed in [42]:

### Example 5

On the interval \(I=(0,2\pi ]\), and for a positive integer *n*, define \(f(x)=\cos (nx)\). If, for simplicity, we change the setting a little and impose periodic boundary conditions on the admissible functions *u* in \(E_\lambda (u)\), then we find that

*If* \(\lambda \ge 1/n\), *then* \(u_\lambda =0\), *and for* \(0<\lambda <1/n\), *there exists a number* \(\alpha = \alpha (\lambda )\) *such that* \(0<\alpha <1\) *and the denoised signal is given by truncation,*

*Here,* \({Pr}_{[-\alpha ,\alpha ]}: {\mathbf {R}}\rightarrow {\mathbf {R}}\) *denotes projection onto the closed (convex) interval* \([-\alpha ,\alpha ]\). *The case with* \(n=3\) *is illustrated in Fig.* 2.

Let us first clarify what we mean by periodic boundary conditions. It is the same as defining and minimizing the ROF functional \(E_\lambda \) for functions defined on the unit circle *T* rather than over an interval *I*. The theory developed earlier, in particular Theorem 3, still holds if we replace \(H_0^1(I)\) by \(H^1(T)\) and define the closed convex set *K* accordingly.

*I* is zero. It remains to be verified that \(\Vert \xi _*\Vert _\infty \le 1\) and that the second of the two conditions holds. We consider the latter first. It is easy to see that

where the number *a* in the limits of the integral is the smallest positive number such that \(\alpha = \cos (na)\). We note that \(0<a<\pi /2n\), see Fig. 2. Evaluation of this integral leads to the following condition on the number *a*, and therefore on \(\alpha \):

*I* and the verification is complete.
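One ingredient of this example can be spot-checked numerically: truncating \(\cos (nx)\) at any level \(\pm \alpha \) preserves the zero mean over a full period, by the symmetry \(x\mapsto x+\pi /n\), so the truncated signal is admissible. A quick sketch (ours; \(\alpha =0.6\) is an arbitrary illustration value, not the \(\alpha (\lambda )\) determined by the condition above):

```python
import math

# clipped cosine, the form of the denoised signal claimed in Example 5
n, alpha = 3, 0.6
N = 600   # grid size divisible by 2n, so the symmetry is exact on the grid
xs = [2 * math.pi * k / N for k in range(N)]
u = [min(max(math.cos(n * x), -alpha), alpha) for x in xs]

# admissibility check: truncation keeps the mean zero over the period
mean = sum(u) / N
```

Shifting the grid by half a period of \(\cos (nx)\) flips every sample's sign, so the samples cancel pairwise and the mean vanishes up to rounding.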

Next, we study the *value function*

### Proposition 5

### Proof

If \(\lambda _2\ge \lambda _1 > 0\), then the inequality \(E_{\lambda _2}(u) \ge E_{\lambda _1}(u)\) holds trivially for all *u*. Taking infimum over the functions in \(\mathrm{BV}(I)\) yields \(e(\lambda _2) \ge e(\lambda _1)\), so *e* is non-decreasing.

For each fixed *u*, the right-hand side of the inequality

For \(\lambda \ge \Vert F\Vert _\infty \), we know from the previous theorem that \(u_\lambda = 0\), so \(e(\lambda )=E_\lambda (0)=\Vert f\Vert ^2/2\).

To prove the assertion about \(e(\lambda )\) as \(\lambda \) tends to zero from the right, we first assume that \(f\in \mathrm{BV}(I)\), in which case it follows that \(0 < e(\lambda ) \le E_\lambda (f) =\lambda J(f)\), so \(e(\lambda )=O(\lambda )\) because \(J(f)<\infty \).

If we merely have \(f\in L^2(I)\), an approximation argument is needed: For any \(\epsilon > 0\), take a function \(f_\epsilon \in H_0^1(I)\) such that \(\Vert f-f_\epsilon \Vert ^2/2 < \epsilon \). Then \(f_\epsilon \in \mathrm{BV}(I)\) and \(0\le e(\lambda ) \le E_\lambda (f_\epsilon ) < \lambda J(f_\epsilon ) + \epsilon .\) It follows that \(0 \le {{{{\mathrm{lim}}~{\mathrm{sup}}}}}_{\lambda \rightarrow 0+} e(\lambda ) < \epsilon \). Since \(\epsilon \) is arbitrary, we get \(\lim _{\lambda \rightarrow 0+}e(\lambda )=0\). \(\square \)

The map \(f\mapsto u_\lambda \) is in fact the resolvent (proximal mapping) of the total variation functional *J*, see [6, Sect. 17.2.1], and the following proposition is therefore a special case of a much more general result from the theory of Moreau–Yosida approximations. Notice, however, that the second part of our proposition contains a refined quantification of the rate of convergence of \(u_\lambda \) to *f* as \(\lambda \rightarrow 0\), in that the common \(O(\lambda )\) is replaced by \(o(\lambda )\). The latter is *not* easily located in the literature.

### Proposition 6

For any \(f\in L^2(I)\), we have \(u_\lambda \rightarrow f\) in \(L^2\) as \(\lambda \rightarrow 0+\). Moreover, if \(f\in \mathrm{BV}(I)\) then \(J(u_\lambda )\rightarrow J(f)\) and \(\Vert u_\lambda -f\Vert _{L^2(I)}=o(\lambda ^{1/2})\) as \(\lambda \rightarrow 0+\).

### Proof

Here the lower semi-continuity of *J*, cf. [2, Prop. 3.6], was used. Since \(J(u_\lambda )\le J(f)\), we also obtain an estimate from above: \({{{{\mathrm{lim}}~{\mathrm{sup}}}}}_{\lambda \rightarrow 0+} J(u_\lambda ) \le J(f)\). We conclude that \(\lim _{\lambda \rightarrow 0+}J(u_\lambda ) = J(f)\). If this is used in (31), we find that \(\Vert u_\lambda -f\Vert ^2_{L^2(I)} = o(\lambda )\) as \(\lambda \rightarrow 0+\). \(\square \)

The next example shows that the convergence rate stated in the proposition is optimal in the sense that the exponent cannot be lowered.

### Example 6

On the interval \(I=(0,1)\) and with \(\alpha > 1\), let the data be given by \(f(x)=\alpha x^{\alpha -1}\) for \(x\in I\). Since *f* is monotone, \(J(f)= \lim _{\epsilon \rightarrow 0+}(f(1-\epsilon )-f(\epsilon ))=\alpha < \infty \), and hence, \(f\in \mathrm{BV}(I)\). Applying the taut string interpretation to the cumulative function \(F(x)=x^\alpha \) shows that the denoised signal \(u_\lambda \) is constant near the interval end points and coincides with *f* in between. In fact, near the left interval end point we have \(u_\lambda (x)= \alpha (\lambda /(\alpha -1))^{(\alpha -1)/\alpha }\) for \(0\le x\le (\lambda /(\alpha -1))^{1/\alpha }\). It follows, by an easy computation, that for each choice of \( \alpha \) there exists a positive constant \(C=C(\alpha )\) such that \(\Vert f-u_\lambda \Vert _{L^2(I)}^2 \ge C\lambda ^{2-1/\alpha }\). This bounds the rate of convergence from below.
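The "easy computation" can be checked numerically for \(\alpha =2\), using the explicit formulas for \(u_\lambda \) near the left end point quoted above: there the squared \(L^2\) error contains the contribution \(\tfrac{4}{3}\lambda ^{3/2}\), of exact order \(\lambda ^{2-1/\alpha }\). A sketch (ours; the grid size is an arbitrary choice):

```python
# numerical check of Example 6 for alpha = 2: near x = 0 the text gives
# u_lam = alpha*(lam/(alpha-1))**((alpha-1)/alpha) on the interval
# [0, (lam/(alpha-1))**(1/alpha)], and f(x) = alpha*x**(alpha-1) = 2x
alpha, lam = 2.0, 0.01
c = alpha * (lam / (alpha - 1)) ** ((alpha - 1) / alpha)  # constant value
x_end = (lam / (alpha - 1)) ** (1 / alpha)                # where u meets f

# midpoint-rule integral of (f - u_lam)^2 over [0, x_end]
m = 10000
h = x_end / m
err2 = sum((alpha * ((k + 0.5) * h) - c) ** 2 for k in range(m)) * h
```

For \(\lambda =0.01\) this gives \(\Vert f-u_\lambda \Vert ^2\approx \tfrac{4}{3}\lambda ^{3/2}\) on the flat piece alone, confirming the order \(\lambda ^{2-1/\alpha }=\lambda ^{3/2}\).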

The following result, mentioned by Burger and Osher [15, Sect. 5], shows that if the data *f* belong to the space of functions of bounded variation and satisfy an additional regularity condition, then the convergence rates for the limits \(u_\lambda \rightarrow f\) and \(J(u_\lambda )\rightarrow J(f)\) as \(\lambda \rightarrow 0+\) improve (considerably) to \(O(\lambda )\).

### Proposition 7

The additional requirement on the data—the supremum in \(J(f)=\sup _{\xi \in K}\langle \xi ',f\rangle \) being attained for some \(\xi _0\)—is an instance of the so-called *source condition*. The source condition for non-quadratic convex variational regularization of inverse problems was identified in [15] and used to derive convergence rates for the generalized Bregman distances. The authors point out that the above result, which they write in a slightly different way, may be proved along the same lines as their other estimates. Here, we provide the details:

### Proof

*f* gives yet another inequality:

## 8 Proof and Applications of Theorem 2

We begin with the proof of the fundamental estimate on the derivative of the denoised signal:

### Proof of Theorem 2

The estimate (4) is a consequence of an extension of the original Lewy–Stampacchia inequality [28] to bilateral obstacle problems. The bilateral obstacle problem, in the one-dimensional setting, is to minimize the energy \(E(u):=\frac{1}{2}\int _a^b u'(x)^2\,\mathrm{d}x\) in (3) over the closed convex set \(C=\{ u\in H_0^1(I): \phi (x) \le u(x)\le \psi (x) \text { a.e. }I\}\). The obstacles are given by the functions \(\phi ,\psi \in H^1(I)\) which satisfy the conditions \(\phi <\psi \) on *I*, and \(\phi< 0 <\psi \) on \(\partial I =\{ a,b\}\). The latter ensures that *C* is non-empty.

Having established Theorem 2, we are able to prove the following result about the strong convergence in \(\mathrm{BV}(I)\) of the ROF minimizer as the regularization parameter approaches zero.

### Proposition 8

### Proof

Theorem 2 also implies the first part of

### Proposition 9

Suppose *f* is a piecewise constant function on *I*. 1) The ROF minimizer \(u_\lambda \) is again piecewise constant for all \(\lambda >0\). 2) There exists a number \({\bar{\lambda }}>0\) and a piecewise linear function \({\bar{\xi }}\in K\) such that \(\xi _\lambda ={\bar{\xi }}\) for all \(\lambda \), \(0<\lambda \le {\bar{\lambda }}\).

### Proof

If *f* is piecewise constant, then there exist nodes \(a=x_0<x_1<\cdots<x_{N-1}<x_N=b\) which partition the interval \(I=(a,b]\) into *N* subintervals \(I_i=(x_{i-1},x_i]\) such that *f* equals the constant value \(f_i\in {\mathbf {R}}\) on \(I_i\) for \(i=1,\ldots ,N\). The derivative of the signal becomes

where \(\delta _x\) denotes the unit point mass at *x*. We may assume that \(f_{i+1}\ne f_i\) and therefore \(b_i\ne 0\) for all \(i=1,\ldots ,N-1\). From (36), it follows that \(J(f)=\sum _{i=1}^{N-1} |b_i|<\infty \), so *f* belongs to \(\mathrm{BV}(I)\) and Theorem 2 is applicable:

*f*. (The latter is the “edge-preserving” property of the one-dimensional ROF model.)

*f*, meet either the upper obstacle \(F+\lambda \) or the lower obstacle \(F-\lambda \). The idea of the proof is to guess \({\bar{\xi }}\) and set

*I* and satisfies

*f*’s node is located.) It is now clear that we may find a positive number \({\bar{\lambda }}\) such that if \(0\le \lambda <{\bar{\lambda }}\), then each of the numbers \(b_i-\lambda d_i\), used to represent \({\bar{u}}'_\lambda \), has the same sign as \(b_i\). (In fact, \({\bar{\lambda }}=(\max _{1\le i\le N-1}(d_i/b_i))^{-1}\) works.) For any \(\lambda \) smaller than \({\bar{\lambda }}\), we have

The proposition implies the strongest approximation rate imaginable:

### Corollary 2

If *f* is a piecewise constant function, then \(\Vert f -u_\lambda \Vert _{L^2(I)} = O(\lambda )\) and \(J(f)-J(u_\lambda )=O(\lambda )\) as \(\lambda \rightarrow 0+\).

### Proof
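The rate in Corollary 2 is also easy to observe numerically in a discrete analogue of the model. The following sketch (function name, step size and iteration count are ours, not the paper's) minimizes the discrete ROF energy \(\tfrac{1}{2}\sum _i (u_i-f_i)^2+\lambda \sum _i|u_{i+1}-u_i|\) by projected gradient on the dual problem, keeping the dual variable in the box \([-\lambda ,\lambda ]\), the discrete counterpart of the set *K*:

```python
def rof_denoise(f, lam, iters=5000, step=0.25):
    """Discrete 1-D ROF denoising: min 0.5*||u - f||^2 + lam * sum|u[i+1] - u[i]|.

    Solved by projected gradient on the dual: xi (one value per interior
    node) is kept in the box [-lam, lam], and u is recovered as
    u = f - D^T xi, where (D^T xi)[i] = xi[i-1] - xi[i] (xi = 0 outside).
    """
    n = len(f)
    xi = [0.0] * (n - 1)
    u = list(f)
    for _ in range(iters):
        # gradient ascent step on xi, then projection onto the box [-lam, lam]
        xi = [max(-lam, min(lam, xi[i] + step * (u[i + 1] - u[i])))
              for i in range(n - 1)]
        # primal update u = f - D^T xi
        u = [f[i]
             - (xi[i - 1] if i > 0 else 0.0)
             + (xi[i] if i < n - 1 else 0.0)
             for i in range(n)]
    return u
```

On the two-level signal \((1,1,-1,-1)\), the minimizer is \((a,a,-a,-a)\) with \(a=1-\lambda /2\), so \(\Vert f-u_\lambda \Vert \) scales exactly linearly in \(\lambda \), matching the \(O(\lambda )\) rate.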

## 9 ROF Denoising as a Semi-Group

*f* in a natural manner. Therefore, for each \(\lambda \ge 0\), define a mapping \(S_\lambda : L^2(I) \rightarrow L^2(I)\) by setting, for any \(f\in L^2(I)\),

- 1.
\(S_0= {\text {Id}}\), the identity mapping on \(L^2(I)\).

- 2.
For any \(f\in L^2(I)\), \([0,+\infty )\ni \lambda \mapsto S_\lambda (f)\in L^2(I)\) is continuous.

- 3.
For any \(\lambda \ge 0\), \(S_\lambda \) is non-expansive:$$\begin{aligned} \Vert S_\lambda (f_2)-S_\lambda (f_1)\Vert _{L^2(I)} \le \Vert f_2-f_1\Vert _{L^2(I)} \end{aligned}$$for all \(f_1,f_2\in L^2(I)\).
- 4.
\(S_\lambda \circ S_\nu = S_{\lambda +\nu }\) for all \(\lambda ,\nu \ge 0\).
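On a simple signal, the semi-group law in item 4 can be checked by hand. For the discrete two-level signal \((c,c,-c,-c)\) with \(c\ge 0\), minimizing \(\tfrac{1}{2}\Vert u-f\Vert ^2+\lambda \sum _i|u_{i+1}-u_i|\) over two-level candidates \((a,a,-a,-a)\) reduces to minimizing \(2(a-c)^2+2\lambda a\) over \(a\ge 0\): the amplitude is shrunk by \(\lambda /2\), never past zero. A tiny sketch (the helper name is ours):

```python
def rof_two_level(c, lam):
    """ROF minimizer amplitude for the discrete signal (c, c, -c, -c), c >= 0.

    Minimizing 2*(a - c)**2 + 2*lam*a over a >= 0 gives a = max(c - lam/2, 0):
    the data are shrunk toward their mean (zero), never past it.
    """
    return max(c - lam / 2.0, 0.0)

# semi-group law S_lam(S_nu(f)) = S_{lam+nu}(f), here a one-line identity:
c = 1.0
assert abs(rof_two_level(rof_two_level(c, 0.3), 0.5) - rof_two_level(c, 0.8)) < 1e-12
# it also holds when the smaller parameter already flattens the signal:
assert rof_two_level(rof_two_level(c, 1.9), 0.4) == rof_two_level(c, 2.3) == 0.0
```

Since each application subtracts \(\lambda /2\) from the amplitude (clamping at zero), composing \(S_\nu \) and \(S_\lambda \) visibly agrees with \(S_{\lambda +\nu }\) on this signal.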

### Proposition 10

A proof of the semi-group property can be found in [39]. However, the fundamental estimate in Theorem 2 and the characterization of the ROF minimizer in Theorem 3 allow us to present a short and very direct proof of this result.

### Proof

*K* such that

*f*.

*K*. Using the above characterizations of \(u_\lambda \) and \({\bar{u}}\), we find that

The last part of the proof yields

### Corollary 3

If \(\lambda > 0 \) then \(J(u_{\lambda }) = \langle u_{\lambda } , \xi _\nu ' \rangle _{L^2(I)}\) for all \(\nu \), \(0<\nu \le \lambda \).

That is, the total variation of \(u_\lambda \) can be computed by taking inner product with any of the previous \(\xi _\nu \)’s.

We now know that ROF denoising defines a family of nonlinear operators \(\{S_\lambda \}_{\lambda \ge 0}\) which forms a contractive semi-group under composition. It is natural to seek the infinitesimal generator of this semi-group.

*f* alone. A first step in this characterization is based on Lemma 3, where the notion of the subgradient of a convex function is used.

Let *H* be a real Hilbert space and \({\varPhi }:H\rightarrow (-\infty ,\infty ]\) a lower semi-continuous convex functional defined on

*H* such that \({\text {Dom}}{\varPhi } := \{x\in H\, : \, {\varPhi }(x)<\infty \}\) is non-empty. Let \(x_0\in {\text {Dom}}{\varPhi }\) and suppose there is a vector \(y\in H\) such that the following inequality holds

Then *y* is called a

*subgradient* of \({\varPhi }\) at \(x_0\). The set of all such subgradients is called the

*subdifferential* of \({\varPhi }\) at \(x_0\) and is denoted \(\partial {\varPhi }(x_0)\). The map \(x\mapsto \partial {\varPhi }(x)\) is a set-valued operator. It is possible that \(\partial {\varPhi }(x_0)=\emptyset \). The map \(x\mapsto \partial {\varPhi }(x)\) is

*monotone* in the sense that if \(x,x_0\) are points satisfying \(\partial {\varPhi }(x)\ne \emptyset \) and \(\partial {\varPhi }(x_0)\ne \emptyset \), then for any \(\xi \in \partial {\varPhi }(x)\) and \(\xi _0\in \partial {\varPhi }(x_0)\) we have \(\langle \xi -\xi _0, x-x_0\rangle \ge 0\). This follows immediately from the definition of the subgradient.
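Writing the argument out: adding the subgradient inequality at \(x\) (tested against \(x_0\)) to the one at \(x_0\) (tested against \(x\)) makes the function values cancel:

```latex
% Subgradient inequalities at x and at x_0:
\varPhi(x_0) \ge \varPhi(x) + \langle \xi, x_0 - x\rangle,
\qquad
\varPhi(x) \ge \varPhi(x_0) + \langle \xi_0, x - x_0\rangle.
% Adding them, the function values cancel:
0 \ge \langle \xi, x_0 - x\rangle + \langle \xi_0, x - x_0\rangle
  = -\langle \xi - \xi_0,\, x - x_0\rangle,
% which is precisely the monotonicity of the subdifferential:
\langle \xi - \xi_0,\, x - x_0\rangle \ge 0.
```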

If we take \(H=L^2(I)\) and let \({\varPhi }(u)=J(u)\) be the total variation of *u*, then \({\text {Dom}}J=\mathrm{BV}(I)\) and we can characterize the subdifferential \(\partial J\) in the following manner:

### Lemma 3

Let \(u_0\in {\text {Dom}}J\) then \(\eta \in \partial J(u_0)\) if and only if there exists \(\xi _0\in K\) such that \(\eta =\xi _0'\) and \(J(u_0)=\langle u_0,\xi _0'\rangle \).

The total variation *J* is considered as a function on \(L^2(I)\) so \(\eta \in L^2(I)\). Example 3 in Sect. 3 shows that there are cases where \(u_0\in {\text {Dom}}J\) but \(\partial J(u_0)=\emptyset \). The lemma is the one-dimensional instance of a more general multidimensional result, see Alter et al. [1, Lemma 1, p. 335] as well as Bellettini et al. [8]. For completeness of exposition, we provide a proof:

### Proof

*J* at \(u_0\), we get

*K*. For any \(v\in \mathrm{BV}(I)\), we set \(u=u_0+v\). An application of the triangle inequality for

*J* and the inequality (41) implies

*BV* with \(J(v) = \Vert v'\Vert _{L^1}\) and may therefore be used in the above estimate;

*J* then this inequality becomes

*J*, implies \(J(u_0) = \langle u_0,\xi _0'\rangle \), and the proof is complete. \(\square \)

In view of Lemma 3, the necessary and sufficient conditions (19a) and (18) in Theorem 3 can be reformulated as \(\lambda ^{-1}(f-u_\lambda )\in \partial J(u_\lambda )\), i.e. Fermat’s rule for a minimum (for convex functions). We note in passing that the equation \(u+\partial J(u)\ni f\) has a unique solution (namely \(u_\lambda \)) for each \(f\in L^2(I)\). Hence, \(\partial J\) is a maximal monotone (or \(-\partial J\) is a maximal dissipative) set-valued operator on \(L^2\), cf. Barbu [7, p. 71].

*K* associated with the ROF minimizer \(u_\lambda =S_\lambda (f)\). Clearly, \(\xi _0\in K\). Since the identity

*f* in a dense subset of \(L^2(I)\). To see this, we use that the limit \(\lim _{\lambda \rightarrow 0+}\xi _\lambda := \xi _0\) exists whenever

*f* is a piecewise constant function (by Proposition 9) and that the piecewise constant functions on

*I* are dense in \(L^2(I)\). We have proved (cf. [7, Th. 1.2, p. 175]):

### Theorem 4

The infinitesimal generator of the nonlinear contractive semi-group \(\{ S_\lambda \}_{\lambda \ge 0}\) is the nonlinear set-valued mapping \(f\mapsto -\partial J(f)\).

In view of the characterization of *J*’s subgradient given in Lemma 3, it follows from Example 3 that there exist \(f\in L^2(I)\) such that \(\partial J(f)\) consists of more than one element. If *f* is such a function and the derivative \(-(\mathrm{d}/\mathrm{d}\lambda )S_\lambda (f)|_{\lambda =0}\) exists, then it is reasonable to ask which element of \(\partial J(f)\) this derivative corresponds to. The answer is provided by the following result.

### Proposition 11

Suppose \(f\in L^2(I)\) is such that the derivative \( (\mathrm{d}/\mathrm{d}\lambda )S_\lambda (f)\big |_{\lambda =0} := -\xi _0'\) exists. Then \(\xi _0'\) is the element in \(\partial J(f)\) with the smallest \(L^2\)-norm.

### Proof

*f* implies that the limit \(\xi _\lambda \rightarrow \xi _0\) exists in \(H_0^1(I)\) as \(\lambda \rightarrow 0+\). For each \(\lambda >0\), \(\xi _\lambda \) is characterized as the unique member of

*K* which solves

A very detailed and well-written analysis of the one-dimensional total variation flow can be found in the paper by Bonforte and Figalli [11]. Here, the theory of the flow is developed as a limit of a time-discretized problem (the Crandall–Liggett approach), which leads them to study certain properties of the ROF functional, some of which are close to ours (e.g. Lemma 2.3 in [11] seems to contain the same insight as our Theorem 2).

## 10 Application to Fused Lasso

*u* to the ROF functional (14):

*f* is a piecewise constant function on \(I=(a,b]\) with equidistant nodes \(a=x_0<x_1<\cdots<x_{N-1}<x_N=b\) and with the constant value \(f_i\) on each subinterval \((x_{i-1},x_i]\) for \(i=1,\ldots ,N\), i.e.

*u* which are piecewise constant with the same nodes as

*f* and the constant value \(u_i\) on the

*i*th subinterval; \(u=\sum _{i=1}^N u_i\chi _{(x_{i-1},x_i]}\). Substitution of such

*f* and

*u* into (42) leads to minimizing the following discretized functional,

*fused lasso model* (in Lagrange form), was introduced in Tibshirani et al. [43]. The functional is strictly convex and has compact sub-level sets and therefore possesses a unique minimizer \(u^*=\sum _{i=1}^N u^*_i\chi _{(x_{i-1},x_i]}\) which is called the

*fused lasso signal approximator*. The idea of the fused lasso model is to simultaneously promote sparsity in \(u^*\) and in its (discrete) derivative. As shown by Friedman et al. [22], there is a close relationship between the fused lasso model and the discrete ROF model (i.e. \(E_{\lambda ,\mu }\) with \(\mu =0\)): the fused lasso signal approximator \(u^*\) can be obtained from the discrete ROF minimizer \(u_\lambda = \sum _{i=1}^N u_{\lambda ,i}\chi _{(x_{i-1},x_i]}\) by soft thresholding at level \(\mu \),
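Soft thresholding itself is a one-line operation; a minimal sketch (the function name is ours):

```python
import math

def soft_threshold(x, mu):
    """Componentwise soft thresholding at level mu: shrink each value
    toward zero by mu, mapping the whole interval [-mu, mu] to 0."""
    return [math.copysign(max(abs(v) - mu, 0.0), v) for v in x]

# Small values are set exactly to zero (sparsity); large ones are shrunk by mu:
u_lambda = [1.5, -0.3, 0.7, 0.0]
fused = soft_threshold(u_lambda, 0.5)  # first entry becomes 1.0, second 0.0
```

Applied to the discrete ROF minimizer \(u_\lambda \), this yields the fused lasso signal approximator, as stated above.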

### Theorem 5

### Proof

*I* is bounded.) This formula holds because \(L^\infty \) is the dual of \(L^1\), cf. [38, Thm. 6.16]. It follows that (42) may be written as

*B* are both closed in \(L^2(I)\). The same is true for the dilated sets \(\lambda K'\) and \(\mu B\) and for their Minkowski sum

## 11 Application to Isotonic Regression

*f* in Fig. 3.

*J* of the ROF functional by a regularization term \(J_\uparrow \) that can distinguish between functions that are non-decreasing and those that are not. To achieve this, we set

Again, we may assume that *f* has mean value zero so that the cumulative function \(F(x):=\int _a^xf(t)\,\mathrm{d}t\) belongs to \(H_0^1(I)\). This will be used below.

Notice that if the two conditions of (57) are combined, the solution to the isotonic regression problem (54) can be characterized by the conditions \(u_\uparrow \in L^2_\uparrow (I)\) and \(f-u_\uparrow \in K_+':=\{ \xi '\,:\, \xi \in K_+\}\) and \(\langle u_\uparrow \, ,\, f-u_\uparrow \rangle =0\). Thus, \(K_+'\) is the dual cone of \(L^2_\uparrow (I)\) and the pair \(u_\uparrow , f-u_\uparrow \) is the Moreau decomposition of *f*.
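In the discrete setting, this projection onto non-decreasing sequences is computed by the classical pool-adjacent-violators algorithm; a minimal sketch (the function name is ours):

```python
def isotonic_regression(f):
    """L2 projection of the sequence f onto non-decreasing sequences
    (pool-adjacent-violators): repeatedly merge a block whose mean exceeds
    the mean of the block after it, replacing both by their joint mean."""
    blocks = []  # each block is [sum, count]; its fitted value is sum/count
    for v in f:
        blocks.append([float(v), 1])
        # merge while the previous block's mean exceeds the last one's
        # (cross-multiplied to avoid division; counts are positive)
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out
```

The Moreau decomposition can be checked directly on small examples: the residual \(f-u_\uparrow \) is orthogonal to the fit \(u_\uparrow \) (e.g. for \(f=(3,1,2)\) the fit is \((2,2,2)\) and the residual \((1,-1,0)\)).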

### Example 7

*f*.

The solution \(W_\uparrow \) of the obstacle problem satisfies \(W_\uparrow ''\ge 0\) (this is the “easy” part of the original Lewy–Stampacchia inequality, \(0\le W_\uparrow ''\le (F'')^+\)) and is therefore automatically a convex function. In fact, by optimality, \(W_\uparrow \) is the maximal convex function lying below *F*, i.e. it is the *lower convex envelope* of *F*. Similar problems are considered in the multidimensional case, using higher-order methods (the space of functions with bounded Hessians), in Hinterberger and Scherzer [25].
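The envelope characterization has an equally simple discrete counterpart: form cumulative sums of the data, take the lower convex hull of the resulting points (one monotone-chain pass), and read off the slopes of the hull; these slopes are the non-decreasing least-squares fit, mirroring \(u_\uparrow = W_\uparrow ''\) in the continuous statement. A sketch under these assumptions (the function name is ours):

```python
def isotonic_via_envelope(f):
    """Isotonic (non-decreasing, least-squares) fit of f, computed as the
    slopes of the lower convex envelope of the cumulative-sum graph."""
    # cumulative sums: F[0] = 0, F[i] = f[0] + ... + f[i-1]
    F = [0.0]
    for v in f:
        F.append(F[-1] + v)
    # lower convex hull of the points (i, F[i]), one monotone-chain pass
    hull = [(0, F[0])]
    for i in range(1, len(F)):
        p = (i, F[i])
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # pop the last hull point if it lies on or above the chord
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    # the slope over each hull segment is the fitted value on that stretch
    out = []
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        out.extend([(y2 - y1) / (x2 - x1)] * (x2 - x1))
    return out
```

On small examples this reproduces the pool-adjacent-violators fit exactly, which is the discrete shadow of the statement that \(W_\uparrow \) is the lower convex envelope of *F*.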

## 12 Higher-Order Total Variation Regularization

*n*th order ROF model is defined as minimization of the functional

*n*th derivative of \(u \in C^n(a,b)\) and \(\lambda >0\) is the regularization parameter. Here, we shall treat only the case \(n=2\), i.e. the second-order ROF model. In the multidimensional setting, second- and higher-order regularizations were considered early on by Pöschl and Scherzer [36] as well as by Bredies et al. [12]. They have subsequently found applications in the restoration of MRI [27] and image inpainting [35], to mention just two examples. A detailed account of second-order regularization in image restoration, as well as additional references, can be found in Bergounioux [10]. The one-dimensional case has been studied in a purely discrete setting in Steidl et al. [41], and one-dimensional examples are found in [34, 36], among others.

*J*. It can be shown that any \(u\in BV_2(I)\) is automatically a member of \(H^1(I)\) (in fact \(u\in W^{1,\infty }(I)\) holds). In particular, \(u\in L^2(I)\) and therefore the definition of the second-order total variation \(J_2\) may be rephrased as

*f* satisfies

Using the methods introduced in the study of the (ordinary) ROF model, it is possible to prove the following result:

### Theorem 6

*f*. Since \(f\in L^2(I)\), we see that \(F\in H_0^2(I)\). If we introduce the new variable \(W=F-\lambda \xi \), where \(\xi \in K_2\), then the above minimization problem becomes:

The above procedure for finding \(u_\lambda \) has a mechanical interpretation: The second-order regularized denoising \(u_\lambda \) of *f* is the second (weak) derivative of the function \(W_\lambda \) which in turn corresponds to the energy minimizing shape of an ideal elastic beam (a cubic spline), clamped at the end points of *I* and forced to lie between a pair of parallel walls at a uniform distance \(\lambda \) from the graph of the bi-cumulative signal *F*. Denoising using *n*th order total variation regularization may be analyzed in a similar manner, but for \(n>2\) there is no obvious mechanical interpretation.

This “restricted spline” interpretation of the second-order ROF model can be used to “guess” the analytical solution of the second-order ROF model for simple *f*. The following example is probably the simplest imaginable non-trivial example.

### Example 8

*f* satisfies \(\int _If =\int _Ixf = 0\), it follows that \(\xi _\lambda = \min (12,\lambda ^{-1})F\in H_0^2(I)\). Moreover, \(\Vert \xi _\lambda \Vert _\infty \le 1\) with equality if and only if \(0\le \lambda \le 1/12\). It is now easy to verify that (59b) holds:

The above explicit example was first found by Papafitsoros and Bredies [34], see their Fig. 8d, but the derivation presented here is considerably shorter. Their paper focuses on a regularization term which is a kind of weighted combination of the first- and second-order total variation and contains the latter only as a special case. The example may be viewed as the second-order analogue of Example 4, which considered denoising of a simple piecewise constant signal in the ROF model; in both cases, the restored signal is obtained by a simple scaling of the data.

Theorem 6 and its “restricted spline” interpretation make it clear that certain results for the (ordinary) ROF model carry over to the second-order case. For instance, parts (a) and (b) of Proposition 4 still hold: \(u_\lambda =0\) if and only if \(\Vert F\Vert _\infty <\lambda \), and \(\Vert W_\lambda -F\Vert _\infty = \lambda \) also holds in that case, where *F* is the bi-cumulative function and \(W_\lambda \) the optimal spline.

## 13 Concluding Remarks

The *N*-dimensional ROF model can be developed along almost the same lines, it seems, with the exception of the uniqueness of the dual variables and the fundamental estimate; when \(N\ge 2\) the dual variables are vector fields \(\xi =(\xi _1,\ldots ,\xi _N)\) whose magnitudes are bounded by one. As is well known, only the divergence of this vector field is uniquely determined, not the vector field itself. The natural generalization of the fundamental estimate does not seem to hold for \(N\ge 2\). The reason why it fails is that if

*f* is the characteristic function of the unit square in the plane, then the denoising \(u_\lambda \) has level curves which look like squares with the corners rounded off, see [18, Sect. 2.2.3]. This means that the support of the gradient of the denoised signal is not contained in the support of the gradient of the original signal (rather, the support shifts inward, inside the square), and this is incompatible with a bound like the fundamental estimate. It would be interesting to know if there exists some alternative estimate on the denoised signal which could replace the fundamental estimate in higher dimensions.

## Notes

### Acknowledgements

Open access funding provided by Lund University. The author wishes to thank Viktor Larsson, now at the Department of Computer Science at ETH Zürich, for reading and commenting on early drafts of this paper.

## References

- 1.Alter, F., Caselles, V., Chambolle, A.: A characterization of convex calibrable sets in \({\mathbf{R}}^{N}\). Math. Ann. **332**, 329–366 (2005)
- 2.Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free Discontinuity Problems. Clarendon Press, Oxford (2000)
- 3.Andreu, F., Ballester, C., Caselles, V., Mazón, J.M.: Minimizing total variation flow. Differ. Integral Equ. **14**(3), 321–360 (2001)
- 4.Andreu, F., Caselles, V., Diaz, J.I., Mazón, J.M.: Qualitative properties of the total variation flow. J. Funct. Anal. **188**(2), 516–547 (2002)
- 5.Anevski, D., Soulier, P.: Monotone spectral density estimation. Ann. Stat. **39**, 418–438 (2011)
- 6.Attouch, H., Buttazzo, G., Michaille, G.: Variational Analysis in Sobolev and BV Spaces; Applications to PDEs and Optimization, 2nd edn. SIAM, New York (2015)
- 7.Barbu, V.: Nonlinear Semigroups and Differential Equations in Banach Spaces. Noordhoff International Publishing, Leyden (1976)
- 8.Bellettini, G., Caselles, V., Novaga, M.: The total variation flow in \({\mathbf{R}}^{N}\). J. Differ. Equ. **184**(2), 475–525 (2002)
- 9.Bellettini, G., Caselles, V., Novaga, M.: Explicit solution of the eigenvalue problem \(-\text{ div }(du/|du|)=u\) in \({\mathbf{R}}^2\). SIAM J. Math. Anal. **36**(4), 1095–1129 (2005)
- 10.Bergounioux, M.: Second order variational models for image texture analysis. Adv. Imaging Electron Phys. **181**, 35–124 (2014). https://doi.org/10.1016/B978-0-12-800091-5.00002-1
- 11.Bonforte, M., Figalli, A.: Total variation flow and sign fast diffusion in one dimension. J. Differ. Equ. **252**, 4455–4480 (2012). https://doi.org/10.1016/j.jde.2012.01.003
- 12.Bredies, K., Kunisch, K., Pock, T.: Total generalized variation. SIAM J. Imaging Sci. **3**(3), 492–526 (2010). https://doi.org/10.1137/090769521
- 13.Brézis, H.: Analyse fonctionnelle – Théorie et applications. Dunod, Paris (1999)
- 14.Briani, A., Chambolle, A., Novaga, M., Orlandi, G.: On the gradient flow of a one-homogeneous functional. Conflu. Math. **3**(4), 617–635 (2011)
- 15.Burger, M., Osher, S.: Convergence rates of convex variational regularization. Inverse Probl. **20**, 1411–1421 (2004)
- 16.Caselles, V., Chambolle, A., Novaga, M.: The discontinuity set of solutions of the TV denoising problem and some extensions. Multiscale Model. Simul. **6**(3), 879–894 (2007)
- 17.Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. **20**, 89–97 (2004)
- 18.Chambolle, A., Caselles, V., Cremers, D., Novaga, M., Pock, T.: An introduction to total variation for image analysis. Theor. Found. Numer. Methods Sparse Recovery **9**, 263–340 (2010)
- 19.Chambolle, A., Lions, P.-L.: Image recovery via total variation minimization and related problems. Numer. Math. **76**, 167–188 (1997)
- 20.Davies, P., Kovac, A.: Local extremes, runs, strings and multiresolution. Ann. Stat. **29**, 1–65 (2001)
- 21.Dümbgen, L., Kovac, A.: Extensions of smoothing via taut strings. Electron. J. Stat. **3**, 41–75 (2009)
- 22.Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Stat. **1**(2), 302–332 (2007)
- 23.Gigli, N., Mosconi, S.: The abstract Lewy–Stampacchia inequality and applications. J. Math. Pures Appl. **104**, 258–275 (2015)
- 24.Grasmair, M.: The equivalence of the taut string algorithm and BV-regularization. J. Math. Imaging Vis. **27**, 56–66 (2007)
- 25.Hinterberger, W., Scherzer, O.: Variational methods on the space of functions of bounded Hessian for convexification and denoising. Computing **76**, 109–133 (2006)
- 26.Hintermüller, M., Kunisch, K.: Total bounded variation regularization as a bilaterally constrained optimization problem. SIAM J. Appl. Math. **64**(4), 1311–1333 (2004)
- 27.Knoll, F., Bredies, K., Pock, T., Stollberger, R.: Second order total generalized variation (TGV) for MRI. Magn. Reson. Med. **65**, 480–491 (2011)
- 28.Lewy, H., Stampacchia, G.: On the smoothness of superharmonics which solve a minimum problem. J. Anal. Math. **23**, 227–236 (1970)
- 29.Mammen, E., van de Geer, S.: Locally adaptive regression splines. Ann. Stat. **25**(1), 387–413 (1997)
- 30.Meyer, Y.: Oscillating Patterns in Image Processing and Nonlinear Evolution Equations. American Mathematical Society, New York (2000)
- 31.Niyobuhungiro, J.: Exact minimizers in real interpolation: characterization and applications. Linköping Studies in Science and Technology Dissertations, vol. 1650 (2014)
- 32.Overgaard, N.C.: On the taut string interpretation of the one-dimensional Rudin–Osher–Fatemi model: a new proof, a fundamental estimate and some applications, pp. 1–19. arXiv:1710.10985 [eess.IV] (2017)
- 33.Overgaard, N.C.: On the taut-string interpretation of the one-dimensional Rudin–Osher–Fatemi model. Proc. ICPRAM **2018**, 233–244 (2018)
- 34.Papafitsoros, K., Bredies, K.: A study of the one dimensional total generalised variation regularisation problem
- 35.Papafitsoros, K., Schönlieb, C.B., Sengul, B.: Combined first and second order total variation inpainting using split Bregman. Image Process. On Line **3**, 112–135 (2013)
- 36.Pöschl, C., Scherzer, O.: Characterization of minimizers of convex regularization functionals. Contemp. Math. (2008). https://doi.org/10.1090/conm/451/08784
- 37.Rudin, L., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Physica D **60**, 259–268 (1992)
- 38.Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1986)
- 39.Scherzer, O., Grasmair, M., Grossauer, H., Haltmeier, M., Lenzen, F.: Variational Methods in Imaging. Springer, New York (2009)
- 40.Setterqvist, E.: Taut strings and real interpolation. Linköping Studies in Science and Technology Dissertations, vol. 1801 (2016)
- 41.Steidl, G., Didas, S., Neumann, J.: Splines in higher order TV regularization. Int. J. Comput. Vis. **70**(3), 241–255 (2006)
- 42.Strong, D., Chan, T.: Exact solutions to total variation regularization problems. CAM Rep. **96–41** (1996)
- 43.Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B **67**(1), 91–108 (2005)
- 44.Zhu, M., Wright, S.J., Chan, T.F.: Duality-based algorithms for total variation image restoration. Comput. Optim. Appl. **47**(3), 377–400 (2010)

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.