# Infimal Convolution Regularisation Functionals of BV and \(\mathrm {L}^{p}\) Spaces

## Abstract

We study a general class of infimal convolution type regularisation functionals suitable for applications in image processing. These functionals incorporate a combination of the total variation seminorm and \(\mathrm {L}^{p}\) norms. A unified well-posedness analysis is presented and a detailed study of the one-dimensional model is performed by computing exact solutions for the corresponding denoising problem in the case \(p=2\). Furthermore, the dependency of the regularisation properties of this infimal convolution approach on the choice of *p* is studied. It turns out that in the case \(p=2\) this regulariser is equivalent to the Huber-type variant of total variation regularisation. We provide numerical examples for image decomposition as well as for image denoising. We show that our model is capable of eliminating the staircasing effect, a well-known disadvantage of total variation regularisation. Moreover, as *p* increases we obtain almost piecewise affine reconstructions, leading also to a better preservation of hat-like structures.

### Keywords

Total variation · Infimal convolution · Denoising · Staircasing · \(\mathrm {L}^{p}\) norms · Image decomposition

## 1 Introduction

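In this paper we introduce a family of \(\mathrm {TV}\)–\(\mathrm {L}^{p}\) infimal convolution type functionals,

\[ \mathrm {TVL}_{\alpha ,\beta }^{p}(u):=\min _{w\in \mathrm {L}^{p}(\Omega )}\alpha \Vert Du-w\Vert _{\mathcal {M}}+\beta \Vert w\Vert _{\mathrm {L}^{p}(\Omega )}, \tag{1.1} \]

where \(\alpha ,\beta >0\) and \(\Vert \cdot \Vert _{\mathcal {M}}\) denotes the Radon norm of a measure. We study its regularising properties for different values of *p* and apply it successfully to image denoising.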

### 1.1 Context

The *regulariser* is denoted here by \(\Psi \). We assume that the data *f*, defined on an open, bounded and connected domain \(\Omega \subset \mathbb {R}^{2}\), have been corrupted through a bounded, linear operator *T* and additive (random) noise. Different values of *s* can be considered for the first term of (1.2), the *fidelity term*. For example, models incorporating an \(\mathrm {L}^{2}\) fidelity term (resp. \(\mathrm {L}^{1}\)) have been shown to be efficient for the restoration of images corrupted by Gaussian noise (resp. impulse noise). Of course, other types of noise can also be considered and in those cases the form of the fidelity term is adjusted accordingly. Typically, one or more parameters within \(\Psi \) balance the strength of regularisation against the fidelity term in the minimisation (1.2).

*staircasing*. Recall at this point that for two-dimensional images \(u\in \mathrm {L}^{1}(\Omega )\), the definition of the total variation functional reads

\[ \mathrm {TV}(u):=\sup \left\{ \int _{\Omega }u\,\mathrm {div}\,\phi \,\mathrm{d}x:\; \phi \in C_{c}^{1}(\Omega ,\mathbb {R}^{2}),\; \Vert \phi \Vert _{\infty }\le 1\right\} . \tag{1.3} \]

If \(\mathrm {TV}(u)<\infty \), then the distributional derivative *Du* is a finite Radon measure and \(\mathrm {TV}(u)=\Vert Du\Vert _{\mathcal {M}}\). Moreover, if \(u\in \mathrm {W}^{1,1}(\Omega )\) then \(\mathrm {TV}(u)=\int _{\Omega }|\nabla u|\,\mathrm{d}x\), i.e. the total variation is the \(\mathrm {L}^{1}\) norm of the gradient of *u*. Higher-order extensions of the total variation functional are widely explored in the literature, e.g. [4, 5, 9, 11, 12, 27, 29, 30, 34]. The incorporation of second-order derivatives is shown to reduce or even eliminate the staircasing effect. The most successful regulariser of this kind is the second-order total generalised variation (TGV) introduced by Bredies et al. [5]. Its definition reads

\[ \mathrm {TGV}_{\alpha ,\beta }^{2}(u):=\min _{w\in \mathrm {BD}(\Omega )}\alpha \Vert Du-w\Vert _{\mathcal {M}}+\beta \Vert \mathcal {E}w\Vert _{\mathcal {M}}, \tag{1.4} \]

where \(\mathrm {BD}(\Omega )\) is the space of functions of bounded deformation, i.e. the space of all \(\mathrm {L}^{1}(\Omega )\) functions *w*, whose symmetrised distributional derivative \(\mathcal {E}w\) is a finite Radon measure. This is a less regular space than the usual space of functions of bounded variation \(\mathrm {BV}(\Omega )\) for which the full gradient *Du* is required to be a finite Radon measure. Note that if the variable *w* in the definition (1.4) is forced to be the gradient of another function then we obtain the classical infimal convolution regulariser of Chambolle–Lions [9]. In that sense \(\mathrm {TGV}\) can be seen as a particular instance of infimal convolution, optimally balancing first and second-order information.

In the discrete formulation of \(\mathrm {TGV}\) (as well as for \(\mathrm {TV}\)) the Radon norm is interpreted as an \(\mathrm {L}^{1}\) norm. The motivation for the current and the follow-up paper [8] is to explore the capabilities of \(\mathrm {L}^{p}\) norms within first-order regularisation functionals designed for image processing purposes. The use of \(\mathrm {L}^{p}\) norms for \(p> 1\) has been exploited in different contexts, for instance in connection with the infinity Laplacian and the *p*-Laplacian (cf. e.g. [16] and [26], respectively).
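To make the discrete reading of the Radon norm concrete, the following minimal sketch computes a one-dimensional discrete total variation as the \(\ell ^{1}\) norm of forward differences. The function name and the three test signals are illustrative assumptions, not taken from the paper. The sketch shows that a single jump, a staircase and a linear ramp with the same total rise all have the same total variation, which is one common heuristic explanation of why \(\mathrm {TV}\) does not penalise staircase-like reconstructions.

```python
def total_variation_1d(u):
    """Discrete TV of a 1D signal: the l1 norm of its forward differences."""
    return sum(abs(b - a) for a, b in zip(u, u[1:]))


# Three signals rising from 0 to 1 (illustrative data, not from the paper).
step = [0.0] * 5 + [1.0] * 5
ramp = [i / 9 for i in range(10)]
staircase = [0.0, 0.0, 0.25, 0.25, 0.5, 0.5, 0.75, 0.75, 1.0, 1.0]

# All three values agree up to rounding: TV only sees the total rise.
tv_values = {
    name: total_variation_1d(signal)
    for name, signal in [("step", step), ("ramp", ramp), ("staircase", staircase)]
}
print(tv_values)
```

A two-dimensional version would replace the forward differences by a discrete gradient and sum the pointwise gradient magnitudes; the one-dimensional case is used here only to keep the sketch short.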

### 1.2 Our Contribution

Comparing the definition (1.1) with the definition of \(\mathrm {TGV}\) in (1.4), we see that the Radon norm of the symmetrised gradient of *w* has been substituted by the \(\mathrm {L}^{p}\) norm of *w*, thus reducing the order of regularisation. To the best of our knowledge, this is the first paper that provides a thorough analysis of \(\mathrm {TV}\)–\(\mathrm {L}^{p}\) infimal convolution models (1.1) in this generality. We show that the minimisation in (1.1) is well-defined and that \(\mathrm {TVL}_{\alpha ,\beta }^{p}(u)<\infty \) if and only if \(\mathrm {TV}(u)<\infty \). Hence \(\mathrm {TVL}_{\alpha ,\beta }^{p}\) regularised images belong to \(\mathrm {BV}(\Omega )\) as desired.
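As a rough illustration of this structure, the sketch below evaluates a discrete one-dimensional objective \(\alpha \Vert Du-w\Vert _{1}+\beta \Vert w\Vert _{p}\) for a given *w*. The discretisation (forward differences on a uniform grid of spacing `h`, with *w* living on the cell interfaces) and all names are assumptions made for this example. Evaluating the objective at \(w=0\) and at \(w=u'\) gives two simple upper bounds for the minimum over *w*: \(\alpha \mathrm {TV}(u)\) and \(\beta \Vert u'\Vert _{p}\).

```python
def tvlp_objective(u, w, alpha, beta, p, h=1.0):
    """alpha * ||Du - w||_1 + beta * ||w||_p on a uniform 1D grid (a sketch)."""
    du = [(b - a) / h for a, b in zip(u, u[1:])]  # forward differences
    if len(w) != len(du):
        raise ValueError("w must have len(u) - 1 entries")
    radon = h * sum(abs(d - wi) for d, wi in zip(du, w))      # discrete Radon norm
    lp = (h * sum(abs(wi) ** p for wi in w)) ** (1.0 / p)     # discrete L^p norm
    return alpha * radon + beta * lp


# Illustrative data: a linear ramp from 0 to 1 on [0, 1].
n = 10
h = 1.0 / (n - 1)
u = [i * h for i in range(n)]
du = [(b - a) / h for a, b in zip(u, u[1:])]

alpha, beta, p = 2.0, 1.0, 2.0
value_w_zero = tvlp_objective(u, [0.0] * (n - 1), alpha, beta, p, h)  # alpha * TV(u)
value_w_grad = tvlp_objective(u, du, alpha, beta, p, h)               # beta * ||u'||_p
upper_bound = min(value_w_zero, value_w_grad)
print(value_w_zero, value_w_grad, upper_bound)
```

For this ramp the choice \(w=u'\) is cheaper than \(w=0\), which hints at why smooth gradients can be absorbed by *w* instead of being charged at the \(\mathrm {TV}\) rate.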

*Huber* \(\mathrm {TV}\) [24], with the functional (1.6) having a close connection to (1.1) itself. Huber total variation is a smooth approximation of total variation and even though it has been widely used in the imaging and inverse problems community, it has not been analysed adequately. Hence, as a by-product of our analysis, we compute exact solutions of the one-dimensional Huber TV denoising problem. An analogous connection of the \(\mathrm {TVL}_{\alpha ,\beta }^{p}\) functional with a generalised Huber \(\mathrm {TV}\) regularisation is also established for general *p*.

We proceed with exhaustive numerical experiments focusing on (1.5). Our analysis is confirmed by the fact that the analytical results coincide with the numerical ones. Furthermore, we observe that even though a first-order regularisation functional is used, we are capable of eliminating the staircasing effect, similarly to Huber \(\mathrm {TV}\). By using the *Bregman iteration* version of our method [32], we are also able to enhance the contrast of the reconstructed images, obtaining results very similar in quality to the \(\mathrm {TGV}\) ones. We observe numerically that high values of *p* promote almost affine structures similar to second-order regularisation methods. We shed more light on this behaviour in the follow-up paper [8] where we study in depth the case \(p=\infty \). Let us finally note that we also consider a modified version of the functional (1.1) where *w* is restricted to be a gradient of another function, leading to the more classical infimal convolution setting. Even though this modified model is not so successful in staircasing reduction, it is effective in decomposing an image into piecewise constant and smooth parts.

### 1.3 Organisation of the Paper

We introduce our model in Sect. 2. We prove the well-posedness of (1.1), provide an equivalent definition and prove its Lipschitz equivalence with the \(\mathrm {TV}\) seminorm. We finish this section with a well-posedness result for the corresponding \(\mathrm {TVL}_{\alpha ,\beta }^{p}\) regularisation problem using standard tools.

In Sect. 3 we establish a link between the \(\mathrm {TVL}_{\alpha ,\beta }^{p}\) functional and its *p*-homogeneous analogue (using the *p*-th power of \(\Vert \cdot \Vert _{\mathrm {L}^{p}(\Omega )}\)). The *p*-homogeneous functional (for \(p=2\)) is further shown to be equivalent to Huber total variation, while analogous results are obtained for \(p\ne 2\).

We study the corresponding one-dimensional model in Sect. 4 focusing on the \(\mathrm {L}^{2}\) fidelity denoising case. More specifically, after deriving the optimality conditions using Fenchel–Rockafellar duality in Sect. 4.1, we explore the structure of solutions in Sect. 4.2. In Sect. 4.3 we compute exact solutions for the case \(p=2\), considering a simple step function as data.

In Sect. 5 we present a variant of our model suitable for image decomposition purposes, i.e. geometric decomposition into piecewise constant and smooth structures.

Section 6 focuses on numerical experiments. The one-dimensional analytical results are confirmed numerically in Sect. 6.2, while two-dimensional denoising experiments are performed in Sect. 6.3 using the split Bregman method. There, we show that our approach can lead to elimination of the staircasing effect and that a Bregmanised version can also enhance the contrast, achieving results very close to \(\mathrm {TGV}\), a method considered state of the art in the context of variational regularisation. We finish the section with some image decomposition examples and we summarise our results in Sect. 7.

In the appendix, we remind the reader of some basic facts from the theory of Radon measures and \(\mathrm {BV}\) functions.

## 2 Basic Properties of the \(\mathrm {TVL}_{\alpha ,\beta }^{p}\) Functional

Although we mainly focus on the finite *p* case, the results of this section are stated and proved for \(p=\infty \) as well, since the proofs are similar.

The next proposition asserts that the minimisation in (1.1) is indeed well-defined. We omit the proof, which is based on standard coercivity and weak lower semicontinuity techniques:

**Proposition 2.1**

Let \(u\in \mathrm {BV}(\Omega )\), \(1< p\le \infty \) and \(\alpha ,\beta >0\). Then the minimum in the definition (1.1) is attained.

*dual* formulation:

*q* denotes here the conjugate exponent of *p*, see (8.4). The following proposition shows that the two expressions indeed coincide. Recall first that for a functional \(F:X\rightarrow \overline{\mathbb {R}}\) the effective domain is defined as \(\mathrm{dom}F = \left\{ x\in X: F(x)<\infty \right\} \), while the indicator and characteristic functions of \(A\subseteq X\) are defined as

*X* and its dual \(X^{*}\). Finally, recall that the convex conjugate \(F^{*}:X^{*}\rightarrow \overline{\mathbb {R}}\) of *F* is defined as \(F^{*}(x^{*})=\sup \limits _{x\in X}\left\{ \left\langle x^{*},x\right\rangle -F(x)\right\} \).

**Proposition 2.2**

*Proof*

*Remark 2.3*

Note that using the dual formulation of \(\mathrm {TVL}^{p}_{\alpha ,\beta }\) one can easily derive that the functional is lower semicontinuous with respect to the strong \(\mathrm {L}^{1}\) topology since it is a pointwise supremum of continuous functions.

The following proposition shows that the \(\mathrm {TVL}^{p}_{\alpha ,\beta }\) functional is Lipschitz equivalent to the total variation.

**Proposition 2.4**

*Proof*

*w*, yields the left-hand side inequality in (2.2).

**Theorem 2.5**

Let \(1<p\le \infty \) and \(f\in \mathrm {L}^{s}(\Omega )\). If \(T(\mathcal {X}_{\Omega })\ne 0\) then there exists a solution \(u\in \mathrm {L}^{s}(\Omega )\cap \mathrm {BV}(\Omega )\) for the problem (2.7). If \(s>1\) and *T* is injective then the solution is unique.

*Proof*

The proof is a straightforward application of the direct method of calculus of variations. We simply take advantage of the inequality (2.2) and the compactness theorem in \(\mathrm {BV}(\Omega )\), see Appendix, along with the lower semicontinuity property of \(\mathrm {TVL}^{p}_{\alpha ,\beta }\). We also refer the reader to the corresponding proofs in [34, 39].\(\square \)

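We focus on the case where *T* is the identity function (denoising task), where rigorous analysis can be carried out. From now on, we also focus on the case where *p* is finite, as the case \(p=\infty \) is studied in the follow-up paper [8]. We thus define the following problem

\[ \min _{u\in \mathrm {BV}(\Omega )}\frac{1}{2}\Vert f-u\Vert _{\mathrm {L}^{2}(\Omega )}^{2}+\mathrm {TVL}_{\alpha ,\beta }^{p}(u), \]

or equivalently

\[ \min _{\begin{array}{c} u\in \mathrm {BV}(\Omega )\\ w\in \mathrm {L}^{p}(\Omega ) \end{array}}\frac{1}{2}\Vert f-u\Vert _{\mathrm {L}^{2}(\Omega )}^{2}+\alpha \Vert Du-w\Vert _{\mathcal {M}}+\beta \Vert w\Vert _{\mathrm {L}^{p}(\Omega )}. \tag{$\mathcal {P}$} \]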

## 3 The *p*-Homogeneous Analogue and Relation to Huber TV

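In this section we consider the *p*-homogeneous analogue of \((\mathcal {P})\), in which the \(\mathrm {L}^{p}\) norm of *w* is replaced by its *p*-th power:

\[ \min _{\begin{array}{c} u\in \mathrm {BV}(\Omega )\\ w\in \mathrm {L}^{p}(\Omega ) \end{array}}\frac{1}{2}\Vert f-u\Vert _{\mathrm {L}^{2}(\Omega )}^{2}+\alpha \Vert Du-w\Vert _{\mathcal {M}}+\frac{\beta }{p}\Vert w\Vert _{\mathrm {L}^{p}(\Omega )}^{p}. \tag{$\mathcal {P}_{p-hom}$} \]

We show in Proposition 3.2 that there is a strong connection between the models \((\mathcal {P})\) and \((\mathcal {P}_{p-hom})\). The reason for the introduction of \((\mathcal {P}_{p-hom})\) is that, in certain cases, it is technically easier to derive exact solutions for \((\mathcal {P}_{p-hom})\) than for \((\mathcal {P})\) directly, see Sect. 4.3. Moreover, we can guarantee the uniqueness of the optimal \(w^{*}\) in \((\mathcal {P}_{p-hom})\), since

\[ w^{*}=\mathop {\mathrm {argmin}}\limits _{w\in \mathrm {L}^{p}(\Omega )}\;\alpha \Vert Du^{*}-w\Vert _{\mathcal {M}}+\frac{\beta }{p}\Vert w\Vert _{\mathrm {L}^{p}(\Omega )}^{p}, \]

and thus \(w^{*}\) is unique as a minimiser of a strictly convex functional. The next proposition states that, unless *f* is a constant function, the optimal \(w^{*}\) in \((\mathcal {P}_{p-hom})\) cannot be zero but nonetheless converges to zero as \(\beta \rightarrow \infty \). In essence, this means that one cannot obtain \(\mathrm {TV}\) type solutions with the *p*-homogeneous model.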

**Proposition 3.1**

Let \(1<p<\infty \), \(f\in \mathrm {L}^{2}(\Omega )\) and let \((w^{*},u^{*})\) be an optimal solution pair of the *p*-homogeneous problem \((\mathcal {P}_{p-hom})\). Then \(w^{*}=0\) if and only if *f* is a constant function. For general data *f*, we have that \(w^{*}\rightarrow 0\) in \(\mathrm {L}^{p}(\Omega )\) when \(\beta \rightarrow \infty \).

*Proof*

If *f* is constant then (0, *f*) is the optimal pair for \((\mathcal {P}_{p-hom})\). Suppose that \((w^{*},u^{*})\) solve \((\mathcal {P}_{p-hom})\). Notice that in this case we also have

*f* is a constant function.

We further establish a connection between the 1-homogeneous \((\mathcal {P})\) and the *p*-homogeneous model \((\mathcal {P}_{p-hom})\):

**Proposition 3.2**

Let \(1<p<\infty \) and let \(f\in \mathrm {L}^{2}(\Omega )\) be non-constant. A pair \((w^{*},u^{*})\) is a solution of \((\mathcal {P}_{p-hom})\) with parameters \((\alpha ,\beta _{p-hom})\) if and only if it is also a solution of \((\mathcal {P})\) with parameters \((\alpha ,\beta _{1-hom})\) where \(\beta _{1-hom}=\beta _{p-hom}\Vert w^{*}\Vert _{\mathrm {L}^{p}(\Omega )}^{p-1}\).
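A formal calculation, offered only as a heuristic sketch and not as the argument of the proof, indicates where this rescaling of \(\beta \) comes from. It assumes that the *p*-homogeneous model penalises *w* by \(\frac{\beta _{p-hom}}{p}\Vert w\Vert _{\mathrm {L}^{p}(\Omega )}^{p}\), which is consistent with the stated relation. Differentiating both penalties at \(w^{*}\ne 0\) in an arbitrary direction \(h\in \mathrm {L}^{p}(\Omega )\) gives:

```latex
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\,
  \frac{\beta_{p-hom}}{p}\,\Vert w^{*}+th\Vert_{\mathrm{L}^{p}(\Omega)}^{p}
  &= \beta_{p-hom}\int_{\Omega}|w^{*}|^{p-2}\,w^{*}\cdot h\,\mathrm{d}x,\\[4pt]
\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\,
  \beta_{1-hom}\,\Vert w^{*}+th\Vert_{\mathrm{L}^{p}(\Omega)}
  &= \frac{\beta_{1-hom}}{\Vert w^{*}\Vert_{\mathrm{L}^{p}(\Omega)}^{p-1}}
     \int_{\Omega}|w^{*}|^{p-2}\,w^{*}\cdot h\,\mathrm{d}x.
\end{aligned}
```

The two directional derivatives agree for every *h* exactly when \(\beta _{1-hom}=\beta _{p-hom}\Vert w^{*}\Vert _{\mathrm {L}^{p}(\Omega )}^{p-1}\), so the first-order conditions in *w* of the two problems match at \(w^{*}\) under this choice.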

*Proof*

Since *f* is not a constant, by the previous proposition we have that \(w^{*}\ne 0\). Note that for an arbitrary function \(u\in \mathrm {BV}(\Omega )\):

This means that \(w^{*}\) is an admissible solution for both problems \((\mathcal {P})\) and \((\mathcal {P}_{p-hom})\), with the corresponding sets of parameters \((\alpha ,\beta _{1-hom})\) and \((\alpha ,\beta _{p-hom})\), respectively. That the same holds for \(u^{*}\) as well follows from the fact that in both problems we have

Finally, it turns out that for \(p=2\), problem \((\mathcal {P}_{p-hom})\) is essentially equivalent to the widely used Huber total variation regularisation [24]. In fact we can show that for \(1<p<\infty \), \((\mathcal {P}_{p-hom})\) is equivalent to a generalised Huber total variation regularisation, see also [23]. This is proved in the next proposition.

**Proposition 3.3**

*Du*, cf. Appendix.

*Proof*

*p*–power penalisation for the general \(1<p<\infty \)) and linear penalisation for large gradients.

*w* is converging to 0 when \(\beta \rightarrow \infty \). On the other hand, when *p* is getting large, Fig. 1b, small gradients are essentially not penalised at all, allowing the gradient to be almost constant, equal to its maximum value, leading to piecewise affine structures. We refer to some of the numerical examples in Sect. 6.2 and also the second part of this paper [8] where the case \(p=\infty \) is examined in detail.
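The link to Huber \(\mathrm {TV}\) in the case \(p=2\) can be checked pointwise with a small numerical experiment. The sketch below assumes the scalar model \(\min _{w}\alpha |z-w|+\frac{\beta }{2}w^{2}\) for a gradient value *z*, whose minimum has the closed form of a Huber function: quadratic for \(|z|\le \alpha /\beta \) and linear beyond. The function names, the parameter values and the brute-force grid are choices made for this illustration only.

```python
def huber(z, alpha, beta):
    """Closed form of min_w alpha*|z - w| + (beta/2)*w**2 for scalar z."""
    t = abs(z)
    if t <= alpha / beta:
        return 0.5 * beta * t * t                    # quadratic for small gradients
    return alpha * t - alpha * alpha / (2.0 * beta)  # linear for large gradients


def inf_conv_bruteforce(z, alpha, beta, n=20001, span=10.0):
    """Approximate the same minimum by scanning w over a uniform grid."""
    best = float("inf")
    for i in range(n):
        w = -span + 2.0 * span * i / (n - 1)
        best = min(best, alpha * abs(z - w) + 0.5 * beta * w * w)
    return best


alpha, beta = 1.0, 2.0  # illustrative parameters
samples = [-3.0, -0.5, 0.0, 0.1, 0.5, 1.0, 3.0]
max_gap = max(abs(huber(z, alpha, beta) - inf_conv_bruteforce(z, alpha, beta))
              for z in samples)
print(max_gap)
```

The minimising *w* equals *z* in the quadratic regime and saturates at \(\pm \alpha /\beta \) in the linear regime, which matches the qualitative description above: small gradients are absorbed by *w*, large ones are charged at the \(\mathrm {TV}\) rate.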

## 4 The One-Dimensional Case

In order to gain more insight into the structure of solutions of the problem \((\mathcal {P})\), in this section we study its one-dimensional version. As above, we focus on the finite *p* case, i.e. \(1<p<\infty \). The case \(p=\infty \) leads to several additional complications and is the subject of the follow-up paper [8]. For this section \(\Omega \subset \mathbb {R}\) is an open and bounded interval, i.e. \(\Omega =(a,b)\). Our analysis follows closely the ones in [6] and [33] where the one-dimensional \(\mathrm {L}^{1}\)–\(\mathrm {TGV}\) and \(\mathrm {L}^{2}\)–\(\mathrm {TGV}\) problems are studied, respectively.

### 4.1 Optimality Conditions

In this section, we derive the optimality conditions for the one-dimensional problem \((\mathcal {P})\). We start our analysis by defining the predual problem \((\mathcal {P}^{*})\) and proving existence and uniqueness of its solutions. We then again employ the Fenchel–Rockafellar duality theory in order to find a connection between the solutions of the predual and primal problems.

*q* is the conjugate exponent of *p*. Existence and uniqueness for the solutions of \((\mathcal {P}^{*})\) can be verified by standard arguments:

**Proposition 4.1**

For \(f \in \mathrm {L}^2(\Omega )\), the predual problem \((\mathcal {P}^{*})\) admits a unique solution in \(\mathrm {H}_{0}^{1}(\Omega ).\)

The next proposition justifies the term predual for the problem \((\mathcal {P}^{*})\).

**Proposition 4.2**

The dual problem of \((\mathcal {P}^{*})\) is equivalent to the problem \((\mathcal {P})\) in the sense that (*w*, *u*) is a solution of the dual of \((\mathcal {P}^{*})\) if and only if \((w,u)\in \mathrm {L}^{p}(\Omega )\times \mathrm {BV}(\Omega )\) and solves \((\mathcal {P})\).

*Proof*

*K*. Let \((\sigma ,\tau )\) be elements of \(X^{*}=\mathrm {H}_{0}^{1}(\Omega )^{*}\times \mathrm {H}_{0}^{1}(\Omega )^{*}\). The convex conjugate of \(F_{1}\) can be written as

We next verify that we have no duality gap between the two minimisation problems \((\mathcal {P})\) and \((\mathcal {P}^{*})\). The proof of the following proposition follows the proof of the corresponding proposition in [6]. We slightly modify it for our case.

**Proposition 4.3**

*Proof*

Since there is no duality gap, we can find a relationship between the solutions of \((\mathcal {P}^{*})\) and \((\mathcal {P})\) via the following optimality conditions.

**Theorem 4.4**

*Proof*

(*w*, *u*) that solve \((\mathcal {P}^{*})\) and \((\mathcal {P})\), respectively. Hence, for every \((\sigma ,\tau )\in X^{*}\), we have the following:

*Remark 4.5*

*p* and as we will see later it allows a certain degree of smoothness in the final solution *u*.

### 4.2 Structure of the Solutions

The optimality conditions (4.11) and (4.12) can help us explore the structure of the solutions for the problem \((\mathcal {P})\) and how this structure is determined by the regularising parameters \(\alpha , \beta \) and the value of *p*.

We first discuss the cases where the solution *u* of \((\mathcal {P})\) is a solution of a corresponding ROF minimisation problem, i.e. \(w=0\). Note that the following proposition holds for \(p=\infty \) as well.

**Proposition 4.6**

*Proof*

The proof follows immediately from Proposition 2.4.

Proposition 4.6 is valid for any dimension \(d\ge 1\). It provides a rough threshold for obtaining ROF type solutions in terms of the regularising parameters \(\alpha ,\beta \) and the image domain \(\Omega \). However, the condition is not sharp in general since as we will see in the following sections we can obtain a sharper estimate for specific data *f*.

The following proposition in the spirit of [6, 33] gives more insight into the structure of solutions of \((\mathcal {P})\).

**Proposition 4.7**

Let \(f\in \mathrm {BV}(\Omega )\) and suppose that \((w,u)\in \mathrm {L}^{p}(\Omega )\times \mathrm {BV}(\Omega )\) is a solution pair for \((\mathcal {P})\) with \(p\in (1,\infty ]\). If \(u>f\) (or \(u<f\)) on an open interval \(I\subset \Omega \), then \((Du-w)\lfloor I = 0\), i.e. \(u'=w\) on *I* and \(|D^{s}u|(I)=0\).

The above proposition is formulated rigorously via the use of *good representatives* of \(\mathrm {BV}\) functions, see [1], but for the sake of simplicity we do not go into the details here. Instead we refer the reader to [6, 33] where the analogous propositions are shown for the \(\mathrm {TGV}\) regularised solutions and whose proofs are similar to the one of Proposition 4.7.

*f*:

**Proposition 4.8**

*Proof*

Clearly, if *u* is a constant solution of \((\mathcal {P})\), then \(Du=0\) and from inequality (2.2) we get \(\mathrm {TVL}_{\alpha ,\beta }^{p}(u)=0\). Hence, we have \(u=\tilde{f}\).

Propositions 4.6 and 4.8 tell us how the solutions *u* behave when \(\beta /\alpha \) or one of the parameters \(\alpha \) and \(\beta \) is large. The other limiting case is also of interest, i.e. when the parameters are small. The analogous questions have been examined in [35] for the \(\mathrm {TGV}\) case in arbitrary dimension. There it is shown that whenever \(\beta \rightarrow 0\) while keeping \(\alpha \) fixed or \(\alpha \rightarrow 0\) while keeping \(\beta \) fixed, the corresponding \(\mathrm {TGV}\) solutions converge to the data *f* strongly in \(\mathrm {L}^{2}\). The same result holds for the \(\mathrm {TVL}^{p}\) regularisation. The proof from [35] can be straightforwardly adapted to our case.

The following proposition reveals more information about the structure of solutions in the case \(w \ne 0\).

**Proposition 4.9**

*u* of \((\mathcal {P})\) is obtained by

*Proof*

Let us make a few remarks regarding equation (4.21) which is in fact the *p*-Laplace equation. One cannot write down a priori the boundary conditions associated with this equation on an interval *I* where \(u>f\) (or \(u<f\)) as it depends on the data and the type of solution we are looking for. For instance see (4.31) for the kind of boundary conditions that might arise when we are seeking a particular exact solution. A general statement about the solvability of the equation cannot be made either. If the equation coupled with the boundary conditions (that arise when looking for a specific solution *u*) has a solution then indeed *u* can possibly solve the minimisation problem. On the other hand, if the *p*-Laplace equation does not have a solution then the function *u* that imposed the corresponding boundary conditions cannot be a minimiser. For more details on the *p*-Laplace equation and its solvability we refer the reader to [28] and the references therein.

### 4.3 Exact Solutions of \((\mathcal {P})\) for a Step Function

#### 4.3.1 ROF Type Solutions

*u* is piecewise constant. The first condition of (4.12) provides a necessary and sufficient condition that needs to be fulfilled in order for *u* to be piecewise constant, that is to say

*u* is constant, i.e. when \(u=\tilde{f}\), the mean value of *f*. We define \(\phi (x)=\frac{h}{2}(L-|x|)\) and in that case we have that \(\Vert \phi \Vert _{\infty }=\frac{hL}{2}\) and \(\Vert \phi \Vert _{\mathrm {L}^{q}(\Omega )}=\frac{h}{2}\left( \frac{2L^{q+1}}{q+1}\right) ^{\frac{1}{q}}\). This implies that

#### 4.3.2 \(\mathrm {TVL}^2\) Type Solutions

*C* depends on the solution *w*, creates a difficult computation in order to recover *u* analytically. In order to overcome this obstacle, we consider the one-dimensional version of the 2-homogeneous analogue of \((\mathcal {P})\) that was introduced in Sect. 3:

(*w*, *u*) is a solution of (4.27) if and only if there exists a function \(\phi \in \mathrm {H}^{1}_{0}(\Omega )\) such that

*L*) is given by \(u(x)=h-u(-x)\). The optimality condition (4.28) results in

*g* at \(\alpha =\frac{hL}{2}\). Although we know the form of the inverse function of the hyperbolic tangent, we cannot compute analytically the inverse \(f^{-1}\). However, we can obtain an approximation using a Taylor expansion which leads to

## 5 An Image Decomposition Approach

*p*. In the one-dimensional setting, we can prove that the problems \((\mathcal {P})\) and (5.1) are equivalent.

**Proposition 5.1**

Let \(\Omega =(a,b)\subset \mathbb {R}\) and \(1<p\le \infty \). Then a pair \((v^{*},u^{*})\in \mathrm {W}^{1,p}(\Omega )\times \mathrm {BV}(\Omega ) \) is a solution of (5.1) if and only if \((\nabla v^{*}, u^{*}+v^{*})\in \mathrm {L}^p(\Omega )\times \mathrm {BV}(\Omega )\) is a solution of \((\mathcal {P})\).

*Proof*

Even though for \(d=1\) it is true that every \(\mathrm {L}^{p}\) function can be written as a gradient, this is not true in higher dimensions. In fact, as we show in the following sections, this constraint is quite restrictive and, for example, the staircasing effect cannot always be eliminated in the denoising process, see for instance Fig. 20.
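The difference between the one-dimensional and the higher-dimensional situation can be made explicit with a standard example, included here as an illustration rather than as part of the original argument. On an interval, any \(w\in \mathrm {L}^{p}(a,b)\) has an antiderivative in \(\mathrm {W}^{1,p}(a,b)\); in two dimensions, a smooth field with non-vanishing curl is not a gradient:

```latex
\begin{aligned}
d=1:&\quad v(x):=\int_{a}^{x} w(t)\,\mathrm{d}t
      \;\Longrightarrow\; v\in \mathrm{W}^{1,p}(a,b),\quad v'=w,\\[4pt]
d=2:&\quad w(x,y):=(-y,\;x)
      \;\Longrightarrow\; \partial_{x}w_{2}-\partial_{y}w_{1}=1-(-1)=2\neq 0 .
\end{aligned}
```

If \(w=\nabla v\) held for some twice continuously differentiable *v*, the symmetry of second derivatives would force \(\partial _{x}w_{2}=\partial _{y}w_{1}\), which fails for this field. Restricting *w* to gradients therefore removes admissible competitors in dimension two, in line with the remark that the constraint is restrictive.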

(*u*, *v*) are unique in general. Yet, one can say something more about this issue. If \((u_{1},v_{1}), (u_{2},v_{2})\) are two minimisers of (5.1), then from the convexity of *L*(*u*, *v*) we have for \(0\le \lambda \le 1\)

**Proposition 5.2**

## 6 Numerical Experiments

In this section we present our numerical simulations for the problem \((\mathcal {P})\). We begin with the one-dimensional case where we verify numerically the analytical solutions obtained in Sect. 4.3. Through examples we also investigate the type of structures that are promoted for different values of *p*. Finally, we proceed to the two-dimensional case where we focus on image denoising tasks and in particular on the elimination of the staircasing effect.

### 6.1 Split Bregman for \(\mathrm {L}^{2}\)–\(\mathrm {TVL}^{p}\)

*u*, *z* and *w*. This yields the split Bregman iteration for our method:

*A* is a sparse, symmetric, positive definite and strictly diagonally dominant matrix, thus we can easily solve (6.13) with an iterative solver such as conjugate gradients or Gauss–Seidel. However, due to the zero Neumann boundary conditions, the matrix *A* can be efficiently diagonalised by the two-dimensional discrete cosine transform,

*A*. In that case, *A* has the particular structure of a block symmetric *Toeplitz-plus-Hankel* matrix with *Toeplitz-plus-Hankel* blocks and one can obtain the solution of (6.9) by three operations involving the two-dimensional discrete cosine transform [20] as follows: Firstly, we calculate the eigenvalues of *A* by multiplying (6.14) with \(e_{1}=(1,0,\cdots ,0)^{\intercal }\) from both sides and using the fact that \(W_{nm}^{\intercal }W_{nm}=W_{nm}W_{nm}^{\intercal }=I_{nm}\), we get

*p*-homogeneous analogue \((\mathcal {P}_{p-hom})\), where for certain values of *p*, e.g. \(p=2\), we can solve exactly the corresponding version of (6.11), since in that case \(w_{i}^{k+1}=\frac{\eta _{i}}{\kappa +1}\). However, we have observed empirically that there is no significant computational difference between these two methods.
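The diagonalisation of a Neumann-type system by the discrete cosine transform, mentioned above for the matrix *A*, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the model system \((cI+\mu \Delta _{N})u=b\), the constants `c` and `mu`, and all function names are hypothetical, and the eigenvalues are taken from the known closed form \(2-2\cos (\pi k/n)\) of the one-dimensional Neumann Laplacian instead of being computed through \(e_{1}\).

```python
import numpy as np


def dct2_matrix(n):
    """Orthonormal DCT-II matrix W, so that W @ W.T equals the identity."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    W = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    W[0, :] /= np.sqrt(2.0)
    return W


def neumann_laplacian_1d(n):
    """D.T @ D for the forward-difference matrix D (zero Neumann boundary)."""
    D = np.diff(np.eye(n), axis=0)  # (n-1) x n forward differences
    return D.T @ D


def solve_neumann_system(b, c, mu):
    """Solve c*U + mu*(L_n @ U + U @ L_m) = b with three DCT-type operations."""
    n, m = b.shape
    Wn, Wm = dct2_matrix(n), dct2_matrix(m)
    lam_n = 2.0 - 2.0 * np.cos(np.pi * np.arange(n) / n)  # eigenvalues of L_n
    lam_m = 2.0 - 2.0 * np.cos(np.pi * np.arange(m) / m)  # eigenvalues of L_m
    b_hat = Wn @ b @ Wm.T                                  # forward 2D DCT
    u_hat = b_hat / (c + mu * (lam_n[:, None] + lam_m[None, :]))
    return Wn.T @ u_hat @ Wm                               # inverse 2D DCT


# Small self-check on random data (illustrative sizes and constants).
rng = np.random.default_rng(0)
b = rng.standard_normal((6, 5))
u = solve_neumann_system(b, c=1.0, mu=2.5)
L_n, L_m = neumann_laplacian_1d(6), neumann_laplacian_1d(5)
residual = np.max(np.abs(1.0 * u + 2.5 * (L_n @ u + u @ L_m) - b))
print(residual)
```

The explicit transform matrices keep the sketch dependency-light; for images of realistic size one would use a fast DCT routine instead of dense matrix products.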

*p*. Note that the split Bregman algorithm is significantly slower for large values of *p*, e.g. \(p=7\), see the fourth row of Table 1, mainly due to the fixed-point iteration in the subproblem (6.19). We would like to point out that the computational time can be significantly reduced in the \(p=\infty \) case, since the corresponding subproblem is solved exactly, see [8, 36], and at the same time we can obtain results similar to the ones obtained for high values of *p*.

Value of *p* | Split Bregman (s) | Iterations | Relative error | CVX (s) |
---|---|---|---|---|
\(p=1.5\) | 3.58 | 147 | \(8.72\times 10^{-6}\) | 3433 |
\(p=2\) | 1.99 | 111 | \(9.31\times 10^{-6}\) | 193.14 |
\(p=3\) | 1.58 | 92 | \(9.49\times 10^{-6}\) | 3418 |
\(p=7\) | 2266 | 39518 | \(9.68\times 10^{-6}\) | 3532 |

### 6.2 One-Dimensional Results

*p*-homogeneous problems are equivalent modulo an appropriate rescaling of the parameter \(\beta \), see Proposition 3.2. Indeed, as described in Fig. 4, in order to obtain solutions from the purple region, it suffices to seek solutions of the 2-homogeneous problem (4.27). Recall also that these solutions are exactly the solutions obtained by solving a Huber TV problem, see Proposition 3.3. The analytical solutions are given in (4.30) and (4.32) and are compared to the numerical ones in Fig. 7, where we observe that they coincide. We also verify the equivalence between the 1-homogeneous and 2-homogeneous problems where \(\alpha \) is fixed and \(\beta \) is obtained from Proposition 3.2, see Fig. 7c.

*p*, focusing on the structure of the solutions as *p* increases. In order to compare the solutions for different values \(p\in (1,\infty )\), we fix the parameter \(\alpha \) and choose appropriate values of \(\beta \). Since we are mainly interested in non-ROF solutions, we choose \(\alpha \) and \(\beta \) so that they belong to the purple region of Fig. 4, i.e. \(\beta <(\frac{2L}{q+1})^{\frac{1}{q}}\alpha \) and \(\beta <\frac{h}{2} (\frac{2L^{q+1}}{q+1})^{\frac{1}{q}}\). We consider \(p\in \{\frac{4}{3}, \frac{3}{2}, 2, 3, 4, 10\}\) and, in order to get solutions that preserve the discontinuity, we set \(\beta =\{72, 140, 430, 1350, 2400, 6800\}\), respectively, with fixed \(\alpha =20\), see Fig. 8a. In order to obtain continuous solutions, we set \(\alpha =60\) and \(\beta =\{50, 110, 430, 1700, 3000, 9500\}\), see Fig. 8b. We observe that for \(p=\frac{4}{3}\), the solution has a similar behaviour to \(p=2\), but with a steeper gradient at the discontinuity point, and the solution becomes almost constant near the boundary of \(\Omega \). On the other hand, as we increase *p*, the slope of the solution near the discontinuity point reduces and it becomes almost linear with a relatively small constant part near the boundary.
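The two inequalities describing the non-ROF (purple) region can be evaluated directly for a given *p*. The helper below is a small sketch of that computation; the function names are hypothetical, and the values of *L* and *h* in the usage lines are placeholders chosen for illustration, since the data parameters of the experiment are not restated here.

```python
def conjugate_exponent(p):
    """Conjugate exponent q of p, i.e. 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


def non_rof_thresholds(alpha, p, L, h):
    """Upper bounds on beta from the two inequalities quoted in the text."""
    q = conjugate_exponent(p)
    bound_from_alpha = (2.0 * L / (q + 1.0)) ** (1.0 / q) * alpha
    bound_from_data = 0.5 * h * (2.0 * L ** (q + 1.0) / (q + 1.0)) ** (1.0 / q)
    return bound_from_alpha, bound_from_data


def in_non_rof_region(alpha, beta, p, L, h):
    """True if beta lies strictly below both thresholds."""
    bound_from_alpha, bound_from_data = non_rof_thresholds(alpha, p, L, h)
    return beta < bound_from_alpha and beta < bound_from_data


# Placeholder data parameters (assumed for this example only).
L_half_width, jump_height = 1.0, 1.0
bounds_p2 = non_rof_thresholds(alpha=1.0, p=2.0, L=L_half_width, h=jump_height)
print(bounds_p2, in_non_rof_region(1.0, 0.3, 2.0, L_half_width, jump_height))
```

Such a check is convenient when scanning several values of *p* with a fixed \(\alpha \), because both thresholds depend on *p* through the conjugate exponent *q*.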

*p* motivates us to examine the case of piecewise affine data *f* defined as

(*v*, *u*) and \((w,\overline{u})\) are the solutions of (5.1) and \((\mathcal {P})\), respectively. We also compare the decomposed parts *u*, *v* for \(p=\frac{4}{3}\) and \(p=10\). In order to have a reasonable comparison of the corresponding solutions, the parameters \(\alpha , \beta \) are selected such that the residual is the same for both values of *p*. As we observe, the *v* decomposition with \(p=\frac{4}{3}\) exhibits some *flatness* compared to \(p=2\), compare Figs. 10b and 11a. On the other hand, for \(p=10\), the *v* component consists again of almost affine structures, Fig. 11b. Notice that in both cases the *v* components are continuous. This is expected since in dimension one, we have \(\mathrm {W}^{1,p}(\Omega )\subset C(\overline{\Omega })\) for every \(1<p<\infty \).

### 6.3 Two-Dimensional Results

In this section we consider the two-dimensional case where \(u\in \mathbb {R}^{n\times m}\), \(w\in (\mathbb {R}^{n\times m})^2\) with \(m>1\) and \(\Omega \) is a rectangular image domain. We focus on image denoising tasks and on eliminating the staircasing effect for different values of *p*.

*Peak Signal to Noise Ratio* (PSNR) and the *Structural Similarity Index* (SSIM), see [41] for the definition of the latter. In each case, the values of \(\alpha \) and \(\beta \) are selected appropriately for optimal PSNR and SSIM. We use here the split Bregman algorithm as described in Sect. 6.1. Our stopping criterion is the relative residual error becoming less than \(10^{-6}\), i.e.

Observe that the best reconstructions in terms of the PSNR have no visual difference for \(p=\frac{3}{2}\), 2 and 3 and staircasing is present, Fig. 13a–c. This is one more indication that the PSNR, which is based on the squares of the pointwise differences between the ground truth and the reconstruction, does not correspond to the optimal visual results. The best reconstructions in terms of SSIM are visually better. They exhibit significantly reduced staircasing for \(p=\frac{3}{2}\) and \(p=3\), and staircasing is essentially absent in the case of \(p=2\), see Fig. 13d–f.
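For reference, the dependence of the PSNR on squared pointwise differences can be seen from its usual definition, \(10\log _{10}(\mathrm {peak}^{2}/\mathrm {MSE})\). The sketch below implements this common form for images stored as nested lists; the `peak` value of 1 and the tiny test image are assumptions for illustration, and the normalisation used in the experiments may differ. SSIM is not covered here.

```python
import math


def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_error = 0.0
    count = 0
    for row_ref, row_rec in zip(reference, reconstruction):
        for r, x in zip(row_ref, row_rec):
            squared_error += (r - x) ** 2
            count += 1
    mse = squared_error / count          # mean of squared pointwise differences
    if mse == 0.0:
        return float("inf")              # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Illustrative 2x2 example: every pixel is off by 0.1.
ground_truth = [[0.0, 1.0], [1.0, 0.0]]
reconstruction = [[0.1, 1.1], [1.1, 0.1]]
psnr_value = psnr(ground_truth, reconstruction)
print(psnr_value)
```

Because only the mean squared error enters, two reconstructions with the same pointwise error magnitude receive the same PSNR regardless of whether the error appears as staircasing or as smooth deviation.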

*p*, almost affine structures are promoted (see the middle row profiles in Fig. 14) and on the other hand these choices of \(\alpha , \beta \) produce a serious loss of contrast that however can be easily treated via the *Bregman iteration* that we briefly discuss next.

*p*, see Fig. 17. In fact, as we increase *p*, we obtain results that preserve the spike in the centre of the circle, see the corresponding middle row slice in Fig. 17d. This provides us with another motivation to examine the \(p=\infty \) case in [8]. The loss of contrast can again be treated using the Bregman iteration (6.23). The best results of the latter in terms of SSIM are presented in Fig. 18, for \(p=2\), 4 and 7 and they are also compared to the corresponding Bregman iteration version of \(\mathrm {TGV}\). We observe that we can obtain reconstructions that are visually close to the \(\mathrm {TGV}\) ones and in fact notice that for \(p=7\), the spike in the centre of the circle is better reconstructed compared to \(\mathrm {TGV}\), see also the surface plots in Fig. 19.

## 7 Conclusion

We have introduced a novel first-order, one-homogeneous \(\mathrm {TV}\)–\(\mathrm {L}^{p}\) infimal convolution type functional suitable for variational image regularisation. The \(\mathrm {TVL}^{p}\) functional constitutes a very general class of regularisation functionals exhibiting diverse smoothing properties for different choices of *p*. In the case \(p=2\) the well-known Huber \(\mathrm {TV}\) regulariser is recovered.

We studied the corresponding one-dimensional denoising problem focusing on the structure of its solutions. We computed exact solutions of this problem for the case \(p=2\) for simple one-dimensional data. Hence, as an additional novelty in our paper we presented exact solutions of the one-dimensional Huber \(\mathrm {TV}\) denoising problem.

Numerical experiments for several values of *p* indicate that our model leads to an elimination of the staircasing effect. We show that we can further enhance our results by increasing the contrast via a Bregman iteration scheme, thus obtaining results of similar quality to those of \(\mathrm {TGV}\). Furthermore, as *p* increases the structure of the solutions changes from piecewise smooth to piecewise linear and the model, in contrast to \(\mathrm {TGV}\), is capable of preserving sharp spikes in the reconstruction. This observation motivates a more detailed study of the \(\mathrm {TVL}^{p}\) functional for large *p* and in particular for the case \(p=\infty \).

This concludes the first part of the study of the \(\mathrm {TVL}^{p}\) model for \(p< \infty \). The second part [8] is devoted to the \(p=\infty \) case. There we explore further, at both an analytical and an experimental level, the capability of the \(\mathrm {TVL}^{\infty }\) model to promote affine and spike-like structures in the reconstructed image and we discuss several applications.

## Acknowledgments

The authors would like to thank the anonymous reviewers for their interesting comments and suggestions which especially motivated our more detailed discussion on the generalised Huber total variation functional. The authors acknowledge support of the Royal Society International Exchange Award No. IE110314. This work is further supported by the King Abdullah University for Science and Technology (KAUST) Award No. KUK-I1-007-43, the EPSRC first Grant No. EP/J009539/1 and the EPSRC Grant No. EP/M00483X/1. MB acknowledges further support by ERC via Grant EU FP 7-ERC Consolidator Grant 615216 LifeInverse. KP acknowledges the financial support of EPSRC and the Alexander von Humboldt Foundation while in UK and Germany, respectively. EP acknowledges support by Jesus College, Cambridge and Embiricos Trust Scholarship.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.