1 Introduction

Segmentation is the task of partitioning an object into its constituent parts. In signal or image processing, it consists in decomposing the domain \(\Omega \) of a given input, either a signal (in which case \(\Omega =[a,b]\subset {{\mathbb {R}}}\) is an interval) or an image (\(\Omega \subset {{\mathbb {R}}}^2\)), into regions of interest. In the particular case of two-phase segmentation, the aim is to find an optimal partition into two disjoint subsets, the foreground domain \(\Omega _F\) and the background domain \(\Omega _B\), such that \(\Omega =\Omega _F \cup \Omega _B\).

After the seminal work by Mumford and Shah [17], in which the authors introduced a celebrated variational model for image segmentation, Chan and Vese rewrote it in the two-phase framework [7]. They proposed to obtain the optimal partition by minimizing the following energy functional:

$$\begin{aligned} \min _{E,c_1,c_2} \mathrm{Per} (E;\Omega )+\lambda _1 \int _E (c_1-f)^2\, \mathrm{d}x+\lambda _2\int _{\Omega {\setminus } E} (c_2-f)^2\,\mathrm{d}x \end{aligned}$$
(1.1)

among all sets of finite perimeter \(E\subset \Omega \) and all constants \(c_1,c_2\in [0,1]\), for some given parameters \(\lambda _1,\lambda _2\ge 0\). From now on, we do not distinguish between the weights of the foreground and the background, and thus, we take \(\lambda _1=\lambda _2=\lambda \). A minimizer \(E\subset \Omega \) can be considered as the foreground domain and \(\Omega {\setminus } E\) as the background domain. In this case, the constants \(c_1\) and \(c_2\) turn out to be the averages of f over E and over \(\Omega {\setminus } E\), respectively. The authors proposed in [7] an iterative two-step algorithm for finding the minimizers of the energy based on the level-set formulation developed by Osher and Sethian [18]. Basically, after initializing the constants, the first step consists in finding the minimizing set for these fixed constants as a steady-state solution of the \(L^2\)-gradient flow of the functional associated with the level-set formulation. Then, one recomputes the constants and returns to the first step until convergence is reached.
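The two-step alternation just described can be caricatured in a few lines. The sketch below is our own illustration, not the scheme of [7]: the level-set gradient flow of the first step is replaced by a pointwise assignment of each sample to the closer constant (which amounts to dropping the perimeter term), but the alternation between the two steps is the same.

```python
import numpy as np

def two_step_segmentation(f, n_iter=50):
    """Caricature of the two-step alternation: step 1 assigns each
    sample to the closer constant (the perimeter term is dropped),
    step 2 recomputes the constants as foreground/background means."""
    c1, c2 = f.max(), f.min()                              # initialize the constants
    for _ in range(n_iter):
        u = ((c1 - f) ** 2 < (c2 - f) ** 2).astype(float)  # step 1: pointwise assignment
        if u.sum() in (0, len(f)):                         # degenerate partition
            break
        c1, c2 = f[u == 1].mean(), f[u == 0].mean()        # step 2: recompute constants
    return u, c1, c2

# a noiseless two-level signal on (0, 1)
f = np.where(np.linspace(0, 1, 100) > 0.4, 0.9, 0.1)
u, c1, c2 = two_step_segmentation(f)
```

On this toy signal, the alternation recovers the foreground \(\{f=0.9\}\) with \(c_1\approx 0.9\) and \(c_2\approx 0.1\) after a single sweep.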

The main problem with this algorithm is that the energy functional is not convex. Therefore, the gradient descent scheme is prone to get stuck at critical points other than global minima. This issue was fixed by Chan, Esedoglu, and Nikolova, who proved that minimizers of Chan–Vese’s level-set functional with fixed constants (i.e., solutions to the first step) are solutions to the following constrained convex minimization problem (and vice versa):

$$\begin{aligned} \min _{\{ 0\le u\le 1\}} |Du|(\Omega )+\lambda \int _\Omega \left( u(c_1-f)^2+(1-u)(c_2-f)^2\right) \, \mathrm {d}x. \end{aligned}$$
(1.2)

Observe that, if the solution u is the characteristic function of a set of finite perimeter, \(u=\chi _E\), then the energy in (1.2) coincides with that in (1.1).

There are still two main problems. The first one remains the nonconvex nature of the original energy functional: convergence of the algorithm to a global minimum is not ensured, and it heavily depends on the initialization. Moreover, it is not known whether Chan–Vese’s algorithm (with the Chan–Esedoglu–Nikolova modification) could lead to non-binary solutions (see [9]).

The main objective of this work is to give another approach to the original Chan–Vese minimizers in the simplest possible case, the one-dimensional one, which corresponds to signal segmentation. In this context, Chan–Vese’s algorithm was already proposed in [8] and has been used in several works (see [15] or [16]). Our starting point is the functional appearing in Problem (1.2). We aim at minimizing simultaneously in the function u and the constants \(c_1,c_2\). To do that, we introduce an augmented Lagrangian version of the functional, coupled with the constraint \(0\le u\le 1\). In this new functional, we replace the constants by BV functions while highly penalizing their variation. For any \(\varepsilon >0\), we define the functional \(F_\varepsilon : (L^2(0,1))^3\rightarrow [0,+\infty [\) by letting

$$\begin{aligned} F_\varepsilon (u,v_1,v_2)=&|Du|(\Omega ) + \frac{1}{\varepsilon }\left( |Dv_1|(\Omega ) + |Dv_2|(\Omega )\right) + \lambda \int _\Omega u\left( v_1 - f\right) ^2\,\mathrm {d}x\nonumber \\&+ \lambda \int _\Omega (1-u)\left( v_2 - f\right) ^2\,\mathrm {d}x + \int _\Omega {\mathbb {I}}_{[0,1]}(u)\,\mathrm {d}x, \end{aligned}$$
(1.3)

where \({\mathbb {I}}_{[0,1]}(\cdot )\) denotes the indicator function of the interval [0, 1], that is

$$\begin{aligned} {\mathbb {I}}_{[0,1]}(x)=\left\{ \begin{array} {c@{\qquad }c} 0 &{} \mathrm{if \ } x\in [0,1] \\ +\infty &{} \mathrm{otherwise. \ } \end{array}\right. \end{aligned}$$

The second term penalizes the variation of the pair of functions \((v_1, v_2)\). Observe that, letting \(\varepsilon \rightarrow 0\), we force \(v_1,v_2\) to become constants. With this addition, the functional \(F_\varepsilon \) fails to be convex. However, we can use standard PDE methods to obtain some features of the set of minimizers via the corresponding system of Euler–Lagrange equations.

In particular, we prove the following results in the case that \(\Omega \) is an interval of \({{\mathbb {R}}}\):

Theorem 1.1

Let \(\varepsilon <{\frac{1}{4\lambda }}\). If \((u,v_1,v_2)\) is a minimizer of \(F_\varepsilon \), then \(v_1,v_2\) are constants.

To characterize the first component of the minimizer, we need to assume further that the datum is not too oscillatory in the following sense:

(H) \({f\in BV(0,1)}\) satisfies that for every \(c\in (0,1)\)

$$\begin{aligned} {\mathcal {L}}^1(\partial \{x\in (0,1){\setminus } J_f : f(x) = c \}) = 0. \end{aligned}$$

Observe that with this assumption, we exclude some pathologies on the data such as having a fat Cantor set as a level set.

Theorem 1.2

Given \(\varepsilon <{\frac{1}{4\lambda }}\) and \(f\in BV(\Omega )\) satisfying (H), any minimizer \((u,v_1,v_2)\) of \(F_\varepsilon \) is independent of \(\varepsilon \), and either u is constant or \(u(\Omega )\subset \{0,1\}\); i.e., u is a binary function in \(BV(\Omega )\).

With this last result, we can show that the set of minimizers of Chan–Vese’s problem (1.1) coincides with the set of minimizers of \(F_\varepsilon \) (which is independent of \(\varepsilon \) under the size condition expressed above). As a by-product of our analysis, we obtain two important properties of solutions to (1.1):

(a):

The jump set of any solution is concentrated in the topological boundary of a single level set (in a multivalued sense; see (4.2) for the precise statement).

(b):

If f is piecewise constant, then the jump set of any solution is contained in the jump set of f.

These two properties, though quite intuitive in the one-dimensional setting, were not known in the literature, to the best of our knowledge. Moreover, property (a) allows one to build a trivial algorithm to find the minimizers of Chan–Vese’s problem in the one-dimensional case.
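Property (b) already suggests a naive exhaustive search, distinct from the algorithm of Sect. 4.2: when f is piecewise constant, candidate minimizers only need to jump where f jumps, so one may enumerate the binary labelings that are constant on the pieces of f and evaluate a discretization of the energy (1.1). The following sketch (our own naming, with the perimeter discretized as the number of jumps) illustrates this idea.

```python
import itertools
import numpy as np

def chan_vese_energy(u, f, lam, dx):
    """Discretization of (1.1) for binary u: the perimeter is the
    number of jumps of u and the constants are the optimal averages."""
    per = np.count_nonzero(np.diff(u))
    c1 = f[u == 1].mean() if u.any() else 0.0
    c2 = f[u == 0].mean() if not u.all() else 0.0
    fid = lam * dx * np.sum(u * (c1 - f) ** 2 + (1 - u) * (c2 - f) ** 2)
    return per + fid

def brute_force(f, pieces, lam, dx):
    """Enumerate binary labelings constant on each piece of f,
    as property (b) allows: jumps of u only occur at jumps of f."""
    best_e, best_u = np.inf, None
    for labels in itertools.product([0.0, 1.0], repeat=len(pieces)):
        u = np.concatenate([np.full(n, c) for n, c in zip(pieces, labels)])
        e = chan_vese_energy(u, f, lam, dx)
        if e < best_e:
            best_e, best_u = e, u
    return best_u, best_e

# piecewise constant datum with three pieces of values 0.1, 0.8, 0.2
pieces = [30, 40, 30]
f = np.concatenate([np.full(30, 0.1), np.full(40, 0.8), np.full(30, 0.2)])
u_best, e_best = brute_force(f, pieces, lam=100.0, dx=0.01)
```

Here the winning labeling segments the middle piece against the two outer ones, and its jump set is contained in \(J_f\), in accordance with property (b).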

The plan of the paper is the following: In Sect. 2, we obtain the system of PDEs that minimizers of \(F_\varepsilon \) satisfy, i.e., the corresponding Euler–Lagrange equations in this non-smooth case. In Sect. 3, we prove Theorems 1.1 and 1.2. Section 4 is devoted to the proof of properties (a) and (b). In Sect. 4.2, we explain the trivial algorithm to compute the minimizers. We finish the paper with some conclusions and with an Appendix, in which we prove the existence of minimizers of \(F_\varepsilon \) as well as an auxiliary result we need.

Notations. Throughout the paper, \(\Omega \) denotes an open bounded set in \({{\mathbb {R}}}^N\) with Lipschitz boundary and \({\mathcal {L}}^N\) denotes the Lebesgue measure in \({{\mathbb {R}}}^N\). We denote by \(L^p(\Omega )\), \(1\le p\le \infty \), the Lebesgue space of p-integrable functions with respect to \({\mathcal {L}}^N\). We use the notation \(\langle \cdot ,\cdot \rangle \) for the scalar product between two \(L^2\) functions. We denote by \(H^1(\Omega )\) the Hilbert space \(W^{1,2}(\Omega )\) and by \(H_0^1(\Omega )\) the completion in \(H^1(\Omega )\) of smooth functions with compact support in \(\Omega \). We use standard notation for functions of bounded variation (BV functions) as in [2]. In particular, given \(u\in BV(\Omega )\), we write \(u_x\), \(D^c u\) and \(D^j u\) for the absolutely continuous part of the measure Du with respect to \({\mathcal {L}}^N\), the Cantor part of Du and the jump part of Du, respectively. We use the notation \(u^\pm (x)\) for the left and right approximate limits of u at \(x\in \Omega \) (we use this convention only in the case \(\Omega =(a,b)\subset {{\mathbb {R}}}\), \(a<b\in {{\mathbb {R}}}\)), \(J_u\) for its jump set, and \(\nu ^u\) denotes the Radon–Nikodym derivative of Du with respect to |Du|. Given a set \(E\subseteq \Omega \), we say that it is a set of finite perimeter in \(\Omega \) if \(\chi _E\in BV(\Omega )\), where \(\chi _E\) denotes the characteristic function of the set E. In this case, its perimeter is defined as \(\mathrm{Per}(E;\Omega ):=|D\chi _E|(\Omega )\). Finally, unless otherwise specified, we always identify a function (in \(H^1_0(0,1)\) or in \(BV(\Omega )\)) with its precise representative.

2 System of Euler–Lagrange Equations

In this section, we derive the system of equations that minimizers of \(F_\varepsilon \) must satisfy. Although \(F_\varepsilon \) is not a convex functional in \((L^2(\Omega ))^3\), it is convex in each of its coordinates when the other two are fixed. Therefore, by standard results in convex analysis, we obtain that the Euler–Lagrange system of PDEs is the following:

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} \lambda \left( (v_1 - f)^2 - (v_2 -f)^2 \right) + \partial (\Phi +\Psi )(u)\ni 0 \\ 2\varepsilon \lambda u(v_1 -f) +\partial \Phi (v_1) \ni 0 \\ 2\varepsilon \lambda (1-u)(v_2 -f) +\partial \Phi (v_2) \ni 0, \end{aligned} \end{array}\right. } \end{aligned}$$
(2.1)

where the symbol \(\partial \) denotes the subdifferential (in \(L^2(\Omega )\)) of the following two extended real-valued convex functions:

$$\begin{aligned} \Phi (g) :=\left\{ \begin{array} {c@{\qquad }c} |Dg|(\Omega ) &{} \mathrm{if \ } g\in L^2(\Omega )\cap BV(\Omega ) \\ +\infty &{} \mathrm{if \ } g\in L^2(\Omega ){\setminus } BV(\Omega ) \end{array}\right. , \end{aligned}$$

and

$$\begin{aligned} \Psi (g) = \int _\Omega {\mathbb {I}}_{[0,1]}(g)\,\mathrm{d}x. \end{aligned}$$

We easily note that \(\partial \Psi (g) = \partial {\mathbb {I}}_{[0,1]}(g)\) a.e., for any \(g\in L^2(\Omega )\). The next result is crucial, since it permits reformulating \(\partial (\Phi + \Psi )\).

Theorem 2.1

Let \(g\in L^2(\Omega )\). Then,

$$\begin{aligned} \partial \left( \Phi + \Psi \right) (g) = \partial \Phi (g) + \partial \Psi (g). \end{aligned}$$

Proof

We point out that standard results to decompose the subdifferential of a sum, such as those in [6], cannot be applied, since the interiors of the domains of both functionals are empty. We follow the strategy of [19, Thm. 3.1], consisting in an ad hoc proof that approximates the subdifferential of the indicator function by its Yosida regularization. First of all, we state the following claim, whose proof we postpone to the Appendix.

Claim

Let \(h:{{\mathbb {R}}}\rightarrow {{\mathbb {R}}}\) be a Lipschitz nondecreasing function. Then, it holds that

$$\begin{aligned} \langle v, h(u)\rangle = |Dh(u)|(\Omega ),\quad \forall v\in \partial \Phi (u).\end{aligned}$$

Since we know that the inclusion

$$\begin{aligned} \partial \left( \Phi + \Psi \right) (u) \supseteq \partial \Phi (u) + \partial \Psi (u)\,, \qquad \forall u\in L^2(\Omega ) \end{aligned}$$

is satisfied, it is sufficient to show the converse inclusion. Let \(u\in L^2(\Omega )\) be such that \(\partial (\Phi +\Psi )(u)\ne \emptyset \), and let \(v\in \partial (\Phi +\Psi )(u)\). We define \(\varphi _u:L^2(\Omega )\rightarrow {\mathbb {R}}\cup \{+\infty \}\) by

$$\begin{aligned} \varphi _u(w) = (\Phi + \Psi )(w) + \frac{1}{2}\Vert {w}\Vert _2^2 -\langle v + u, w\rangle \,. \end{aligned}$$

We note that \(\varphi _u\) is coercive, strictly convex and lower semicontinuous. Then, it is easy to see that u is the unique minimizer of \(\varphi _u\).

Now, for any \(0< \eta <1\), let \(\beta _ \eta \) be the Yosida regularization of \(\partial {\mathbb {I}}_{[0,1]}\), that is,

$$\begin{aligned} \beta _\eta (t) = \frac{(t-1)_+ - t_-}{\eta }\,, \qquad \forall t\in {\mathbb {R}}; \end{aligned}$$

here, the subscripts ± denote, respectively, the positive and the negative part of the function. We next consider \(\gamma : L^2(\Omega )\rightarrow {\mathbb {R}}\cup \{+\infty \}\) defined by

$$\begin{aligned} \gamma (w) = { \Phi (w) +\frac{1}{2}\Vert w\Vert _2^2-\langle v+u,w\rangle } + \int _\Omega \overline{\beta _\eta }(w)\,\mathrm{d}x\,, \end{aligned}$$

where \(\overline{\beta _\eta }\) is a primitive of \(\beta _\eta \). We note that \(\gamma \) is coercive, strictly convex over \(\text {Dom}(\Phi + \Psi )\) and lower semicontinuous. Thus, \(\gamma \) has a unique minimizer \(u_\eta \) which satisfies the corresponding Euler–Lagrange equation

$$\begin{aligned} -\beta _\eta (u_\eta ) - u_\eta + (v + u)\in \partial \Phi (u_\eta )\,. \end{aligned}$$

In consequence, we know that the following equation has a unique solution:

$$\begin{aligned} v_\eta + \beta _\eta (u_\eta ) + u_\eta = v + u\,, \qquad \text {where } {v_\eta \in \partial \Phi (u_\eta )}\,. \end{aligned}$$

Multiplying both sides of the previous equation by \(\beta _\eta (u_\eta )\), we have, for any \(0<\eta <1\),

$$\begin{aligned} \langle v_\eta , \beta _\eta (u_\eta ) \rangle + \Vert \beta _\eta (u_\eta )\Vert _2^2 + \langle u_\eta , \beta _\eta (u_\eta ) \rangle = \langle v + u, \beta _\eta (u_\eta )\rangle . \end{aligned}$$

Here, since \(t\beta _\eta (t)\ge 0\) for any \(t\in {\mathbb {R}}\), we note that \(\langle u_\eta , \beta _\eta (u_\eta )\rangle \ge 0\). Moreover, by the previous Claim, we have that

$$\begin{aligned} \langle v_\eta , \beta _\eta (u_\eta )\rangle \ge 0\,. \end{aligned}$$

Then, we see that \(\{\beta _\eta (u_\eta ) \}_\eta \) and \(\{ v_\eta \}_\eta \) are bounded in \(L^2(\Omega )\), and \(\{u_\eta \}_\eta \) is bounded in \(L^2(\Omega )\) and in \(BV(\Omega )\). Therefore, there exist a sequence \(\{\eta _n \}_{n}\subset (0,1)\) with \(\eta _n\rightarrow 0\) and a function \(u_0\in L^2(\Omega )\), such that

$$\begin{aligned} u_{\eta _n}{\mathop {\longrightarrow }\limits ^{n\rightarrow \infty }} u_0 \quad \text {in } L^1(\Omega )\,. \end{aligned}$$

Moreover, we can assume that there exist \(v_0, \xi \in L^2(\Omega )\), such that \(u_{\eta _n}, v_{\eta _n}\) and \(\beta _{\eta _n}(u_{\eta _n})\) converge weakly in \(L^2(\Omega )\) to \(u_0,v_0\) and \(\xi \), respectively.

In addition, we note that

$$\begin{aligned} \frac{1}{\eta }(u_\eta -1)_+ \le |\beta _\eta (u_\eta )|\quad \text {and}\quad \frac{1}{\eta }(u_\eta )_-\le |\beta _\eta (u_\eta )|\,, \end{aligned}$$

for any \(0<\eta <1\), and thus, we have

$$\begin{aligned} (u_\eta -1)_+\rightarrow 0 \quad \text {and} \quad (u_\eta )_- \rightarrow 0 \quad \text {in } L^2(\Omega ) \text { as }\eta \rightarrow 0\,. \end{aligned}$$

Hence, we can show that \(u_0 (x)\in [0,1]\) a.e. in \(\Omega \). We conclude that

$$\begin{aligned} {\left\{ \begin{array}{ll} &{}\xi \in \partial {\mathbb {I}}_{[0,1]}(u_0)\quad \text {a.e. in }\Omega ,\\ &{}v_0 \in \partial \Phi (u_0) \ \ \text {and} \ \ v_0 + \xi + u_0 = v + u \ \text {in }L^2(\Omega )\,. \end{array}\right. } \end{aligned}$$

This implies that \(u_0\) is a minimizer of \(\varphi _u\). Then, \(u_0 = u\), and consequently, \(v = \xi + v_0\) in \(L^2(\Omega )\), which finishes the proof. \(\square \)
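For the reader's convenience, the Yosida regularization \(\beta _\eta \) used in the proof can be written down explicitly; the following minimal numerical sketch (our own illustration, not part of the proof) also checks the monotonicity \(t\beta _\eta (t)\ge 0\) used above.

```python
def beta(t, eta):
    """Yosida regularization of the subdifferential of the indicator
    of [0, 1]: beta_eta(t) = ((t - 1)_+ - t_-) / eta.
    It vanishes on [0, 1] and is linear with slope 1/eta outside."""
    return (max(t - 1.0, 0.0) - max(-t, 0.0)) / eta

# monotonicity used in the proof: t * beta_eta(t) >= 0 for all t
assert all(t * beta(t, 0.1) >= 0.0 for t in [-0.5, 0.0, 0.3, 1.0, 1.7])
```

As \(\eta \rightarrow 0\), the penalty becomes infinitely steep outside [0, 1], recovering \(\partial {\mathbb {I}}_{[0,1]}\) in the limit.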

Up to this point, we have worked without imposing any restriction on the dimension of the domain. Now, we introduce the characterization of the subdifferential of the total variation in \(L^2(\Omega )\), proposed by Andreu, Ballester, Caselles, and Mazón in [3] (see also [4]), in the specific case of the domain being a one-dimensional interval, which we take as (0, 1) without loss of generality (see [10] for a proof).

Theorem 2.2

(TV characterization in 1D) Let u be in BV(0, 1), such that \(\partial \Phi (u) \ne \emptyset \). Then, \(v\in \partial \Phi (u)\) if and only if there exists \(\varvec{z_u}\in H_0^1 (0,1)\), such that \(|\varvec{z_u}|\le 1\) a.e. and

$$\begin{aligned} v = -(\varvec{z_u})_x\,, \qquad |Du|=\varvec{z_u}\cdot Du, \end{aligned}$$

where the measure \(\varvec{z_u}\cdot Du\in {\mathcal {M}}(0,1)\) is defined as

$$\begin{aligned} (\varvec{z_u}\cdot Du) (U):= & {} \int _U \varvec{z_u}\cdot u_x \,\mathrm{d}x+\int _U {\varvec{z_u}}\cdot \nu ^u \,\mathrm{d} |D^c u| \\&+ \sum _{x\in J_u\cap U} (u^+(x)-u^-(x)) {\varvec{z_u}}(x)\cdot \nu ^u(x), \end{aligned}$$

for any Borel set \(U\subset (0,1)\).
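As a sanity check of this characterization, consider the step function \(u=\chi _{(1/2,1)}\) (this worked example is ours, not part of [10]). One admissible choice is

$$\begin{aligned} \varvec{z_u}(x)={\left\{ \begin{array}{ll} 2x &{} \mathrm{if \ } x\in (0,1/2] \\ 2(1-x) &{} \mathrm{if \ } x\in (1/2,1), \end{array}\right. } \qquad v=-(\varvec{z_u})_x, \end{aligned}$$

so that \(\varvec{z_u}\in H^1_0(0,1)\) and \(|\varvec{z_u}|\le 1\) a.e. Since \(u_x=0\), \(D^cu=0\), \(J_u=\{1/2\}\) and \(\nu ^u(1/2)=1\), we get

$$\begin{aligned} (\varvec{z_u}\cdot Du)((0,1)) = (u^+(1/2)-u^-(1/2))\,\varvec{z_u}(1/2)\cdot \nu ^u(1/2) = 1 = |Du|((0,1)), \end{aligned}$$

whence \(v\in \partial \Phi (u)\).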

With this characterization in mind, the system of Euler–Lagrange equations can be rephrased in the following way:

Proposition 2.3

Let \((u,v_1,v_2)\) be a minimizer of \(F_\varepsilon \). Then, there exist \(\varvec{z_u}\), \(\varvec{z_{v_1}}\), \(\varvec{z_{v_2}}\in H^1_0(0,1)\), corresponding to \(\partial \Phi (u)\), \(\partial \Phi (v_1)\) and \(\partial \Phi (v_2)\), respectively, as given by Theorem 2.2, and \(g\in \partial {\mathbb {I}}_{[0,1]}(u)\), such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} (\varvec{z_u})_x =\lambda \left( (v_1 - f)^2 - (v_2 -f)^2 \right) + g \\ (\varvec{z_{v_1}})_x= 2\varepsilon \lambda u(v_1 -f) \\ (\varvec{z_{v_2}})_x =2\varepsilon \lambda (1-u)(v_2 -f)\,. \end{aligned} \end{array}\right. } \end{aligned}$$
(2.2)

3 Proofs of the Main Results

3.1 Proof of Theorem 1.1

We first need to show the following auxiliary result.

Lemma 3.1

(Behaviour of \(\varvec{z_u}\)) Let \(u\in BV(0,1)\) and \(\varvec{z_u}\in H^1_0(0,1)\), corresponding to \(\partial \Phi (u)\) as provided by Theorem 2.2. Then,

$$\begin{aligned} |\varvec{z_u}| = 1\,, \quad |Du|\text {-a.e.} \end{aligned}$$

Proof

We decompose both measures |Du| and \(\varvec{z_u}\cdot Du\) in the following way:

$$\begin{aligned} \qquad \varvec{z_u}\cdot Du&= {(\varvec{z_u}\cdot u_x)} {\mathcal {L}}^1 + \varvec{z_u}\cdot D^ju + \varvec{z_u}\cdot D^cu \\ |Du|&= |u_x|{\mathcal {L}}^1 + |{ D^j}u| + |D^cu|. \end{aligned}$$

Since the terms in each decomposition are mutually singular, we have

$$\begin{aligned} \varvec{z_u} = {\left\{ \begin{array}{ll} \quad \displaystyle \frac{u_x}{|u_x|}\,,&{} \quad |u_x|{\mathcal {L}}^1\text { -a.e}\\ \quad \displaystyle \frac{u^+ - u^-}{|u^+ - u^-|}\,,&{} \quad |D^j u|\text { -a.e}\\ \quad \displaystyle \frac{D^c u}{|D^c u|}\,,&{} \quad |D^c u| \text { -a.e}, \end{array}\right. } \end{aligned}$$

and thus, we have \(|\varvec{z_u}| = 1\), \(|Du|\)-a.e. \(\square \)

Proof of Thm. 1.1

First, we note that if \((u,v_1,v_2)\) is a minimizer of \(F_\varepsilon \), then all variables take values in [0, 1] a.e. in (0, 1), as shown in Lemma A.1 in the Appendix. Suppose that there exists a Borel set \(U\subset (0,1)\), such that \(Dv_1 (U)\ne 0\). By Lemma 3.1, we know that there is \(x_1\in U\), such that \(\varvec{z_{v_1 }}(x_1)\in \{-1,+1\}\), where \(\varvec{z_{v_1}}\) is given by Proposition 2.3. Then, by (2.2)\(_2\), we have the following inequality:

$$\begin{aligned} \quad 1 = \left| \int _0^{x_1} (\varvec{z_{v_1}})_x \,\mathrm{d}x\right| = \,&\,2\varepsilon \lambda \left| \int _0^{x_1} u(v_1 - f)\,\mathrm{d}x \right| \\ \le \,&\, 2\varepsilon \lambda \int _0^1 |u||v_1 - f|\,\mathrm{d}x \le 4\varepsilon \lambda , \end{aligned}$$

and, in consequence, \(\varepsilon \ge 1/(4\lambda )\), which contradicts the hypothesis.

To get that \(v_2\) is also constant, we apply the same argument to (2.2)\(_3\).

\(\square \)

Under the size condition \(\varepsilon <\frac{1}{4\lambda }\), which we assume from now on, we integrate the second and third equations of (2.2) over (0, 1) and obtain

$$\begin{aligned} v_1 = \frac{\displaystyle \int _0^1 u f\,\mathrm{d}x }{\displaystyle \int _0^1 u\,\mathrm{d}x}\,, \qquad v_2 = \frac{\displaystyle \int _0^1 (1-u) f \,\mathrm{d}x}{\displaystyle \int _0^1 (1-u)\,\mathrm{d}x}, \end{aligned}$$
(3.1)

in the case that u is not the constant function 0 or 1. If \(u\equiv 0\) (resp. \(u\equiv 1\)), then \(v_1\) (resp. \(v_2\)) can be any constant value in (0, 1). We finally note that, \(v_1,v_2\) being constants, the energy functional does not depend on \(\varepsilon \). We rename the constants as \(c_1,c_2\) for consistency and remove the \(\varepsilon \)-dependence by letting

$$\begin{aligned} F(u,c_1,c_2) =&|Du|((0,1)) + \int _0^1 {\mathbb {I}}_{[0,1]} (u)\,\mathrm{d}x\, \\&+\lambda \int _0^1 (u(c_1-f)^2+(1-u)(c_2-f)^2)\,\mathrm{d}x\,. \end{aligned}$$
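Formulas (3.1) translate directly into code; a minimal sketch with our own naming, valid when u is not identically 0 or 1 (in which case one of the constants is arbitrary, as noted above):

```python
import numpy as np

def optimal_constants(u, f):
    """Constants from (3.1): averages of f weighted by the (possibly
    non-binary) foreground u and the background 1 - u."""
    c1 = np.sum(u * f) / np.sum(u)
    c2 = np.sum((1.0 - u) * f) / np.sum(1.0 - u)
    return c1, c2

f = np.array([0.2, 0.2, 0.7, 0.9])
u = np.array([0.0, 0.0, 1.0, 1.0])
c1, c2 = optimal_constants(u, f)
```

For the binary u above, \(c_1\) is the mean of f over \(\{u=1\}\) and \(c_2\) the mean over \(\{u=0\}\), exactly the Chan–Vese constants.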

3.2 Proof of Theorem 1.2

This section is devoted to proving that the first coordinate of the minimizer is necessarily a binary BV function. Hereinafter, we assume that the datum f satisfies (H). The proof is done in two steps. First of all, we show that if \((u,c_1,c_2)\) is a minimizer, then there is a “quasi–piecewise constant” competitor \({\overline{u}}\) with energy lower than or equal to that of u. Then, using the PDE system (2.2), we prove that the competitor cannot be a minimizer if it is not binary.

We start by defining our concept of quasi–piecewise constant function.

Definition 3.2

We say that \(u\in BV(0,1)\) is quasi–piecewise constant if there exist a piecewise constant function \(u_s\) and an a.e. binary function \(u_b\), not simultaneously nonzero, such that

$$\begin{aligned} u(x) = u_s(x) + u_b(x)\,, \quad \text {a.e. }x \in (0,1)\,. \end{aligned}$$

Theorem 3.3

Let \(c_1,c_2\in [0,1]\). Given u a minimizer of \(F(\cdot ,c_1,c_2)\), there exists a quasi–piecewise constant \({\overline{u}}\in BV(0,1)\) satisfying

$$\begin{aligned} F({\overline{u}},c_1, c_2) \le F (u, c_1, c_2)\,. \end{aligned}$$
(3.2)

Proof

Let \(x_0\) be in (0, 1), such that \(u(x_0)\notin \{0,1\}\) and \(x_0\notin J_{u}\), and let \(\varvec{z_{u}}\in H_0^1(0,1)\) be the vector field associated with \(\partial \Phi (u)\) as given by Theorem 2.2. We distinguish two cases:

  1. (i)

    The case \(x_0\in (|\varvec{z_{u}}|)^{-1}(\{1\})\): We assume without loss of generality that \(\varvec{z_{u}}(x_0)=1\) (for \(\varvec{z_{u}}(x_0)=-1\), the argument is analogous). Under this assumption, we know that u is continuous at \(x_0\) (remember that we always identify a BV function with its precise representative), and thus, \({u} \notin \{0,1\}\) in a neighbourhood of \(x_0\), denoted by \({\mathcal {E}}_{x_0}\). Hence, using (2.2)\(_1\), we know that

    $$\begin{aligned} (\varvec{z_{u}})_x = \lambda \left( (c_1 - f)^2 - (c_2 - f)^2\right) \qquad \text {in } {\mathcal {E}}_{x_0}. \end{aligned}$$
    (3.3)

    Since \(f\in BV(0,1)\), we have by the above expression that \((\varvec{z_{u}})_x\in BV({\mathcal {E}}_{x_0})\), and thus, its lateral traces \((\varvec{z_{u}})_x^+(x_0)\) and \((\varvec{z_{u}})_x^-(x_0)\) are well defined. In addition, since \(\varvec{z_u}(x_0)=1\) and \(\Vert \varvec{z_u}\Vert _{{L^\infty (\Omega )}} \le 1\), it follows that

    $$\begin{aligned} (\varvec{z_{u}})_x^+ (x_0)\le 0 \le (\varvec{z_{u}})_x^- (x_0), \end{aligned}$$

    and we have by (3.3) that

    $$\begin{aligned} f^-(x_0) \le \frac{c_1 + c_2}{2} \le f^+(x_0)\,. \end{aligned}$$

    Then, we note that the set

    $$\begin{aligned} \left\{ x\in (0,1)\backslash (J_u\cup J_f) : |\varvec{z_{u}}(x)| = 1 \wedge {u}(x)\in (0,1) \right\} \end{aligned}$$

    is a subset of the following set:

    $$\begin{aligned} A:=\left\{ x\in (0,1)\backslash J_f : {u}(x)\in (0,1) \wedge f(x) = \frac{c_1 + c_2}{2} \right\} \,. \end{aligned}$$

    We note that \(A = A^o \cup (\partial A\cap A)\) where \(A^o\) and \(\partial A\) are the (topological) interior and boundary of A, respectively. Particularly, we note that

    $$\begin{aligned} A^o = \bigcup _{k\ge 1} I_k, \end{aligned}$$

    where \(\{I_k = (a_k, b_k)\}_k\) is a disjoint collection of open intervals; and \({\mathcal {L}}^1(\partial A\cap A) = 0\) by the assumption on f and the fact that \(\partial A\cap A\subset J_f\cup \partial \{f=\frac{c_1+c_2}{2}\}\). Next, we will modify u in each interval \(I_k\) to decrease the energy. Before doing so, we point out that, for \(c_1,c_2\) fixed, minimizing F is equivalent to minimizing

    $$\begin{aligned}G(w):=&|Dw|((0,1)) +\int _0^1 {\mathbb {I}}_{[0,1]}(w)\,\mathrm{d}x \, \\&+\lambda \int _0^1 w(c_1-c_2)(c_1+c_2-2f)\,\mathrm{d}x\ .\end{aligned}$$

    Since \(I_k\subset A\), we observe that

    $$\begin{aligned} G(u\chi _{I_k})=|Du|((a_k,b_k)). \end{aligned}$$

    Therefore, if we take \(u_k:=u^+(a_k)\chi _{I_k}+ u\chi _{(0,1){\setminus } I_k}\), it is easy to show that

    $$\begin{aligned} F(u_k,c_1,c_2)\le F(u,c_1,c_2). \end{aligned}$$
  2. (ii)

    If \(x_0\notin (|\varvec{z_{u}}|)^{-1}(\{1\})\): Since \(\varvec{z_{u}}\in H^1_0(0,1)\subset C([0,1])\), we take the largest interval I containing \(x_0\), such that \(|\varvec{z_{u}}|(I)\subseteq [0,1)\). By Lemma 3.1, \(|Du|(I)=0\), and thus, u is constant in I.

Consequently, we note that \((0,1)\backslash (\partial A\cap A)\) can be decomposed as

$$\begin{aligned} \underset{\displaystyle B}{\underbrace{\left( (0,1)\cap ({u})^{-1}(\{0,1\}) \right) }} \cup \underset{\displaystyle C}{\underbrace{\left( (0,1)\cap ({u})^{-1}((0,1)) \right) }}, \end{aligned}$$

such that C is a subset of

$$\begin{aligned} \left( \bigcup _{k\ge 1} I_k\right) \cup \left( C\cap (|\varvec{z_{u}}|)^{-1}([0,1)) \right) \,. \end{aligned}$$

Defining

$$\begin{aligned} {\overline{u}}(x):={\left\{ \begin{array}{ll} \quad {u_k(x)} \quad &{} \text {if }x\in I_k \text { for any }k\\ \quad u(x) \quad &{} \text {if }x\notin I_k \text { for all }k\,, \end{array}\right. } \end{aligned}$$

we obtain that inequality (3.2) is clearly satisfied.

Note that \({\overline{u}}\) is a piecewise constant function in C because of the reasoning in (i) and (ii), and thus, we can take

$$\begin{aligned} u_s:= {\overline{u}}\chi _{C}. \end{aligned}$$

Finally, as B and C are disjoint, we obtain that \({\overline{u}}\) is a quasi-piecewise constant function. \(\square \)

We introduce now two useful remarks:

Remark 3.4

Let \((u,c_1^u,c_2^u)\) be a minimizer of F, such that \(c_1^u \le c_2^u\). Defining \(w := 1 - u\), it is easy to show that \( c_1^w = c_2^u\), \(c_2^w = c_1^u\) and that \(F(w,c_1^w, c_2^w) = F(u,c_1^u, c_2^u)\). On account of this, we assume hereinafter that \(c_2\le c_1\). Furthermore, if \(c_1^u=c_2^u\), we note that

$$\begin{aligned} F(u, c_1^u,c_2^u) = |Du|((0,1)) + \lambda \int _0^1 (c_2^u - f)^2\,\mathrm{d}x, \end{aligned}$$

and thus, to be a minimizer, u is necessarily a constant function.

Remark 3.5

Let \((u,c_1,c_2)\) be a minimizer of F and suppose that there exist \(a<b\in J_u\), such that \(u^+(a)\), \(u^-(b)\), \(u(x)\notin \{0,1\}\), for any \(x\in (a,b)\). By integrating (2.2)\(_1\) on (ab), we have that

$$\begin{aligned} \lambda \int _{a}^{b}\left( (c_1 - f)^2 - (c_2 - f)^2\right) \,\mathrm{d}x = \varvec{z_{u}}(b) - \varvec{z_{u}}(a), \end{aligned}$$

where \(\varvec{z_{u}}\) corresponds to \(\partial \Phi (u)\). Note that the right-hand side is equal to 2, \(-2\) or 0, since \(|\varvec{z_{u}}|=1\) on the jump set \(J_{u}\) as a consequence of Lemma 3.1; as \(\varvec{z_{u}}\in H^1_0(0,1)\), if \(x_0\in J_{u}\), then

  • \(\varvec{z_{u}}(x_0) = -1\) when u jumps toward a lower step in \(x_0\).

  • \(\varvec{z_{u}}(x_0) = 1\) when u jumps toward an upper step in \(x_0\).

After these remarks, we prove the following statement:

Proposition 3.6

Let \((u,c_1,c_2)\) be a minimizer of F, such that u is a quasi–piecewise constant function. Then, either u is constant or u is an a.e. binary function.

Proof

This statement is proved by contradiction. We suppose that u is neither an a.e. binary function nor constant, i.e., \(u_s\) (the piecewise constant part) has some non-binary step. Let \(a,b\in J_u\) be such that \(u_s((a,b)) = \{\beta \}\) with \(\beta \notin \{0,1\}\). In addition, let \(\alpha :=u^-(a)\) and \(\gamma :=u^+(b)\).

Since \(\alpha \ne \beta \) and \(\beta \ne \gamma \), there are three different cases to study:

  1. (i)

    \(\alpha< \beta < \gamma \) (or \(\gamma< \beta < \alpha \), resp.): We define

    $$\begin{aligned} v_\tau (x) := {\left\{ \begin{array}{ll} u(x) \quad &{}\text { if }x \notin (a,b)\\ \tau \quad &{}\text { if }x\in (a,b), \end{array}\right. } \end{aligned}$$

    where \(\tau \in (\alpha ,\beta )\) (or \(\tau \in (\gamma , \beta )\), resp.). In any case, according to Remarks 3.4 and 3.5, we can suppose that \(c_1>c_2\) and

    $$\begin{aligned} \qquad \quad \lambda \int _a^b \left( (c_1 - f)^2 - (c_2 - f)^2\right) \,\mathrm{d}x \,{= \lambda (c_1 - c_2) \int _a^b \left( c_1 +c_2 - 2f\right) \mathrm{d}x} = 0\,. \end{aligned}$$

    Since \(\lambda (c_1-c_2)>0\), we obtain

    $$\begin{aligned} \int _a^b f \,\mathrm{d}x = \left( \frac{c_1 + c_2}{2}\right) \left( b-a\right) . \end{aligned}$$

    It is clear that \(F(u,c_1, c_2) = F({ v_\tau }, c_1 , c_2)\). Then, \((v_\tau , c_1,c_2)\) is a minimizer too. Consequently, by (3.1), we have

    $$\begin{aligned} c_1 = \frac{\displaystyle \int _0^1 uf \,\mathrm{d}x}{\displaystyle \int _0^1 u\,\mathrm{d}x }&= \frac{\displaystyle \int _0^1 v_\tau { f} \,\mathrm{d}x}{\displaystyle \int _0^1 v_\tau \,\mathrm{d}x }\qquad&\text {i.e.}\\ \frac{\displaystyle \int _I uf{\,\mathrm{d}x + \beta ML}}{\displaystyle \int _{I} u{ \,\mathrm{d}x + \beta L}}&= \frac{\displaystyle \int _I u f {\,\mathrm{d}x + \tau ML} }{\displaystyle \int _I u {\,\mathrm{d}x + \tau L}}\,,&\end{aligned}$$

    where we denoted by \(I:=[0,a)\cup (b,1]\), \(M:=\frac{c_1+c_2}{2}\) and \(L:=b-a\). Then

    $$\begin{aligned} \beta M L \int _I u\,\mathrm{d}x + \tau L \int _I u{ f}\,\mathrm{d}x &=\tau M L \int _I u\,\mathrm{d}x + \beta L \int _I u{{f}}\,\mathrm{d}x \end{aligned}$$

    Since \(\beta \ne \tau \) and \(L \ne 0\), we have

    $$\begin{aligned} M \int _I u\,\mathrm{d}x &= \int _I u{f}\,\mathrm{d}x\,. \end{aligned}$$

    This yields

    $$\begin{aligned} M \left( \int _0^1 u\,\mathrm{d}x - \beta L \right) = \int _0^1 uf\,\mathrm{d}x - \beta M L \quad \text {i.e.,} \ \ M = \frac{\displaystyle \int _0^1 uf\,\mathrm{d}x }{\displaystyle \int _0^1 u \,\mathrm{d}x } = c_1\,, \end{aligned}$$

    thus leading to \(c_1 = c_2\), i.e., to a contradiction by Remark 3.4.

  2. (ii)

    \( \beta < \alpha {\le } \gamma \) (or \(\beta < \gamma { \le } \alpha \), resp.): As before, we consider

    $$\begin{aligned} v_\tau (x) := {\left\{ \begin{array}{ll} u(x) \quad &{}\text { if }x \notin (a,b)\\ \tau \quad &{}\text { if }x\in (a,b)\,, \end{array}\right. } \end{aligned}$$

    where \(\tau = \alpha \) (or \(\tau = \gamma \), resp.). In any case, according to Remarks 3.4 and 3.5, we can suppose that \(c_1>c_2\) and

    $$\begin{aligned} \lambda \int _a^b \left( (c_1 - f)^2 - (c_2 - f)^2\right) \,\mathrm{d}x = 2. \end{aligned}$$

    We observe that

    $$\begin{aligned} \int _a^b {f}\,\mathrm{d}x = ML - A, \end{aligned}$$

    where M and L are the constants defined in the previous case and \(A:=\frac{1}{\lambda (c_1 - c_2)}\). Then, it is easy to check that \(F(u,c_1, c_2) = F(v_\tau ,c_1,c_2)\). Again, \((v_\tau , c_1,c_2)\) is a minimizer, and thus, it satisfies (3.1). Repeating the reasoning of the previous case (with \(ML - A\) in place of ML), we obtain

    $$\begin{aligned} M - \frac{A}{L}= c_1, \end{aligned}$$

    and if we redo the same computations for \(c_2\), we obtain the same equation, thus leading to \(c_1 = c_2\), i.e., to a contradiction as before.

  3. (iii)

    \( \alpha { \le } \gamma < \beta \) (or \( \gamma { \le } \alpha < \beta \), resp.): In this case, we define

    $$\begin{aligned} v_\tau (x) := {\left\{ \begin{array}{ll} u(x) \quad &{}\text { if }x \notin (a,b)\\ \tau \quad &{}\text { if }x\in (a,b)\,, \end{array}\right. } \end{aligned}$$

    where \(\tau = \gamma \) (or \(\tau = \alpha \), resp.). Once again, repeating the computations in the previous case, we end up with \(c_1=c_2\), thus finishing the proof.

\(\square \)

We finish this section by pointing out that Theorem 3.3 and Proposition 3.6 already prove Theorem 1.2.

4 Properties of Minimizers

4.1 Proof of Properties (a) and (b)

In this section, we show that the set of minimizers of F coincides with the set of minimizers of the Chan–Vese functional, and we prove Properties (a) and (b) of the minimizers.

Remark 4.1

It is obvious that, given \(E\subset \Omega \) of finite perimeter,

$$\begin{aligned}\begin{aligned}F(\chi _E,c_1,c_2)&= \mathrm{Per}(E;(0,1)){ + } \lambda \int _E (c_1-f)^2\,\mathrm{d}x+ { \lambda }\int _{(0,1){\setminus } E}(c_2-f)^2\,\mathrm{d}x.\end{aligned}\end{aligned}$$

Then, we note that

$$\begin{aligned} \nonumber \min _{u\in L^2(0,1),c_1,c_2} F(u,c_1,c_2)\le & {} \min _{E,c_1,c_2} \mathrm{Per} (E;\Omega )+ {\lambda } \int _E (c_1-f)^2\, \mathrm{d}x\\&+ { \lambda }\int _{\Omega {\setminus } E} (c_2-f)^2\,\mathrm{d}x.\end{aligned}$$
(4.1)

On the other hand, by Theorem 1.2, we know that the minimum of F is achieved at a first coordinate u which is either a constant or a binary BV function. Then, \(u=\chi _E\) for a set \(E\subseteq (0,1)\) of finite perimeter (or \(u=k\in [0,1]\)), which proves the reverse inequality in (4.1). This shows that minimizers of F coincide with Chan–Vese’s minimizers.

We now prove Properties (a) and (b):

Suppose that \(x_0\in J_u\) and that \(u^-(x_0)=1\), \(u^+(x_0)=0\). Then, as explained in Remark 3.5, \({{\varvec{z}}_u}(x_0)=-1\), with \({{\varvec{z}}_u}\) corresponding to \(\partial \Phi (u)\) as given by Theorem 2.2. Moreover, since \(\partial {\mathbb {I}}_{[0,1]}(x)=]-\infty ,0]\delta _{0}+[ 0,+\infty [\delta _{1}\), by (2.2)\(_1\) we obtain, as in the proof of Theorem 3.3, that

$$\begin{aligned} f^+(x_0)\le \frac{c_1 + c_2}{2}\le f^-(x_0). \end{aligned}$$

If instead \(u^-(x_0)=0\), \(u^+(x_0)=1\), one gets

$$\begin{aligned} f^+(x_0)\ge \frac{c_1 + c_2}{2} \ge f^-(x_0). \end{aligned}$$

Therefore

$$\begin{aligned} J_u\subseteq C_{1,2}:=\{x\in (0,1) : f(x)\ni \frac{c_1+c_2}{2}\},\end{aligned}$$
(4.2)

where the image of a jump point is understood in a multivalued sense (i.e., \(f(x)=[\min \{f^-(x),f^+(x)\},\max \{f^-(x),f^+(x)\}]\)). Note that this fact almost proves Properties (a) and (b). The only remaining thing to prove is that a jump point cannot exist in the interior of \(C_{1,2}\). Suppose, by contradiction, that \(x_0\in J_u\cap C_{1,2}^o\) and let \(a<x_0\) be such that \(J_u\cap [a,x_0[=\emptyset \). Without loss of generality, we can suppose that \(u=1\) in \([a,x_0[\) and that \(u^+(x_0)=0\). Then, considering \(v:=u\chi _{(0,1){\setminus } [a,x_0[}\), we easily get that \(F(v,c_1,c_2)=F(u,c_1,c_2)\) and, therefore, \((v,c_1,c_2)\) is a minimizer too. Then, repeating the reasoning in case (ii) of Proposition 3.6, we arrive at a contradiction, which finishes the proof of Properties (a) and (b).

Remark 4.2

We note that Properties (a) and (b) of the minimizers are only expected to hold in the one-dimensional case. In fact, an easy counterexample in two dimensions is provided by \(f=\chi _E\) with \(E=[-{\frac{1}{2},\frac{1}{2}}]^2\) and \(\Omega =[-1,1]^2\). In this case, \(J_f\) is precisely the boundary of the square \([-{ \frac{1}{2},\frac{1}{2}}]^2\), while it can be proved that, for any \(\lambda >\frac{16}{3}\), the Chan–Vese minimizer is neither \(u=\chi _\Omega \) nor \(u=\chi _E\), thus showing that (a) and (b) do not hold (Fig. 1).

Fig. 1

Left: \(\partial E\) and \(\partial \Omega \) represented by solid and dashed lines, respectively. Right: Chan–Vese segmentation, where black and white colors represent 1 and 0 numerical values, respectively

Fig. 2

\(\partial E_\delta \), i.e., \(J_{v_\delta }\)

In fact, Chan–Vese’s energy for these two candidates is exactly \(\frac{3}{4}\lambda \) (for \(u=\chi _\Omega \), with \(c_1=\frac{1}{4}\)) and 4 (for \(u=\chi _E\), with \(c_1=1\), \(c_2=0\)). Therefore, for \(\lambda >\frac{16}{3}\), \(u=\chi _\Omega \) is not a minimizer. On the other hand, it is easy to modify the corners of the square so as to reduce the perimeter without changing the fidelity term of the energy too much. We modify E, calling the new set \(E_\delta \), by removing the corners with circular arcs of radius \(\delta \) tangent to every two contiguous sides of \(\partial E\) (see Fig. 2).

Then, one can check that \(v_\delta := \chi _{E_\delta }\), \(c_1=1\), \(c_2=\frac{(4-\pi )\delta ^2}{3+(4-\pi )\delta ^2}\) for \(\delta \) satisfying

$$\begin{aligned} \frac{3\delta }{3+{ (4-\pi )}\delta ^2}<\frac{2}{\lambda }, \end{aligned}$$

has strictly less energy. This fact is related to the non-calibrability of the set E with respect to the isotropic norm in the total variation (see  [1]).
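The energy comparison in this remark can be checked numerically. The following minimal sketch (the function name and interface are ours, not from the paper) evaluates the three candidate energies using the perimeter \(4-(8-2\pi )\delta \) of \(E_\delta \) and the fidelity term \(3\lambda s/(3+s)\) with \(s=(4-\pi )\delta ^2\), both of which can be derived from the construction above.

```python
import math

def chan_vese_energies(lam, delta):
    """Energies of the three candidates in Remark 4.2, where
    f = chi_E, E = [-1/2, 1/2]^2 and Omega = [-1, 1]^2."""
    # u = chi_Omega: zero perimeter, c1 = mean of f over Omega = 1/4.
    e_omega = lam * (1.0 * (1 - 0.25) ** 2 + 3.0 * 0.25 ** 2)  # = 3*lam/4
    # u = chi_E: perimeter 4, c1 = 1, c2 = 0, zero fidelity.
    e_square = 4.0
    # u = chi_{E_delta}: corners rounded by quarter circles of radius delta.
    s = (4 - math.pi) * delta ** 2          # area removed from E
    per = 4 - (8 - 2 * math.pi) * delta     # perimeter of E_delta
    c2 = s / (3 + s)                        # mean of f over Omega \ E_delta
    fid = lam * (3 * c2 ** 2 + s * (1 - c2) ** 2)   # = 3*lam*s/(3+s)
    return e_omega, e_square, per + fid
```

For instance, with \(\lambda =6>\frac{16}{3}\) and \(\delta =0.2\) (which satisfies the smallness condition above), one gets \(F(\chi _\Omega )=4.5>4\) and \(F(v_\delta )\approx 3.86<4\), so rounding the corners strictly lowers the energy.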

Therefore, if one wants to obtain results similar to Properties (a) and (b) in higher dimensions, the total variation term needs to be replaced by an anisotropic version of it, as in the case of the anisotropic Rudin–Osher–Fatemi functional, for which stability of piecewise constant functions on rectangles has been recently shown in [14] and [13]. We will investigate this issue further in a subsequent paper.

4.2 Application of the Properties

In this section, we propose a trivial way to approach the solution of the 1D Chan–Vese problem using Properties (a) and (b) of the minimizer. We will also comment on the advantages of this trivial algorithm over those based on a Gradient Descent (GD) scheme. We remark that these algorithms are applied within a 1D version of the alternating scheme proposed by Chan and Vese in [7]. Hereinafter, we will assume that the boundary of each level set of the datum f has a finite number of points.

We start with the general case. The idea is:

  1. (1)

    Take a discretization of the range, thus defining the working level sets.

  2. (2)

For each level set, compute the least-energy binary candidate whose jumps lie on the boundary of the level set.

  3. (3)

Compare the candidates and choose the one with the smallest energy.

Moreover, in the case of f being a step function, we can further simplify the previous idea thanks to the inclusion \(J_u \subseteq J_f\). This reduction is based on trying all the possible combinations of characteristic solutions whose jump set is a subset of \(J_f\).
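As an illustration, the step-function reduction can be sketched as follows, where f is encoded as a list of (length, value) segments; the function `chan_vese_step_1d` and its interface are our own choice, not part of the paper's implementation. The perimeter term counts the jumps of the binary candidate, and the constants \(c_1, c_2\) are taken as the means of f on each phase.

```python
from itertools import product

def chan_vese_step_1d(segments, lam):
    """Brute-force 1D Chan-Vese for a step function f.

    `segments` is a list of (length, value) pairs describing f. Any
    binary candidate u with jump set contained in J_f is constant on
    each segment, so we enumerate all 0/1 labellings of the segments.
    Returns (best_energy, best_labels)."""
    best_energy, best_labels = float("inf"), None
    n = len(segments)
    for labels in product((0, 1), repeat=n):
        # Perimeter term: one unit of energy per jump of u.
        per = sum(labels[i] != labels[i + 1] for i in range(n - 1))
        # Fidelity term with the optimal constants (means of f per phase).
        fid = 0.0
        for tag in (0, 1):
            phase = [(l, v) for (l, v), lb in zip(segments, labels) if lb == tag]
            mass = sum(l for l, _ in phase)
            if mass == 0:               # an empty phase contributes nothing
                continue
            mean = sum(l * v for l, v in phase) / mass
            fid += sum(l * (mean - v) ** 2 for l, v in phase)
        energy = per + lam * fid
        if energy < best_energy:
            best_energy, best_labels = energy, labels
    return best_energy, best_labels
```

For \(f=\chi _{[0.3,0.7]}\) on \([0,1]\) and \(\lambda =10\), the labelling \((0,1,0)\), i.e., \(u=f\) with \(c_1=1\), \(c_2=0\), attains the minimal energy 2.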

In Fig. 3, we sketch how to perform Step 2 in the general algorithm for a fixed level set. First, we obtain all possible jump points for the candidate minimizer u (\(a_i\), \(i\in \{1,\ldots ,m\}\)). Then, we know that the candidate minimizer takes the form

$$\begin{aligned} u=\chi _{\cup _{j\in {\mathcal {I}}}[a_j,a_{j+1}]} \quad \text {where}\quad {\mathcal {I}}\subseteq \{0, ...,m-1\}. \end{aligned}$$

Computing all the possibilities, we keep the candidate with the least energy among them. We compare its energy with that of the candidate obtained from the previous, larger level set, and the procedure is repeated until we reach the lowest level set in the discretization.
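Step 2 can be sketched for a sampled signal as follows: we threshold f at the given level to obtain the breakpoints \(a_j\), and then enumerate the binary candidates that are constant between consecutive breakpoints, keeping the cheapest one. The helper below is a minimal sketch under our own conventions (uniform grid on \([0,1]\), perimeter counted as the number of interior jumps), not the paper's implementation.

```python
from itertools import product

import numpy as np

def best_candidate_at_level(f, lam, level):
    """Step 2 of the trivial algorithm: cheapest binary candidate whose
    jumps lie on the boundary of the level set {f >= level}, for samples
    f on a uniform grid over [0, 1]."""
    n = len(f)
    dx = 1.0 / n
    mask = f >= level
    # Breakpoints of the level set split [0, 1] into maximal pieces.
    cuts = [0] + [i for i in range(1, n) if mask[i] != mask[i - 1]] + [n]
    pieces = [(cuts[k], cuts[k + 1]) for k in range(len(cuts) - 1)]
    best_energy, best_u = float("inf"), None
    for labels in product((0, 1), repeat=len(pieces)):
        u = np.empty(n)
        for (i, j), lb in zip(pieces, labels):
            u[i:j] = lb
        per = int(np.sum(u[1:] != u[:-1]))      # number of jumps of u
        fid = 0.0
        for tag in (0, 1):                      # optimal constants: phase means
            vals = f[u == tag]
            if vals.size:
                fid += float(np.sum((vals - vals.mean()) ** 2)) * dx
        energy = per + lam * fid
        if energy < best_energy:
            best_energy, best_u = energy, u
    return best_energy, best_u
```

The full algorithm then loops this routine over the discretized levels and keeps the overall best candidate.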

Fig. 3

Given the level set (illustrated by the dashed blue line), the candidates take a constant (0 or 1) value between each pair of vertical lines

To show the suitability of this trivial approach in some situations, we present the following example:

Example

Suppose we use the 1D Chan–Vese model to segment a signal whose shape is similar to that of the Weierstrass function. This kind of signal exhibits abrupt variations of its slope over the whole domain, which causes problems for GD-based algorithms. This intuitive idea can be seen in Fig. 4, where we compute an approximation to the minimizer using two different approaches: in one of them, we use the alternating Chan–Vese scheme with a GD-based method (the ADAGRAD algorithm, see [11]); in the other, we use the trivial scheme explained above.

Fig. 4

Comparison of different approaches in the segmentation of a Weierstrass type function by 1D Chan–Vese (original signal in black; in colors the result of the segmentation). Top: GD-based approach. Bottom: Trivial approach. Left: Result in [0, 1]. Right: Result zoomed in [0.1, 0.23]

We note how the GD-based approach suffers from the variations of the signal, thus affecting the performance of the alternating Chan–Vese scheme. In contrast, the trivial approach, based on Properties (a) and (b), provides an adequate approximation of the minimizer. Moreover, we note that even though we chose one particular GD-based algorithm, this type of behaviour is typical of this family of methods. Therefore, the use of the scheme presented in this paper is beneficial in situations where the application of GD (or its variants) gives a wrong segmentation of the signal, as shown in the figure above.