Convex Image Denoising via Non-convex Regularization with Parameter Selection

Journal of Mathematical Imaging and Vision

Abstract

We introduce a convex non-convex (CNC) denoising variational model for restoring images corrupted by additive white Gaussian noise. We propose the use of parameterized non-convex regularizers to effectively induce sparsity of the gradient magnitudes in the solution, while maintaining strict convexity of the total cost functional. Some widely used non-convex regularization functions are evaluated, and a new one is analyzed which allows for better restorations. An efficient minimization algorithm based on the alternating direction method of multipliers (ADMM) strategy is proposed for simultaneously restoring the image and automatically selecting the regularization parameter by exploiting the discrepancy principle. Theoretical convexity conditions are provided, for both the proposed CNC variational model and the optimization sub-problems arising in the ADMM-based procedure, which guarantee convergence to a unique global minimizer. Numerical examples are presented which indicate that the proposed approach is particularly effective and well suited for images characterized by moderately sparse gradients.

References

  1. Bioucas-Dias, J., Figueiredo, M.: Fast image recovery using variable splitting and constrained optimization. IEEE Trans. Image Process. 19(9), 2345–2356 (2010)

  2. Blake, A., Zisserman, A.: Visual Reconstruction. MIT Press, Cambridge (1987)

  3. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  4. Calvetti, D., Reichel, L.: Tikhonov regularization of large linear problems. BIT Numer. Math. 43, 263–283 (2003)

  5. Calvetti, D., Morigi, S., Reichel, L., Sgallari, F.: Tikhonov regularization and the L-curve for large, discrete ill-posed problems. J. Comput. Appl. Math. 123, 423–446 (2000)

  6. Chan, R.H., Tao, M., Yuan, X.M.: Constrained total variational deblurring models and fast algorithms based on alternating direction method of multipliers. SIAM J. Imag. Sci. 6, 680–697 (2013)

  7. Chen, P.Y., Selesnick, I.W.: Group-sparse signal denoising: non-convex regularization, convex optimization. IEEE Trans. Signal Process. 62, 3464–3478 (2014)

  8. Christiansen, M., Hanke, M.: Deblurring methods using antireflective boundary conditions. SIAM J. Sci. Comput. 30, 855–872 (2008)

  9. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)

  10. Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Adv. Comput. Math. 5, 329–359 (1996)

  11. Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. Classics in Applied Mathematics. SIAM, Philadelphia (1999)

  12. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)

  13. Glowinski, R., Le Tallec, P.: Augmented Lagrangians and Operator-Splitting Methods in Nonlinear Mechanics. SIAM, Philadelphia (1989)

  14. He, C., Hu, C., Zhang, W., Shi, B.: A fast adaptive parameter estimation for total variation image restoration. IEEE Trans. Image Process. 23(12), 4954–4967 (2014)

  15. Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization. In: Scale Space and Variational Methods in Computer Vision, vol. 9087, pp. 666–677 (2015)

  16. Lu, C.W.: Image restoration and decomposition using nonconvex nonsmooth regularisation and negative Hilbert-Sobolev norm. IET Image Process. 6(6), 706–716 (2012)

  17. Ng, M.K., Chan, R.H., Tang, W.C.: A fast algorithm for deblurring models with Neumann boundary conditions. SIAM J. Sci. Comput. 21, 851–866 (1999)

  18. Nikolova, M.: Estimation of binary images by minimizing convex criteria. In: Proceedings of the IEEE International Conference on Image Processing, vol. 2, pp. 108–112 (1998)

  19. Nikolova, M.: Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. Multiscale Model. Simul. 4(3), 960–991 (2005)

  20. Nikolova, M., Ng, M., Tam, C.: Software available at http://www.math.hkbu.edu.hk/~mng/imaging-software.html

  21. Nikolova, M., Ng, M.K., Tam, C.P.: Fast non-convex non-smooth minimization methods for image restoration and reconstruction. IEEE Trans. Image Process. 19(12), 3073–3088 (2010)

  22. Parekh, A., Selesnick, I.W.: Convex denoising using non-convex tight frame regularization. arXiv preprint arXiv:1504.00976 (2015)

  23. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  24. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis. Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)

  25. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1–4), 259–268 (1992)

  26. Selesnick, I.W., Bayram, I.: Sparse signal estimation by maximally sparse convex optimization. IEEE Trans. Signal Process. 62(5), 1078–1092 (2014)

  27. Selesnick, I.W., Parekh, A., Bayram, I.: Convex 1-D total variation denoising with non-convex regularization. IEEE Signal Process. Lett. 22(2), 141–144 (2015)

  28. Sidky, E.Y., Chartrand, R., Boone, J.M., Pan, X.: Constrained TpV-minimization for enhanced exploitation of gradient sparsity: application to CT image reconstruction. IEEE J. Transl. Eng. Health Med. 2, 1–18 (2014)

  29. Wen, Y.W., Chan, R.H.: Parameter selection for total-variation-based image restoration using discrepancy principle. IEEE Trans. Image Process. 21(4), 1770–1781 (2012)

  30. Wu, C., Tai, X.C.: Augmented Lagrangian method, dual methods, and split Bregman iteration for ROF, vectorial TV, and high order models. SIAM J. Imag. Sci. 3(3), 300–339 (2010)

  31. Zhang, X., Burger, M., Bresson, X., Osher, S.: Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imag. Sci. 3(3), 253–276 (2010)

  32. Wu, C., Zhang, J., Tai, X.C.: Augmented Lagrangian method for total variation restoration with non-quadratic fidelity. Inverse Probl. Imaging 5(1), 237–261 (2011)


Acknowledgments

This work was supported by the “National Group for Scientific Computation (GNCS-INDAM)” and by the ex60% project “Funds for selected research topics” of the University of Bologna.

Author information

Correspondence to Serena Morigi.

Appendix

1.1 Proof of Proposition 3.2

Proof

Recalling that convexity of a function is invariant under non-singular linear transformations of the domain, we look for a non-singular linear transformation \(T: {\mathbb R}^3 \rightarrow {\mathbb R}^3\) of the domain of the function f defined in (3.7), that is, \( x = T y\) with \(x = (x_1,x_2,x_3)^T\), \(y = (y_1,y_2,y_3)^T\), \(T = (T_{i,j})_{i,j=1,2,3}\), such that convexity conditions for the function \(f_T := f \circ T\) are easier to identify than for f. We obtain the explicit expression of the function \(f_T\), depending on y and on the nine entries of the transformation matrix T, by replacing \(x = Ty\) in (3.7):

$$\begin{aligned} f_T(y;T) \;&{:=}&\; f(Ty)\nonumber \\&\;{=}\;&\mu Q_1(y;T) + \phi \left( \frac{1}{\Delta } \sqrt{Q_2(y;T)} \, ; a\right) \, , \end{aligned}$$
(6.1)

where \(Q_1\) and \(Q_2\) are quadratic functions of y. We impose that neither \(Q_1\) nor \(Q_2\) contains mixed products, that \(Q_2\) does not depend on \(y_3\), and that the coefficients of both \(y_1^2\) and \(y_2^2\) in \(Q_2\) are equal to one; these requirements are met by the transformation matrix \(T = (0,\sqrt{2}/3,\sqrt{3}/3; \sqrt{2}/2,-\sqrt{2}/6,\sqrt{3}/3; -\sqrt{2}/2,-\sqrt{2}/6,\sqrt{3}/3)\), which yields

$$\begin{aligned} f_T(y_1,y_2,y_3) \;{=}\; \frac{\mu }{6} \left( y_1^2 + y_2^2 + y_3^2 \right) \;{+}\; \phi \left( \frac{1}{\Delta } \sqrt{ y_1^2 + y_2^2 } \right) . \end{aligned}$$
(6.2)

It follows that the function \(f_T(y_1,y_2,y_3)\) above, hence also f in (3.7), is strictly convex if and only if the function \(g(y_1,y_2)\) defined in (3.9) is strictly convex. \(\square \)
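As a side check, not part of the original argument, one may verify numerically that the matrix T reported above is non-singular, so that \(x = Ty\) is indeed a legitimate change of variables, and that its columns are mutually orthogonal; a minimal sketch assuming NumPy:

```python
import numpy as np

# Transformation matrix T as reported above (rows listed top to bottom).
T = np.array([
    [0.0,              np.sqrt(2) / 3,  np.sqrt(3) / 3],
    [np.sqrt(2) / 2,  -np.sqrt(2) / 6,  np.sqrt(3) / 3],
    [-np.sqrt(2) / 2, -np.sqrt(2) / 6,  np.sqrt(3) / 3],
])

print(np.linalg.det(T))       # non-zero, hence T is non-singular and x = T y is invertible
print(np.round(T.T @ T, 12))  # diagonal matrix: the columns of T are mutually orthogonal
```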

1.2 Proof of Proposition 3.3

Proof

The function \(g: {\mathbb R}^2 \rightarrow {\mathbb R}\) in (3.9) can be rewritten in composite form as follows:

$$\begin{aligned} g(y_1,y_2) \,\;{=}\;\, h \left( \rho (y_1,y_2) \right) \, , \end{aligned}$$
(6.3)

with the function \(\rho : {\mathbb R}^2 \rightarrow {\mathbb R}_+\) defined by

$$\begin{aligned} \rho (y_1,y_2) \;{=}\; \sqrt{y_1^2 + y_2^2} \, , \end{aligned}$$
(6.4)

and the function \(h: {\mathbb R}_{+} \rightarrow {\mathbb R}\) defined in (3.10).

We notice that, due to the definition of the function \(\rho \) in (6.4) and to Assumption 1) in Sect. 2 on the penalty function \(\phi \), we have

$$\begin{aligned}&\rho \,\;{\in }\;\, \mathcal {C}^0\big ( {\mathbb R}^2 \big ) \,\;{\cap }\;\, \mathcal {C}^2\big ( {\mathbb R}^2 \setminus \{(0,0)\} \big ) \, , \nonumber \\&h \,\;{\in }\;\, \mathcal {C}^0({\mathbb R}_+) \,\;{\cap }\;\, \mathcal {C}^2({\mathbb R}_+^*) \, . \end{aligned}$$
(6.5)

It follows from (6.3)–(6.5) that the function g is such that

$$\begin{aligned} g \;{\in }\; \mathcal {C}^0 \big ( {\mathbb R}^2 \big ) \,\;{\cap }\;\, \mathcal {C}^2 \big ( {\mathbb R}^2 \setminus \{(0,0)\} \big ) \, . \end{aligned}$$
(6.6)

Hence, the condition for g to be strictly convex is that its Hessian matrix \(H_g(y_1,y_2)\) is positive definite for every \((y_1,y_2) \in {\mathbb R}^2 \setminus \{(0,0)\}\). In the following, we investigate this condition.

By applying the chain rule of differentiation twice to the function g in composite form (6.3), we get

$$\begin{aligned} H_g= & {} H_{\rho } \, h' \;{+}\; \nabla \rho {\nabla \rho }^T h''\nonumber \\= & {} \left( \begin{array}{ll} \rho _{1,1} h' + \rho _1^2 h'' &{} \rho _{1,2} h' + \rho _1 \rho _2 h'' \\ \rho _{1,2} h' + \rho _1 \rho _2 h'' &{} \rho _{2,2} h' + \rho _2^2 h'' \end{array} \right) \, , \end{aligned}$$
(6.7)

where \(H_{\rho }\) and \(\nabla \rho \) denote the Hessian matrix and the gradient of function \(\rho \) in (6.4), respectively, and where, for simplicity of notations, dependencies on independent variables are dropped and a concise notation for ordinary and partial derivatives is adopted, namely \(h' := dh / d\rho \), \(h'' := d^2h / d\rho ^2\), \(\rho _i := \partial \rho / \partial y_i\), \(\rho _{i,j} := \partial ^2 \rho / \partial y_i \partial y_j\), \(i,j \in \{1,2\}\). We remark that, since we are considering the case \((y_1,y_2) \ne (0,0)\), all the differential quantities in (6.7) are well defined and, in particular, no one-sided derivative is involved.

According to Sylvester's criterion, the Hessian matrix \(H_g\) in (6.7) is positive definite if and only if its two leading principal minors are positive, that is, if the following two conditions hold:

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle { \rho _{1,1} h' + \rho _{1}^2 h'' > 0 } \\ \displaystyle { \left( \rho _{1,1} \rho _{2,2} - \rho _{1,2}^2 \right) \left( h'\right) ^2}\\ \displaystyle {\quad \,{+}\, \left( \rho _{2,2} \rho _{1}^2 + \rho _{1,1} \rho _{2}^2 - 2 \rho _{1} \rho _{2} \rho _{1,2} \right) h' h'' > 0 } \end{array} \right. \, . \end{aligned}$$
(6.8)

The first- and second-order partial derivatives of \(\rho \) in (6.4) are as follows:

$$\begin{aligned}&\rho _{1} = \frac{y_1}{\rho } \, , \quad \, \rho _{2} \,{=}\, \frac{y_2}{\rho } \, , \quad \, \rho _{1,1} \,{=}\, \frac{y_2^2}{\rho ^3} \, , \quad \,\nonumber \\&\rho _{2,2} = \frac{y_1^2}{\rho ^3} \, , \quad \, \rho _{1,2} \,{=}\, -\frac{y_1 y_2}{\rho ^3} \, . \end{aligned}$$
(6.9)

Substituting expressions (6.9) into the positive definiteness conditions (6.8) for \(H_g\) and recalling that \(\rho \) is a positive quantity for every \((y_1,y_2) \ne (0,0)\), we obtain

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle { \frac{y_2^2}{\rho ^3} \, h' \,+\, \frac{y_1^2}{\rho ^2} \, h'' \,>\, 0 } \\ \displaystyle { \frac{1}{\rho } \, h' \, h'' \,>\, 0 } \end{array} \right. \;\; \forall \, (y_1,y_2) \ne (0,0) \quad \Longleftrightarrow \quad \left\{ \begin{array}{l} h' \,>\, 0 \\ h'' \,>\, 0 \end{array} \right. . \end{aligned}$$
(6.10)

Hence, the function g defined in (3.9) is strictly convex if and only if the function h in (3.10) is monotonically increasing and strictly convex. \(\square \)
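As a numerical illustration of this criterion (not part of the original argument), the following sketch assembles the Hessian \(H_g\) from (6.7) and (6.9) for two arbitrary test functions h, assuming NumPy, and checks at random points that positive definiteness holds precisely when \(h' > 0\) and \(h'' > 0\):

```python
import numpy as np

def hessian_g(y1, y2, dh, d2h):
    """Hessian of g(y1, y2) = h(sqrt(y1^2 + y2^2)), assembled from (6.7) and (6.9)."""
    rho = np.hypot(y1, y2)                       # rho > 0 away from the origin
    hp, hpp = dh(rho), d2h(rho)
    r1, r2 = y1 / rho, y2 / rho                  # first-order partials of rho, Eq. (6.9)
    r11, r22, r12 = y2**2 / rho**3, y1**2 / rho**3, -y1 * y2 / rho**3
    return np.array([[r11 * hp + r1**2 * hpp,   r12 * hp + r1 * r2 * hpp],
                     [r12 * hp + r1 * r2 * hpp, r22 * hp + r2**2 * hpp]])

def pos_def(H):
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

rng = np.random.default_rng(0)
pts = rng.standard_normal((1000, 2))

# h(t) = t + t^2/2: h' = 1 + t > 0 and h'' = 1 > 0, so H_g should be positive definite.
print(all(pos_def(hessian_g(p[0], p[1], lambda t: 1.0 + t, lambda t: 1.0)) for p in pts))   # True

# h(t) = log(1 + t): h' > 0 but h'' < 0, so positive definiteness must fail.
print(any(pos_def(hessian_g(p[0], p[1], lambda t: 1.0 / (1.0 + t),
                            lambda t: -1.0 / (1.0 + t) ** 2)) for p in pts))                # False
```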

1.3 Proof of Proposition 4.2

Proof

According to Proposition 3.5, if the parameter pair \((\mu ,a)\) satisfies (3.13), the functional \(\mathcal {J}(u;\mu ,a)\) in (1.4) is strictly convex and thus admits a unique minimizer \(u^*\). Then, the first-order optimality condition for \(\mathcal {J}\) at \(u^*\) given in (4.6) follows immediately from the generalized Fermat rule (see Theorem 10.1 in [24]).

To prove (4.7), we need to write the subdifferential \(\partial _u \left[ \, \mathcal {J} \,\right] \) in a more explicit form. However, we cannot apply the additive rule of subdifferentials to the functional \(\mathcal {J}\), since the regularization term \(\Phi \) in (4.5) is non-convex in u due to the concavity of the penalty function \(\phi \). Hence, we resort to notions from the calculus of non-smooth non-convex functions, in particular the Clarke generalized gradient [9], which extends the concept of subdifferential from non-smooth convex functions to non-smooth non-convex but locally Lipschitz functions. Indeed, the rest of the proof relies on the fact that, according to Lemma 4.1, the total functional \(\mathcal {J}\) in (1.4) as well as, separately, the regularization term \(\Phi \) in (4.5) and the quadratic fidelity term are locally Lipschitz, so that their generalized gradients are well defined.

First, we recall that for non-smooth but convex functions the Clarke generalized gradient is equal to the subdifferential [9], that is, in our case:

$$\begin{aligned} \partial _u \left[ \, \mathcal {J} \,\right] (u^*) \,\;{=}\;\, \bar{\partial }_u \left[ \, \mathcal {J} \,\right] (u^*) \; . \end{aligned}$$
(6.11)

After recalling the definition of functional \(\mathcal {J}\) in (1.4), we can now apply the additive rule of generalized gradients [9] to the right-hand side of (6.11):

$$\begin{aligned} \bar{\partial }_u \left[ \, \mathcal {J} \,\right] (u^*)&\;\;{\subset }\;\;\,&\bar{\partial }_u \! \left[ \, \Phi \,\right] (u^*) \,\;{+}\;\, \bar{\partial }_u \! \left[ \, \frac{\mu }{2} \, \big \Vert u - b \big \Vert _2^2 \, \right] \!\,\!(u^*) \nonumber \\&\;\;{=}\;\;\,&\bar{\partial }_u \! \left[ \, \Phi \,\right] (u^*) \,\;{+}\;\, \mu \, (u^* - b) \; , \end{aligned}$$
(6.12)

where in (6.12) we applied the property that the generalized gradient reduces to the usual gradient for continuously differentiable functions.

Recalling the definition of the \(\Phi \) regularization term in (4.5) and applying to the first term of (6.12) the chain rule for generalized gradients [9], we obtain

$$\begin{aligned} \bar{\partial }_u \! \left[ \, \Phi \,\right] (u^*) \;\;{=}\;\; D^T \, \bar{\partial }_{Du} \left[ \, \Phi \,\right] (Du^*) \; . \end{aligned}$$
(6.13)

From (6.13), (6.12), (6.11), and statement (4.6), statement (4.7) follows immediately. \(\square \)

1.4 Proof of Theorem 4.4

Proof

Based on the definition of the augmented Lagrangian functional in (4.3), we rewrite in explicit form the first inequality of the saddle-point condition in (4.4):

$$\begin{aligned}&\sum _{i = 1}^{n} \phi \left( \Vert t^*_i \Vert _2 ; a \right) \,+\, \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \,+\, \frac{\beta _t}{2} \, \Vert t^* - D u^* \Vert _2^2 \,-\, \langle \, \lambda _t \, , \, t^* - D u^* \, \rangle \,+\, \frac{\beta _z}{2} \, \Vert z^* - u^* \Vert _2^2 \,-\, \langle \, \lambda _z \, , \, z^* - u^* \, \rangle \nonumber \\&\quad \le \; \sum _{i = 1}^{n} \phi \left( \Vert t^*_i \Vert _2 ; a \right) \,+\, \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \,+\, \frac{\beta _t}{2} \, \Vert t^* - D u^* \Vert _2^2 \,-\, \langle \, \lambda _t^* \, , \, t^* - D u^* \, \rangle \,+\, \frac{\beta _z}{2} \, \Vert z^* - u^* \Vert _2^2 \,-\, \langle \, \lambda _z^* \, , \, z^* - u^* \, \rangle \nonumber \\&\qquad \forall \; (\lambda _z,\lambda _t) \;{\in }\; V \times Q \, , \end{aligned}$$
(6.14)

and, similarly, the second inequality:

$$\begin{aligned}&\sum _{i = 1}^{n} \phi \left( \Vert t^*_i \Vert _2 ; a \right) \,+\, \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \,+\, \frac{\beta _t}{2} \, \Vert t^* - D u^* \Vert _2^2 \,-\, \langle \, \lambda _t^* \, , \, t^* - D u^* \, \rangle \,+\, \frac{\beta _z}{2} \, \Vert z^* - u^* \Vert _2^2 \,-\, \langle \, \lambda _z^* \, , \, z^* - u^* \, \rangle \nonumber \\&\quad \le \; \sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \,+\, \frac{\mu }{2} \, \Vert z - b \Vert _2^2 \,+\, \frac{\beta _t}{2} \, \Vert t - D u \Vert _2^2 \,-\, \langle \, \lambda _t^* \, , \, t - D u \, \rangle \,+\, \frac{\beta _z}{2} \, \Vert z - u \Vert _2^2 \,-\, \langle \, \lambda _z^* \, , \, z - u \, \rangle \nonumber \\&\qquad \forall \; (u,z,t) \;{\in }\; V \times V \times Q \, . \end{aligned}$$
(6.15)

In the first part of the proof, we demonstrate that if \(\,(u^*,z^*,t^*;\lambda _z^*,\lambda _t^*)\,\) is a solution of the saddle-point problem (4.3)–(4.4), that is, it satisfies the two inequalities (6.14) and (6.15), then \(u^*\) is the unique minimizer of the functional \(\mathcal {J}(u;\mu ,a)\) in (1.4).

Since (6.14) must be satisfied for any \((\lambda _z,\lambda _t) \;{\in }\; V {\times }\, Q\), by taking \(\lambda _z = \lambda _z^*\) we obtain

$$\begin{aligned} \langle \, \lambda _t^* - \lambda _t, t^* - D u^* \, \rangle \,\;{\le }\;\, 0 \quad \; \forall \, \lambda _t \;{\in }\; Q \quad {\Longrightarrow } \quad t^* = D u^* \; . \end{aligned}$$
(6.16)

Similarly, by taking \(\lambda _t = \lambda _t^*\) in (6.14) we have

$$\begin{aligned} \langle \, \lambda _z^* - \lambda _z, z^* - u^* \, \rangle \,\;{\le }\;\, 0 \quad \; \forall \, \lambda _z \;{\in }\; V \quad {\Longrightarrow } \quad z^* = u^* \; . \end{aligned}$$
(6.17)

The second inequality (6.15) must be satisfied for any \((u,z,t) \;{\in }\; V {\times }\, V {\times }\, Q\). Hence, by taking \(z = u\) and \(t = Du\) in (6.15) and substituting the two previously derived conditions (6.16) and (6.17), we obtain

$$\begin{aligned} \mathcal {J}(u^*;\mu ,a) \;\;{\le }\;\; \mathcal {J}(u;\mu ,a) \qquad \forall \; u \;{\in }\; V \, . \end{aligned}$$
(6.18)

Inequality (6.18) indicates that \(u^*\) is a global minimizer of the functional \(\mathcal {J}(u;\mu ,a)\) in (1.4). Hence, we have demonstrated that all solutions of the saddle-point problem (4.3)–(4.4), if any exist, are of the form \(\,(u^*,u^*,Du^*;\lambda _z^*,\lambda _t^*)\,\), with \(u^*\) denoting the unique global minimizer of \(\mathcal {J}(u;\mu ,a)\).

In the second part of the proof, we demonstrate that at least one solution of the saddle-point problem exists. In particular, we prove that if \(u^*\) is a minimizer of \(\mathcal {J}(u;\mu ,a)\) in (1.4), then there exist \(\,(z^*,t^*) \in V {\times }\, Q\) and \(\,(\lambda _z^*,\lambda _t^*) \in V {\times }\, Q\) such that \((u^*,z^*,t^*;\lambda _z^*,\lambda _t^*)\) is a solution of the saddle-point problem (4.3)–(4.4), that is, it satisfies the two inequalities (6.14) and (6.15). The demonstration relies on a suitable initial choice of the vectors \(z^*\), \(t^*\), \(\lambda _z^*\), and \(\lambda _t^*\). Analogously to the proofs in [14, 30], we take

$$\begin{aligned} z^* \;{=}\; u^* \, , \quad t^* \;{=}\; D u^* \, , \end{aligned}$$
(6.19)
$$\begin{aligned} \lambda _z^*= & {} \mu \, (u^* - b) \, , \quad \lambda _t^* \;{\in }\;\, \bar{\partial }_{Du} \left[ \, \Phi \,\right] (Du^*) \;\, \mathrm {s.t.}\,\mathrm {:} \quad \;\nonumber \\&\lambda _z^* + D^T \lambda _t^* \,\;{=}\;\, 0 \, , \end{aligned}$$
(6.20)

where the term \(\bar{\partial }_{Du} \left[ \, \Phi \,\right] (Du^*)\) indicates the Clarke generalized gradient (with respect to Du, calculated at \(Du^*\)) of the non-convex regularization term \(\Phi \) in (4.5). We notice that a vector \(\lambda _t^*\) satisfying (6.20) is guaranteed to exist thanks to Proposition 4.2. In fact, since here we are assuming that \(u^*\) is a minimizer of functional \(\mathcal {J}(u;\mu ,a)\), the first-order optimality condition in (4.7) holds true.

Due to the two settings in (6.19), the first saddle-point condition (6.14) is clearly satisfied. Proving the second condition, which we rewrite in compact form as

$$\begin{aligned}&\mathcal {L}\,(u^*,z^*,t^*;\lambda _z^*,\lambda _t^*;\mu ,a)\;\;\nonumber \\&\quad {\le }\;\; \mathcal {L}\,(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a) \;\;\;\, \forall \; (u,z,t\,\!\,\!) \;{\in }\; V {\times }\; V {\times }\; Q \, ,\nonumber \\ \end{aligned}$$
(6.21)

is less straightforward: we need to investigate the optimality conditions of the functional \(\mathcal {L}\,(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a)\) in (6.21). To this end, we introduce below the three functions \(\mathcal {L}_u(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a)\), \(\mathcal {L}_z(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a)\), and \(\mathcal {L}_t(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a)\), representing the restrictions of \(\mathcal {L}\,(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a)\) to the terms depending on the primal variables u, z, and t, respectively:

$$\begin{aligned}&\mathcal {L}_u(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a) \,\;\nonumber \\&\quad {=}\;\,\underbrace{ \frac{\beta _z}{2} \, \Vert z - u \Vert _2^2 \,{+}\, \frac{\beta _t}{2} \, \Vert t - D u \Vert _2^2 \,{+}\, \langle \lambda _z^* , u \rangle \,{+}\, \langle \lambda _t^* , D u \rangle }_{F_1(u)} \end{aligned}$$
(6.22)
$$\begin{aligned}&\mathcal {L}_z(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a) \,\;\nonumber \\&\quad {=}\;\, \displaystyle { \underbrace{ \frac{\beta _z}{2} \, \Vert z - u \Vert _2^2 \;{-}\; \langle \, \lambda _z^* , z \, \rangle }_{F_1(z)} \,\;{+}\;\, \underbrace{ \frac{\mu }{2} \, \Vert z - b \Vert _2^2 }_{F_2(z)} } \end{aligned}$$
(6.23)
$$\begin{aligned}&\mathcal {L}_t(u,z,t;\lambda _z^*,\lambda _t^*;\mu ,a) \,\;\nonumber \\&\quad {=}\;\, \displaystyle { \underbrace{ {-}\;\, \langle \, \lambda _t^* , t \, \rangle }_{F_1(t)} \,\;{+}\;\, \underbrace{ \sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \;{+}\; \frac{\beta _t}{2} \, \Vert t - D u \Vert _2^2 }_{F_2(t).} }\nonumber \\ \end{aligned}$$
(6.24)

We notice that the functions \(\mathcal {L}_u\), \(\mathcal {L}_z\), and \(\mathcal {L}_t\) above are proper, continuous, and coercive with respect to the variables u, z, and t, respectively. Moreover, the three functions \(F_1\) and the function \(F_2\) in (6.23) are convex, hence \(\mathcal {L}_u\) and \(\mathcal {L}_z\) are convex. As for the function \(F_2\) in (6.24), it follows from Proposition 4.5 that it is strictly convex if and only if the condition \(\beta _t > a\) is satisfied. Since the ADMM-based scheme that we will present for solving the saddle-point problem (4.3)–(4.4) takes this condition as a constraint, we can assume here that it holds, so that \(F_2\) in (6.24) is convex and, hence, \(\mathcal {L}_t\) is convex as well. By finally noticing that the three functions \(F_1\) are Gâteaux differentiable, we can apply Lemma 4.3 separately to (6.22), (6.23), and (6.24), thus obtaining the following optimality conditions for a generic point \((\bar{u},\bar{z},\bar{t})\):

$$\begin{aligned}&\big \langle \,{-}\, \beta _z (\bar{z} - \bar{u}) \;{-}\; \beta _t D^T (\bar{t} - D \bar{u}) \;{+}\; \lambda _z^* \;{+}\; D^T \lambda _t^* \,\, , \, u - \bar{u} \,\, \big \rangle \,\;\nonumber \\&\quad {\ge }\;\, 0 \quad \; \forall \, u \;{\in }\; V \, , \end{aligned}$$
(6.25)
$$\begin{aligned}&\quad \frac{\mu }{2} \, \Vert z - b \Vert _2^2 \;{-}\;\,\! \frac{\mu }{2} \, \Vert \bar{z} - b \Vert _2^2 \,\;{+}\;\, \big \langle \,\, \beta _z (\bar{z} - \bar{u}) - \lambda _z^* \,\, , \,\, z - \bar{z} \,\, \big \rangle \,\;\nonumber \\&\quad {\ge }\;\, 0 \quad \; \forall \, z \;{\in }\; V \, , \end{aligned}$$
(6.26)
$$\begin{aligned}&\sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \;{+}\; \frac{\beta _t}{2} \, \Vert t - D \bar{u} \Vert _2^2 \;\nonumber \\&\quad {-}\; \sum _{i = 1}^{n} \phi \left( \Vert \bar{t}_i \Vert _2 ; a \right) \;{-}\; \frac{\beta _t}{2} \, \Vert \bar{t} - D \bar{u} \Vert _2^2 \nonumber \\&\quad {-}\, \left\langle \, \lambda _t^* \, , \, t - \bar{t} \,\, \right\rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; Q \, . \end{aligned}$$
(6.27)

We now verify that the triplet \((z^*,t^*,u^*)\) satisfies the optimality conditions above, so that the second saddle-point condition (6.21) holds true. By substituting \((z^*,t^*,u^*)\) for \((\bar{z},\bar{t},\bar{u})\) in (6.25), (6.26), and (6.27), we obtain

$$\begin{aligned}&\big \langle \,{-}\, \beta _z \underbrace{(z^* - u^*)}_{0} \;{-}\; \beta _t D^T \underbrace{(t^* - D u^*)}_{0} \;\nonumber \\&\quad {+}\; \underbrace{\lambda _z^* \;{+}\; D^T \lambda _t^*}_{0} \,\, , \, u - u^* \,\, \big \rangle \,\;{\ge }\;\, 0 \;\;\, \forall \, u \;{\in }\; V \, , \end{aligned}$$
(6.28)
$$\begin{aligned}&\frac{\mu }{2} \, \Vert z - b \Vert _2^2 \;{-}\;\,\! \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \,\;\nonumber \\&\quad {+}\;\, \big \langle \,\, \beta _z \underbrace{(z^* - u^*)}_{0} - \lambda _z^* \,\, , \,\, z - z^* \,\, \big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, z \;{\in }\; V \, , \end{aligned}$$
(6.29)
$$\begin{aligned}&\sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \;{+}\; \frac{\beta _t}{2} \, \Vert t - D u^* \Vert _2^2 \;\nonumber \\&\quad {-}\; \sum _{i = 1}^{n} \phi \left( \Vert t^*_i \Vert _2 ; a \right) \;{-}\; \frac{\beta _t}{2} \, \underbrace{\Vert t^* - D u^* \Vert _2^2}_{0} \nonumber \\&\quad {-}\, \left\langle \, \lambda _t^* \, , \, t - t^* \,\, \right\rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; Q \, , \end{aligned}$$
(6.30)

where the underbraced terms vanish due to some of the settings in (6.19)–(6.20). The first condition (6.28) is clearly satisfied. We rewrite the second and third conditions by also substituting the settings on \(\lambda _z^*\) and \(\lambda _t^*\) in (6.20):

$$\begin{aligned}&\frac{\mu }{2} \, \Vert z - b \Vert _2^2 \;{-}\; \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \;{-}\; \langle \, \mu \, (z^* - b) , z - z^* \, \rangle \,\;\nonumber \\&\quad {\ge }\;\, 0 \quad \;\;\, \forall \, z \;{\in }\; V \, , \end{aligned}$$
(6.31)
$$\begin{aligned}&\sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \;{+}\; \frac{\beta _t}{2} \, \Vert t - D u^* \Vert _2^2 \;\nonumber \\&\quad {-}\; \sum _{i = 1}^{n} \phi \left( \Vert t^*_i \Vert _2 ; a \right) \;{-}\; \frac{\beta _t}{2} \, \Vert t^* - D u^* \Vert _2^2 \nonumber \\&\quad {-}\;\ \bigg \langle \, \bar{\partial }_t \! \bigg [ \sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a\right) \bigg ]\!(t^*) \;{+}\; \mu \, (t^* - Du^*) \, , \, t - t^* \, \bigg \rangle \,\;\nonumber \\&{\ge }\;\, 0 \quad \;\;\, \forall \, t \;{\in }\; Q \, , \end{aligned}$$
(6.32)

where in (6.32) we added the null term \(\mu \, (t^* - Du^*)\). The two optimality conditions (6.31) and (6.32) can be proved based on the concept of Bregman distance, which we briefly recall here: given a convex, not necessarily smooth, function G and two points x, \(x^*\) in its domain, the Bregman distance (or divergence) associated with G for the points x, \(x^*\) is defined as

$$\begin{aligned} B_G(x,x^*) \,\;{:=}\;\, G(x) - G(x^*) - \langle \, \partial G(x^*) \, , \, x - x^* \, \rangle \; , \end{aligned}$$
(6.33)

where \(\partial G\) denotes the subdifferential of G. The Bregman distance is not a distance in the strict sense, but it is always non-negative for convex functions G. Inequalities (6.31) and (6.32) can be equivalently rewritten in terms of Bregman distances as follows:

$$\begin{aligned}&B_Z(z,z^*) \,\;{\ge }\;\, 0 \;\; \forall \, z \;{\in }\; V \, , \;\; \;\; Z(z) \,\;{:=}\;\, \frac{\mu }{2} \, \Vert z - b \Vert _2^2 \, , \end{aligned}$$
(6.34)
$$\begin{aligned}&B_T(t,t^*) \,\;{\ge }\;\, 0 \;\; \forall \, t \;{\in }\; Q \, , \;\; \;\;\; T(t) \,\;{:=}\;\, \sum _{i = 1}^{n} \phi \left( \Vert t_i \Vert _2 ; a \right) \;\nonumber \\&\quad {+}\; \frac{\beta _t}{2} \, \Vert t - D u^* \Vert _2^2 \; . \end{aligned}$$
(6.35)

In particular, (6.34) follows immediately from (6.31) and (6.33), whereas (6.35) follows from (6.32) and (6.33) and from two further observations. First, the function T in (6.35) is convex for the same reasons for which the function \(F_2\) in (6.24) is convex. Second, the first term of the scalar product in (6.32) represents the subdifferential of the convex function T. Since the Bregman distance is always non-negative, (6.34) and (6.35) hold true, the second saddle-point condition in (6.21) is satisfied, and the second part of the proof of the Theorem is complete. \(\square \)
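For instance, for the quadratic function Z in (6.34), definition (6.33) gives the explicit expression

$$\begin{aligned} B_Z(z,z^*) \;{=}\; \frac{\mu }{2} \, \Vert z - b \Vert _2^2 \,-\, \frac{\mu }{2} \, \Vert z^* - b \Vert _2^2 \,-\, \big \langle \, \mu \, (z^* - b) \, , \, z - z^* \, \big \rangle \;{=}\; \frac{\mu }{2} \, \Vert z - z^* \Vert _2^2 \;{\ge }\; 0 \, , \end{aligned}$$

which makes the non-negativity used in (6.34) evident.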

Fig. 10  Geometric representation of problem (4.33)

1.5 Proof of Proposition 4.5

Proof

Condition (4.32) for the convexity of the function \(\theta \) in (4.31) can easily be proved based on Proposition 3.3 in Sect. 3. In fact, after rewriting \(\theta \) as follows:

$$\begin{aligned} \theta (x) \;{=}\; \phi \left( \Vert x \Vert _2;a \right) \;{+}\; \frac{\beta }{2} \Vert x \Vert _2^2 + L(x) \, , \quad x \in {\mathbb R}^2 \, , \end{aligned}$$
(6.36)

with L being an affine function which, hence, does not affect convexity, we notice that the non-affine part of \(\theta \) in (6.36) can be written in composite form as

$$\begin{aligned}&h(\rho (x)) \, , \;\; \mathrm {with} \;\;\, \rho (x) := \Vert x \Vert _2 \, , \, x \in {\mathbb R}^2 \,\;\;\mathrm {and}\;\;\;\nonumber \\&\quad h(t):= \phi (t;a) \;{+}\; \frac{\beta }{2} t^2 \, , \, t \ge 0 \, . \end{aligned}$$
(6.37)

Hence, recalling the proof of Proposition 3.3, the function \(\theta \) is convex if and only if both of the following conditions hold:

$$\begin{aligned} \left\{ \! \begin{array}{l} \,h'(t) \,\;{>}\;\, 0 \\ h''(t) \,\;{>}\;\, 0 \end{array} \right. \,\equiv & {} \;\, \left\{ \! \begin{array}{l} \phi '(t;a) + \beta \,t \,\;{>}\; 0 \\ \phi ''(t;a) + \beta \,\;{>}\; 0 \end{array} \right. \nonumber \\\equiv & {} \left\{ \! \begin{array}{l} \beta \,t \,\;{>}\; -\phi '(t;a) \\ \beta \;\,\,\! \,\;{>}\; - \phi ''(t;a) \end{array} \right. \quad \; \forall \, t \ge 0 \, . \end{aligned}$$
(6.38)

Since by hypothesis \(\beta > 0\) and the function \(\phi \) satisfies assumption (A2) in Sect. 2, namely \(\,\phi '(t;a) > 0\,\) for any \(t \ge 0\), the first condition in (6.38) is always satisfied. The second condition in (6.38) is equivalent to the convexity condition in statement (4.32).

We now prove statement (4.34), according to which the unique solution \(x^*\) of the strictly convex problem (4.33) is obtained by a shrinkage of the vector r. To allow for a clearer understanding of the proof, in Fig. 10 we give a geometric representation of problem (4.33). First, we prove that the solution \(x^*\) of (4.33) lies on the closed half-line Or with origin at the 2-dimensional null vector O and passing through r, represented in solid red in Fig. 10a. To this end, we demonstrate that for any point z not lying on Or there always exists a point \(z^*\) on Or providing a lower value of the objective function in (4.33), that is, a point \(z^*\) such that \(\theta (z) - \theta (z^*) > 0\). In particular, we define \(z^*\) as the intersection point between the half-line Or and the circle centered at O and passing through z, depicted in solid blue in Fig. 10a. Recalling the definition of \(\theta \) in (4.31) and noting that \(\Vert z^*\Vert _2 = \Vert z \Vert _2\) by construction, we can thus write

$$\begin{aligned} \theta (z) - \theta (z^*)= & {} \phi \left( \Vert z \Vert _2;a \right) - \phi \left( \Vert z^* \Vert _2;a \right) \nonumber \\&+ \frac{\beta }{2} \left( \Vert z - r \Vert _2^2 - \Vert z^* - r \Vert _2^2 \right) \nonumber \\= & {} \frac{\beta }{2} \left( \Vert z \Vert _2^2 + \Vert r \Vert _2^2 - 2 \, \langle \, z \, , \, r \rangle \right. \nonumber \\&\left. - \Vert z^* \Vert _2^2 - \Vert r \Vert _2^2 + 2 \, \langle \, z^* \, , \, r \rangle \right) \nonumber \\= & {} \beta \left\langle z^* - z , r \right\rangle \nonumber \\= & {} \beta \left\| z^* - z \right\| _2 \left\| r \right\| _2 cos\left( \widehat{O \, z^* z} \right) \, . \end{aligned}$$
(6.39)

Since \(\beta > 0\) by hypothesis, \(z^* \!\ne z\) and \(r \ne O\) by construction, and noting that the angle \(\widehat{O \, z^* z}\) is always acute, we can conclude that the expression in (6.39) is positive. Hence, the solution \(x^*\) of (4.33) lies on the closed half-line Or.

We now prove that the solution \(x^*\) of (4.33) lies within the segment [Or], represented in solid red in Fig. 10b. To this end, we demonstrate that for any point z lying on the half-line Or but outside the segment [Or] there always exists a point \(z^*\) on [Or] such that \(\theta (z) - \theta (z^*) > 0\). In particular, it suffices to choose \(z^* = r\), as illustrated in Fig. 10b. We obtain

$$\begin{aligned}&\theta (z) - \theta (z^*) \;{=}\; \theta (z) - \theta (r) \;\nonumber \\&{=}\; \phi \left( \Vert z \Vert _2;a \right) - \phi \left( \Vert r \Vert _2;a \right) + \frac{\beta }{2} \Vert z - r \Vert _2^2 \, . \end{aligned}$$
(6.40)

Since \(\Vert z \Vert _2 > \Vert r \Vert _2\) by construction and the function \(\phi \) is monotonically increasing by hypothesis, the expression in (6.40) is positive, hence the solution \(x^*\) of (4.33) lies on the segment [Or].

To conclude the proof of statement (4.34), we notice that the directional derivative of the objective function \(\theta \) in (4.31) at r in the direction of r is as follows:

$$\begin{aligned} \frac{ \partial \theta }{\partial r}(r) \,\;{=}\;\, \phi '(\Vert r\Vert _2;a) \,\;{>}\;\, 0 \; , \end{aligned}$$
(6.41)

where the inequality follows from assumption (A2) in Sect. 2. It follows from (6.41) that the solution \(x^*\) of (4.33) never coincides with vector r.

Based on (4.34), by setting \(x = \xi r\), \(0 \le \xi < 1\), the original unconstrained 2-dimensional problem in (4.33) can be reduced to the following equivalent constrained 1-dimensional problem:

$$\begin{aligned} \xi ^* \;{=}\; \mathop {\mathrm {arg\,min}}\limits _{0 \,\le \, \xi \,\le \, 1} \left\{ \, f(\xi ) \,:=\, \phi \left( \xi \, \Vert r \Vert _2 \, ; a \right) \,+\, \frac{\beta }{2} \, \Vert r \Vert _2^2 \, \xi ^2 \,-\, \beta \, \Vert r \Vert _2^2 \, \xi \, \right\} \, , \end{aligned}$$
(6.42)

where in (6.42) we omitted the constant terms and introduced the objective function f for future reference. Since we are assuming that the function \(\phi \) is twice continuously differentiable on \({\mathbb R}_+\), so is the cost function f in (6.42) on the optimization domain \(0 \le \xi \le 1\). Moreover, f is strictly convex, since it represents the restriction of the strictly convex function \(\theta \) in (4.31) to the segment \(\xi \, r, \, 0 \le \xi \le 1\). Hence, the necessary and sufficient condition for an inner point \(0< \xi < 1\) to be the global minimizer of f is as follows:

$$\begin{aligned} f'(\xi ) \;{=}\; \Vert r \Vert _2 \, \phi ' \left( \xi \, \Vert r \Vert _2 \, ; a \right) \,+\, \beta \, \Vert r \Vert _2^2 \, (\xi - 1) \;{=}\; 0 \, . \end{aligned}$$
(6.43)

Since f is strictly convex, the first derivative \(f'(\xi )\) is strictly increasing in the entire domain \(0 \le \xi \le 1\) and at the extremes we have

$$\begin{aligned} f'(0^+)= & {} \Vert r \Vert _2 \, \big [ \, \phi '(0^+;a) - \beta \, \Vert r \Vert _2 \, \big ] \, , \qquad \; \nonumber \\ f'(1)= & {} \Vert r \Vert _2 \, \phi '(\Vert r \Vert _2;a) \,. \end{aligned}$$
(6.44)

Since \(\Vert r \Vert _2 > 0\) and \(\phi '(t;a) > 0\) for any \(t \ge 0\) by hypothesis, \(f'(1)\) in (6.44) is positive. Hence, we have two cases. If \(f'(0^+) \ge 0\), that is \(\Vert r\Vert _2 \le \phi '(0^+;a) / \beta \), then \(f'(\xi )\) is positive for \(0 < \xi \le 1\), hence the function f attains its minimum at \(\xi ^* = 0\); if \(f'(0^+) < 0\), then f attains its minimum at its unique stationary point \(0< \xi ^* < 1\), which can be obtained by solving the nonlinear equation in (6.43). \(\square \)
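The two-case rule above maps directly onto a small numerical routine. The following sketch is only an illustration of the result (it is not taken from the paper): the names `shrink` and `dphi` are placeholders, SciPy is assumed for the root finding, and the penalty enters only through its first derivative \(\phi '(\cdot \,;a)\), required to be positive as in assumption (A2).

```python
import numpy as np
from scipy.optimize import brentq

def shrink(r, beta, a, dphi):
    """Minimize phi(||x||_2; a) + (beta/2) ||x - r||_2^2 over x in R^2.

    Following Proposition 4.5, the minimizer is x* = xi* r, where xi* = 0 if
    beta * ||r||_2 <= phi'(0+; a), and otherwise xi* is the unique root of
    f'(xi) in (0, 1), cf. (6.43)-(6.44).
    """
    r = np.asarray(r, dtype=float)
    nr = np.linalg.norm(r)
    if nr == 0.0 or beta * nr <= dphi(0.0, a):
        return np.zeros_like(r)                  # boundary case: xi* = 0
    # f'(xi) = ||r|| phi'(xi ||r||; a) + beta ||r||^2 (xi - 1), with f'(0+) < 0 < f'(1)
    fprime = lambda xi: nr * dphi(xi * nr, a) + beta * nr**2 * (xi - 1.0)
    return brentq(fprime, 0.0, 1.0) * r

# Example with the exponential penalty, whose first derivative phi'(t; a) = exp(-a t)
# is the one appearing in (6.46).
dphi_exp = lambda t, a: np.exp(-a * t)
print(shrink([2.0, -1.0], beta=3.0, a=1.0, dphi=dphi_exp))
```

For penalties whose stationarity equation admits a closed-form root, such as the exponential penalty treated in Proposition 4.6 below, the call to `brentq` can be replaced by an explicit formula.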

1.6 Proof of Proposition 4.6

Proof

After setting \(\,\alpha := \Vert r \Vert _2\,\) for simplicity of notations, we have to solve the following constrained nonlinear equation:

$$\begin{aligned} \phi ' \left( \alpha x;a \right) + \beta \, \alpha (x - 1) \;{=}\; 0 \, , \quad 0 \;{<}\; x \;{<}\; 1 \, , \quad \alpha \, \beta \;{>}\; 1 \, , \quad \beta \;{>}\; a \; . \end{aligned}$$
(6.45)

Substituting in (6.45) the expression of the first-order derivative of the exponential penalty function reported in the second row of Table 1, we obtain

$$\begin{aligned} \frac{1}{e^{a \alpha x}} \,\;{+}\;\, \beta \, \alpha \, (x - 1) \,\;= & {} \;\, 0 \;\;{\equiv }\;\; 1 \,\;{+}\;\, \beta \, \alpha \, (x - 1) \, e^{a \alpha x} \,\;\nonumber \\= & {} \;\, 0 \;\;{\equiv }\;\; \beta \, \alpha \, (x - 1) \, e^{a \alpha x} \,\;{=}\;\, -1.\nonumber \\ \end{aligned}$$
(6.46)

We notice that

$$\begin{aligned} e^{a \alpha x} \,\;{=}\;\, e^{a \alpha } e^{a \alpha (x - 1)} \, , \end{aligned}$$
(6.47)

such that (6.46) can be written as

$$\begin{aligned} \beta \, \alpha \, (x - 1) \, e^{a \alpha } e^{a \alpha (x - 1)} \,\;{=}\;\, -1 \; . \end{aligned}$$
(6.48)

After multiplying both sides of (6.48) by \(a \, / \, (\beta \, e^{a \alpha })\), we obtain

$$\begin{aligned} a \, \alpha \, (x - 1) \, e^{a \alpha (x - 1)} \,\;{=}\;\, - \frac{a}{\beta \, e^{a \alpha }} \; . \end{aligned}$$
(6.49)

After the following change of variable:

$$\begin{aligned} y = a \, \alpha \, (x - 1) \, , \end{aligned}$$
(6.50)

we obtain

$$\begin{aligned} y \, e^{y} \,\;{=}\;\, - \frac{a}{\beta \, e^{a \alpha }} \, , \quad y \,{\in }\; ] -a \, \alpha \, , \, 0 \, [ \; , \end{aligned}$$
(6.51)

Hence, the unique solution \(y^*\) of (6.51) is given by

$$\begin{aligned} y^* \;{=}\; W_0 \left( - \frac{a}{\beta \, e^{a \alpha }} \right) \; , \end{aligned}$$
(6.52)

and, following from (6.50), the unique solution \(x^*\) of (6.46) is

$$\begin{aligned} x^* \,\;{=}\;\, 1 \,\;{+}\;\, \frac{1}{a \, \alpha } \, W_0 \left( - \frac{a}{\beta \, e^{a \alpha }} \right) \; , \end{aligned}$$
(6.53)

where \(W_0(\cdot )\) represents the principal branch of the Lambert W function [10]. \(\square \)
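The closed form (6.53) is straightforward to evaluate with a standard implementation of the Lambert W function. A minimal sketch, assuming SciPy's `scipy.special.lambertw` for the principal branch \(W_0\) and the exponential penalty derivative \(\phi '(t;a) = e^{-a t}\) appearing in (6.46) (the function name `xi_star` is illustrative):

```python
import numpy as np
from scipy.special import lambertw

def xi_star(alpha, beta, a):
    """Unique root in (0, 1) of phi'(alpha*x; a) + beta*alpha*(x - 1) = 0 for the
    exponential penalty, phi'(t; a) = exp(-a t), computed via Eq. (6.53)."""
    assert alpha * beta > 1 and beta > a         # hypotheses of (6.45)
    w = lambertw(-a / (beta * np.exp(a * alpha)), k=0).real   # principal branch W_0
    return 1.0 + w / (a * alpha)

alpha, beta, a = 2.0, 3.0, 1.0
x = xi_star(alpha, beta, a)
# Residual of the nonlinear equation (6.45); it should be numerically zero.
print(x, np.exp(-a * alpha * x) + beta * alpha * (x - 1.0))
```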

Cite this article

Lanza, A., Morigi, S. & Sgallari, F. Convex Image Denoising via Non-convex Regularization with Parameter Selection. J Math Imaging Vis 56, 195–220 (2016). https://doi.org/10.1007/s10851-016-0655-7
