A hybrid Bregman alternating direction method of multipliers for the linearly constrained difference-of-convex problems

Journal of Global Optimization

Abstract

In this paper, we propose a hybrid Bregman alternating direction method of multipliers for solving linearly constrained difference-of-convex problems whose objective can be written as the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous concave function. At each iteration, we choose either a subgradient step or a proximal step to evaluate the concave part. Moreover, an extrapolation technique is utilized to compute the nonsmooth convex part. We prove that the sequence generated by the proposed method converges to a critical point of the considered problem under the assumption that the potential function is a Kurdyka–Łojasiewicz function. One notable advantage of the proposed method is that convergence can be guaranteed without Lipschitz continuity of the gradient of the concave part. Preliminary numerical experiments show the efficiency of the proposed method.
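
To fix ideas, the sketch below outlines one possible iteration for a problem of the form min_{x,y} f1(x) - f2(x) + g(y) subject to Ax + By = b, with f1 proper closed convex, f2 continuous convex and g smooth. It is only a schematic reading of the abstract, not the authors' exact scheme: the update order, the linearized x- and y-steps, the parameters beta, gamma and tau, the switching flag use_subgradient and all function names are illustrative assumptions.

def hybrid_dc_admm_step(x, x_prev, y, lam, A, B, b,
                        prox_f1, prox_f2, subgrad_f2, grad_g,
                        beta=1.0, gamma=0.3, tau=0.1,
                        use_subgradient=True):
    """One schematic iteration; prox_f1(v, t) and prox_f2(v, t) return prox of t*f at v."""
    # Concave part -f2: either take a subgradient of f2 at the current x ...
    if use_subgradient:
        xi = subgrad_f2(x)                       # xi lies in the subdifferential of f2 at x
    else:
        # ... or a proximal step: beta*(x - p) is a subgradient of f2 at p = prox_{f2/beta}(x).
        xi = beta * (x - prox_f2(x, 1.0 / beta))
    # Nonsmooth convex part f1: proximal step taken at an extrapolated point.
    x_bar = x + gamma * (x - x_prev)             # extrapolation on the x-block
    v = x_bar + (xi + A.T @ lam) / beta - A.T @ (A @ x_bar + B @ y - b)
    x_new = prox_f1(v, 1.0 / beta)
    # Smooth part g: one gradient step on the augmented Lagrangian in y.
    y_new = y - tau * (grad_g(y) - B.T @ lam + beta * B.T @ (A @ x_new + B @ y - b))
    # Multiplier update.
    lam_new = lam - beta * (A @ x_new + B @ y_new - b)
    return x_new, y_new, lam_new

The multiplier sign convention above matches the augmented Lagrangian f1(x) - f2(x) + g(y) - <lam, Ax + By - b> + (beta/2)||Ax + By - b||^2; a different convention would flip the signs of the lam terms.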

References

  1. An, L.T.H., Belghiti, M.T., Tao, P.D.: A new efficient algorithm based on DC programming and DCA for clustering. J. Global Optim. 37(4), 593–608 (2007)

  2. Attouch, H., Redont, P., Soubeyran, A.: A new class of alternating proximal minimization algorithms with costs-to-move. SIAM J. Optim. 18(3), 1061–1081 (2007)

  3. Attouch, H., Bolte, J.: On the convergence of the proximal algorithms for nonsmooth functions involving analytic features. Math. Program. 116(1–2), 5–16 (2009)

  4. Bai, M.R., Zhang, X.J., Shao, Q.Q.: Adaptive correction procedure for TVL1 image deblurring under impulse noise. Inverse Probl. 32(8), 085004 (2016)

  5. Banert, S., Bot, R.I.: A general double-proximal gradient algorithm for d.c. programming. Math. Program. (2018). https://doi.org/10.1007/s10107-018-1292-2

  6. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011)

  7. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  8. Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)

  9. Becker, S., Bobin, J., Candès, E.: NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2009)

  10. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)

  11. Bot, R.I., Csetnek, E.R.: An inertial Tseng’s type proximal algorithm for nonsmooth and nonconvex optimization problems. J. Optim. Theory Appl. 171(2), 600–616 (2016)

  12. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  13. Bredies, K., Lorenz, D.A., Reiterer, S.: Minimization of nonsmooth, nonconvex functionals by iterative thresholding. J. Optim. Theory Appl. 165(1), 78–112 (2015)

  14. Cai, J., Chan, R.H., Shen, L., Shen, Z.: Convergence analysis of tight framelet approach for missing data recovery. Adv. Comput. Math. 31(1), 87–113 (2009)

  15. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)

  16. Eckstein, J., Yao, W.: Relative-error approximate versions of Douglas–Rachford splitting and special cases of the ADMM. Math. Program. 170(2), 417–444 (2018)

  17. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2, 17–40 (1976)

  18. Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Fortin, M., Glowinski, R. (eds.) Augmented Lagrangian Methods Applications to the Numerical Solution of Boundary-Value Problems, pp. 299–331. North-Holland, Amsterdam (1983)

  19. Gasso, G., Rakotomamonjy, A., Canu, S.: Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Trans. Signal Process. 57(12), 4686–4698 (2009)

  20. Geremew, W., Nam, N.M., Semenova, A., Boginski, V., Pasiliao, E.: A DC programming approach for solving multicast network design problems via the Nesterov smoothing technique. J. Global Optim. 72(4), 705–729 (2018)

  21. Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. 169(1), 141–176 (2018)

  22. Gonçalves, M.L.N., Melo, J.G., Monteiro, R.D.C.: Convergence rate bounds for a proximal ADMM with over-relaxation stepsize parameter for solving nonconvex linearly constrained problems (2017). arXiv preprint arXiv:1702.01850v2

  23. Guo, K., Han, D.R., Wu, T.T.: Convergence of ADMM for optimization problems with nonseparable nonconvex objective and linear constraints. Int. J. Comput. Math. 94(8), 1653–1669 (2017)

  24. Han, D.R., Yuan, X.M.: Local linear convergence of the alternating direction method of multipliers for quadratic programs. SIAM J. Numer. Anal. 51(6), 3446–3457 (2013)

  25. Hansen, P.C., Nagy, J.G., O'Leary, D.P.: Deblurring Images: Matrices, Spectra, and Filtering. SIAM, Philadelphia (2006)

  26. He, B.S., Yuan, X.M.: On the \(O(1/n)\) convergence rate of the Douglas–Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700–709 (2012)

  27. Li, G.Y., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4), 2434–2460 (2015)

  28. Liavas, A.P., Sidiropoulos, N.D.: Parallel algorithms for constrained tensor factorization via alternating direction method of multipliers. IEEE Trans. Signal Process. 63(20), 5450–5463 (2015)

  29. Liu, T.X., Pong, T.K., Takeda, A.: A refined convergence analysis of \(\text{ pDCA }_{{e}}\) with applications to simultaneous sparse recovery and outlier detection. Comput. Optim. Appl. 73(1), 69–100 (2019)

  30. Liu, Q.H., Shen, X.Y., Gu, Y.T.: Linearized ADMM for non-convex non-smooth optimization with convergence analysis (2017). arXiv preprint arXiv:1705.02502

  31. Lou, Y.F., Yin, P.H., Xin, J.: Point source super-resolution via non-convex \(l_1\) based methods. J. Sci. Comput. 68(3), 1082–1100 (2016)

  32. Lou, Y.F., Yan, M.: Fast \(l_{1}\)-\(l_{2}\) minimization via a proximal operator. J. Sci. Comput. 74(2), 767–785 (2018)

  33. Lou, Y.F., Zeng, T.Y., Osher, S., Xin, J.: A weighted difference of anisotropic and isotropic total variation model for image processing. SIAM J. Imaging Sci. 8(3), 1798–1823 (2015)

  34. Lu, Z.S., Li, X.R.: Sparse recovery via partial regularization: models, theory, and algorithms. Math. Oper. Res. 43(4), 1290–1316 (2018)

  35. Lu, Z.S., Zhou, Z.R., Sun, Z.: Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization. Math. Program. 176(1–2), 369–401 (2019)

  36. Maingé, P.E., Moudafi, A.: Convergence of new inertial proximal methods for DC programming. SIAM J. Optim. 19(1), 397–413 (2008)

  37. Mordukhovich, B.S., Nam, N.M., Yen, N.D.: Fréchet subdifferential calculus and optimality conditions in nondifferentiable programming. Optimization 55(5–6), 685–708 (2006)

  38. Nesterov, Y.: Introductory Lectures on Convex Optimization. A Basic Course. Kluwer, Boston (2004)

  39. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation. I: Basic Theory, II: Applications. Springer, Berlin (2006)

  40. Pratt, W.K.: Digital Image Processing: PIKS Scientific Inside. Wiley, Hoboken (2001)

  41. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2013)

  42. Rockafellar, R.T., Wets, R.: Variational Analysis. Grundlehren Math. Wiss., vol. 317. Springer, Berlin (1998)

  43. Rudin, L., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D. 60(1–4), 259–268 (1992)

  44. Souza, J.C.O., Oliveira, P.R.: A proximal point algorithm for DC functions on Hadamard manifolds. J. Global Optim. 63(4), 797–810 (2015)

  45. Sun, T., Yin, P.H., Cheng, L.Z., Jiang, H.: Alternating direction method of multipliers with difference of convex functions. Adv. Comput. Math. 44, 723–744 (2018)

  46. Tao, P.D., An, L.T.H.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)

  47. Tao, P.D., An, L.T.H.: A DC optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)

  48. Wang, F.H., Xu, Z.B., Xu, H.K.: Convergence of Bregman alternating direction method with multipliers for nonconvex composite problems (2014). arXiv preprint arXiv:1410.8625

  49. Wang, H.F., Kong, L.C., Tao, J.Y.: The linearized alternating direction method of multipliers for sparse group LAD model. Optim. Lett. 13, 505–525 (2019)

  50. Wu, Z.M., Li, M., Wang, D.Z.W., Han, D.R.: A symmetric alternating direction method of multipliers for separable nonconvex minimization problems. Asia Pac. J. Oper. Res. 34, 1750030 (2017)

  51. Wang, Y., Yin, W.T., Zeng, J.S.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78, 29–63 (2019)

  52. Wen, B., Chen, X.J., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69(2), 297–324 (2018)

  53. Yang, L., Pong, T.K., Chen, X.J.: Alternating direction method of multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction. SIAM J. Imaging Sci. 10(1), 74–110 (2017)

  54. Yin, P.H., Lou, Y.F., He, Q., Xin, J.: Minimization of \(l_{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37(1), 536–563 (2015)

  55. Zhang, T.: Some sharp performance bounds for the least squares regression with \(l_1\) regularization. Ann. Stat. 37(5A), 2109–2144 (2009)

Acknowledgements

This research was supported by the National Natural Science Foundation of China Grants 11801161, 61179033 and 11771003, and the Natural Science Foundation of Hunan Province of China Grant 2018JJ3093. The authors are very grateful to the Beijing Innovation Center for Engineering Science and Advanced Technology, Peking University and Beijing University of Technology for their joint project support. The first author, Kai Tu, would like to thank Prof. Penghua Yin from the University of California, Los Angeles for providing the codes of [45], and Dr. Wenxing Zhang from the University of Electronic Science and Technology of China and Dr. Benxing Zhang from Guilin University of Electronic Technology for their advice and discussions on the code for the total variation image restoration problem. The authors also thank the anonymous referees for their patient and valuable comments, which greatly improved the quality of this paper.

Author information

Corresponding author

Correspondence to Huan Gao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Proposition 2

Proof

Clearly, problem (51) is a special case of problem (1)–(2) with \(f_1(x)= \rho \Vert x\Vert _1\), \(f_2(x)=\rho \Vert x\Vert _2\) and \(g(y)=\frac{1}{2}\Vert Hy-y^0\Vert ^2\), \(A={\mathcal {I}}\), \(B=- K\) and \(b=\mathbf {0}\). Moreover, \(f_1\), \(f_2\), g, A and B satisfy Assumption 1 (a)–(d). It follows from the choice of \(\phi \) and \(\psi \) that \(L_{\phi }=r_1\), \(L_{\psi }=r_2\), \(\upsilon _{\phi }=r_1\) and \(\upsilon _{\psi }=r_2\). Simple computations show that \(b_1>0\) and \(b_2>0\). It follows from \(\sigma < \frac{1}{2\Vert H\Vert ^2}\) that

$$\begin{aligned} g(y)- \sigma \Vert \nabla g(y)\Vert ^2&= \frac{1}{2}\Vert Hy-y^0\Vert ^2 - \sigma \Vert H^{*}(Hy-y^{0})\Vert ^2\nonumber \\&\ge (\frac{1}{2} -\sigma \Vert H\Vert ^2) \Vert Hy-y^0\Vert ^2\ge 0. \end{aligned}$$
(53)

It follows that for any \(k\ge 1\),

$$\begin{aligned}&\varTheta (\xi _{1},x_{1},x_{0},y_{1}, y_0, \lambda _{1}) \nonumber \\&\quad \ge \rho (\Vert x_k \Vert _1 - \Vert x_k\Vert _2) + g(y_k)- \sigma \Vert \nabla g(y_{k})\Vert ^2+ \frac{\eta _2}{2\beta } \Vert y_{k} -y_{k-1}\Vert ^2\nonumber \\&\qquad + (\sigma -\frac{1}{\beta \eta _0}) \Vert \nabla g(y_k)\Vert ^2 + \frac{\beta }{2}\Vert x_k -K y_k - \frac{\lambda _k}{\beta }\Vert ^2,\nonumber \\&\quad \ge a_{1} t_0 \Vert y_k - z\Vert ^2 + \frac{\beta }{2}\Vert x_k -K y_k - \frac{\lambda _k}{\beta }\Vert ^2 + t_1, \end{aligned}$$
(54)

where \(a_{1} =\frac{1}{2} -\sigma \Vert H\Vert ^2\), \(z=\frac{1}{t_0}H^* y^{0}\) and \(t_1= a_1 (\Vert y^0\Vert ^2 - t_0\Vert z\Vert ^2 ) \) with \(t_0=\lambda _{\min }(H^*H)\); the first inequality follows from (48), and the second follows from \(\inf _{x} \{\Vert x\Vert _1 - \Vert x\Vert _2\} \ge 0\) and (53). Since H has full column rank, it follows from (54) that the sequences \(\{ y_k \}_{k\in {\mathbb {N}}}\) and \(\{ x_k -K y_k - \frac{\lambda _k}{\beta }\}_{k\in {\mathbb {N}}}\) are bounded, which together with (47) implies that the sequences \(\{ \lambda _k \}_{k\in {\mathbb {N}}}\) and \(\{ x_k \}_{k\in {\mathbb {N}}}\) are bounded. Thus, the sequence \(\{\omega _k\}_{k\in {\mathbb {N}}}\) is bounded. We now point out that, for this problem, the potential function \(\varTheta (\xi ,x,{\tilde{x}},y,{\tilde{y}},\lambda )\) defined in (10) is a KL function. Indeed, it follows from the definitions of \(f_1\), \(f_2\) and \(\varTheta \) that

$$\begin{aligned}&\varTheta (\xi ,x,{\tilde{x}},y,{\tilde{y}},\lambda ) \\&\quad =\rho \Vert x\Vert _1 + I_{\varOmega }(\xi )-\langle \xi , x\rangle + \frac{1}{2}\Vert Hy-y^{0}\Vert ^2 -\langle \lambda , x-Ky-b\rangle \\&\qquad +\frac{\beta }{2}\Vert x-Ky-b\Vert ^2+ \frac{\theta _1}{2}\Vert x-{\tilde{x}}\Vert ^2 + \frac{\eta _2}{\beta } \Vert y-{\tilde{y}}\Vert ^2, \end{aligned}$$

where \(I_{\varOmega }(\xi )\) is the indicator function of the closed convex set \(\varOmega =\{\xi \in {\mathbb {R}}^{n_1}\mid \Vert \xi \Vert ^2 \le \rho ^2\}\). Clearly, \(\varOmega \) is a semi-algebraic set. By [10], the indicator function of a semi-algebraic set is semi-algebraic, and \(\Vert \cdot \Vert _{p}\) is semi-algebraic whenever p is rational, i.e., \(p=\frac{p_1}{p_2}\) with \(p_1\) and \(p_2\) positive integers. Since a finite sum of semi-algebraic functions is semi-algebraic, \(\varTheta \) is a semi-algebraic function and hence a KL function. Therefore all assumptions of Theorem 2 hold, and the conclusion follows. \(\square \)
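
For this \(\ell _1\)-\(\ell _2\) instance, the oracles needed by the method have closed forms: componentwise soft-thresholding for the proximal map of \(\rho \Vert \cdot \Vert _1\), block shrinkage for the proximal map of \(\rho \Vert \cdot \Vert _2\), and \(\rho x/\Vert x\Vert \) (or \(\mathbf {0}\) at the origin) as a subgradient of \(\rho \Vert \cdot \Vert _2\). The sketch below records these standard formulas for reference only; the function names and the stepsize argument t are ours, not notation from the paper.

import numpy as np

def prox_l1(v, t, rho):
    """Proximal map of t*rho*||.||_1: componentwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t * rho, 0.0)

def prox_l2(v, t, rho):
    """Proximal map of t*rho*||.||_2: block (group) soft-thresholding."""
    nrm = np.linalg.norm(v)
    return np.zeros_like(v) if nrm <= t * rho else (1.0 - t * rho / nrm) * v

def subgrad_l2(x, rho):
    """A subgradient of rho*||.||_2 at x; 0 is a valid choice at the origin."""
    nrm = np.linalg.norm(x)
    return rho * x / nrm if nrm > 0 else np.zeros_like(x)

def grad_g(y, H, y0):
    """Gradient of g(y) = 0.5*||H y - y0||^2."""
    return H.T @ (H @ y - y0)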

Proposition 7

Consider the total variation image restoration problem [45]:

$$\begin{aligned} \min _{x,y}\,\, \rho \Vert x\Vert _1 - \rho \Vert x\Vert _2 + \frac{1}{2}\Vert Hy-y^{0}\Vert ^2, \quad s.t. \quad x - K y= \mathbf {0}, \end{aligned}$$
(55)

where \(\rho >0\) is a regularization parameter, H is a blurring operator and \(K:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{2n}\) is the discrete gradient operator. If \(({\bar{x}}, {\bar{y}})\) is a local minimum of problem (55), then there exists \({\bar{\lambda }}\) such that \(({\bar{x}}, {\bar{y}}, {\bar{\lambda }})\) is a critical point of problem (55), i.e., \(({\bar{x}}, {\bar{y}}, {\bar{\lambda }})\) satisfies the inclusion (3).
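
Before turning to the proof, we note one common realization of the discrete gradient operator K for an m-by-m image vectorized row by row: stack forward differences along the rows and along the columns, with the last difference in each direction set to zero. The dense construction below is only an illustration; this particular discretization and boundary convention are assumptions, not the setup used in the experiments of [45].

import numpy as np

def discrete_gradient(m):
    """Dense (2*m*m) x (m*m) forward-difference operator for an m-by-m image."""
    D = np.zeros((m, m))
    D[np.arange(m - 1), np.arange(m - 1)] = -1.0   # -u(i) ...
    D[np.arange(m - 1), np.arange(1, m)] = 1.0     # ... + u(i+1); last difference is zero
    I = np.eye(m)
    Kx = np.kron(I, D)      # horizontal (within-row) differences
    Ky = np.kron(D, I)      # vertical (across-row) differences
    return np.vstack([Kx, Ky])

With such a K, \(\rho \Vert Ky\Vert _1\) is the usual anisotropic total variation term appearing in (56).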

Proof

We note that problem (55) is equivalent to the following problem

$$\begin{aligned} \min _{x,y}\,\, \rho \Vert Ky\Vert _1 - \rho \Vert Ky\Vert _2 + \frac{1}{2}\Vert Hy-y^{0}\Vert ^2. \end{aligned}$$
(56)

Since \(({\bar{x}}, {\bar{y}})\) is a local minimum of problem (55), we have \({\bar{x}}=K{\bar{y}}\), and \({\bar{y}}\) is a local minimum of problem (56). It then follows from Lemma 1 (b) that

$$\begin{aligned} \mathbf {0}\in \partial ((\rho \Vert \cdot \Vert _1 -\rho \Vert \cdot \Vert _2)\circ K) ({\bar{y}}) + H^{*}(H{\bar{y}}-y^0), \end{aligned}$$
(57)

where \(H^*\) is the adjoint operator of H. Since \(\Vert \cdot \Vert _1\) is a proper convex function and \(\Vert \cdot \Vert _2\) is a continuous convex function, it follows from Corollary 3.4 in [37] and (57) that

$$\begin{aligned} \mathbf {0}\in \partial (\rho \Vert \cdot \Vert _1\circ K) ({\bar{y}}) - \partial (\rho \Vert \cdot \Vert _2\circ K) ({\bar{y}}) + H^{*}(H{\bar{y}}-y^0). \end{aligned}$$

Note that \(\partial (\rho \Vert \cdot \Vert _1\circ K) ({\bar{y}}) = \rho K^{*}\partial \Vert K{\bar{y}}\Vert _1 \) and \(\partial (\rho \Vert \cdot \Vert _2\circ K) ({\bar{y}}) = \rho K^{*}\partial \Vert K{\bar{y}}\Vert _2\), where \(K^*\) is the adjoint operator of K. Hence, since \({\bar{x}}=K{\bar{y}}\), there exist \({\bar{\xi }}_1 \in \rho \partial \Vert {\bar{x}}\Vert _1 \) and \({\bar{\xi }}_2 \in \rho \partial \Vert {\bar{x}}\Vert _2 \) such that \(K^{*}({\bar{\xi }}_1 -{\bar{\xi }}_2) + H^{*}(H{\bar{y}}-y^0)= \mathbf {0}\). Setting \({\bar{\lambda }}= {\bar{\xi }}_1 -{\bar{\xi }}_2 \) yields

$$\begin{aligned} {\left\{ \begin{array}{ll} {\bar{\lambda }}\in \partial f_1 ({\bar{x}}) - \partial f_2 ({\bar{x}}),\\ -K^{*}{\bar{\lambda }}= \nabla g({\bar{y}}),\\ {\bar{x}} - K{\bar{y}} = \mathbf {0}. \end{array}\right. } \end{aligned}$$

This completes the proof. \(\square \)
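
As a practical aside, the three relations displayed at the end of the proof can be checked numerically for a candidate triple \(({\bar{x}}, {\bar{y}}, {\bar{\lambda }})\). The sketch below reports their violations, taking \(\rho {\bar{x}}/\Vert {\bar{x}}\Vert \) as the subgradient of \(\rho \Vert \cdot \Vert _2\) when \({\bar{x}}\ne \mathbf {0}\); this particular subgradient choice and the support tolerance are assumptions made for illustration.

import numpy as np

def critical_point_violations(x, y, lam, K, H, y0, rho, tol=1e-8):
    """Residuals of the three critical-point relations for problem (55)."""
    # Relation 1: lam + xi2 must lie in rho * subdifferential of ||.||_1 at x,
    # where xi2 = rho * x / ||x|| is the chosen subgradient of rho*||.||_2.
    nrm = np.linalg.norm(x)
    xi2 = rho * x / nrm if nrm > 0 else np.zeros_like(x)
    u = lam + xi2
    supp = np.abs(x) > tol
    v1 = max(np.max(np.abs(u[supp] - rho * np.sign(x[supp])), initial=0.0),
             np.max(np.abs(u[~supp]) - rho, initial=0.0))
    # Relation 2: -K^T lam = H^T (H y - y0).
    v2 = np.linalg.norm(K.T @ lam + H.T @ (H @ y - y0))
    # Relation 3: x = K y.
    v3 = np.linalg.norm(x - K @ y)
    return v1, v2, v3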

Cite this article

Tu, K., Zhang, H., Gao, H. et al. A hybrid Bregman alternating direction method of multipliers for the linearly constrained difference-of-convex problems. J Glob Optim 76, 665–693 (2020). https://doi.org/10.1007/s10898-019-00828-4
