Computational Optimization and Applications, Volume 67, Issue 3, pp 489–520

Further properties of the forward–backward envelope with applications to difference-of-convex programming

Abstract

In this paper, we further study the forward–backward envelope first introduced in Patrinos and Bemporad (Proceedings of the IEEE Conference on Decision and Control, pp 2358–2363, 2013) and Stella et al. (Comput Optim Appl, doi: 10.1007/s10589-017-9912-y, 2017) for problems whose objective is the sum of a proper closed convex function and a twice continuously differentiable possibly nonconvex function with Lipschitz continuous gradient. We derive sufficient conditions on the original problem for the corresponding forward–backward envelope to be a level-bounded and Kurdyka–Łojasiewicz function with an exponent of \(\frac{1}{2}\); these results are important for the efficient minimization of the forward–backward envelope by classical optimization algorithms. In addition, we demonstrate how to minimize some difference-of-convex regularized least squares problems by minimizing a suitably constructed forward–backward envelope. Our preliminary numerical results on randomly generated instances of large-scale \(\ell _{1-2}\) regularized least squares problems (Yin et al. in SIAM J Sci Comput 37:A536–A563, 2015) illustrate that an implementation of this approach with a limited-memory BFGS scheme usually outperforms standard first-order methods such as the nonmonotone proximal gradient method in Wright et al. (IEEE Trans Signal Process 57:2479–2493, 2009).
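To make the central object concrete, the following is a minimal sketch of evaluating a forward–backward envelope. It assumes the standard definition from the works cited above: for a smooth \(f\), a proper closed convex \(g\), and a step size \(\gamma > 0\), \(F_\gamma(x) = \min_z \{ f(x) + \langle \nabla f(x), z - x\rangle + g(z) + \frac{1}{2\gamma}\Vert z - x\Vert ^2 \}\). For simplicity the sketch uses the convex \(\ell _1\) regularized least squares problem rather than the \(\ell _{1-2}\) model, so it is not the construction developed in the paper. The symbols \(f\), \(g\), \(\gamma\), all function names, and the test data are illustrative assumptions.

```python
import math

def matvec(A, x):
    """Return A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Return A.T @ r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def soft_threshold(v, t):
    """Proximal mapping of t * ||.||_1 (componentwise shrinkage)."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def objective(A, b, lam, x):
    """F(x) = 0.5 * ||A x - b||^2 + lam * ||x||_1 (assumed model problem)."""
    r = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)

def fbe(A, b, lam, gamma, x):
    """Return (F_gamma(x), z), where z is the forward-backward step from x."""
    r = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    fx = 0.5 * sum(ri * ri for ri in r)          # smooth part f(x)
    grad = rmatvec(A, r)                          # gradient of f at x
    # The minimizer over z is the proximal gradient step.
    z = soft_threshold([xi - gamma * gi for xi, gi in zip(x, grad)], gamma * lam)
    d = [zi - xi for zi, xi in zip(z, x)]
    value = (fx + sum(gi * di for gi, di in zip(grad, d))
             + lam * sum(abs(zi) for zi in z)
             + sum(di * di for di in d) / (2.0 * gamma))
    return value, z

# Illustrative data (hypothetical); gamma < 1/L with L = ||A||^2 = 4.
A = [[1.0, 0.0], [0.0, 2.0]]
b = [1.0, 1.0]
lam, gamma = 0.1, 0.2
x = [0.5, -0.3]
value, z = fbe(A, b, lam, gamma, x)
print(objective(A, b, lam, z) <= value <= objective(A, b, lam, x))
```

The final line checks the sandwich property \(F(z) \le F_\gamma (x) \le F(x)\), which holds in this convex setting whenever \(\gamma\) does not exceed the reciprocal of the Lipschitz constant of \(\nabla f\). This property is one reason minimizing the envelope can stand in for minimizing the original objective.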

Keywords

Forward–backward envelope · Kurdyka–Łojasiewicz property · Difference-of-convex programming

References

  1. Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116, 5–16 (2009)
  2. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)
  3. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)
  4. Auslender, A., Teboulle, M.: Asymptotic Cones and Functions in Optimization and Variational Inequalities. Springer, Berlin (2003)
  5. Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17, 1205–1223 (2006)
  6. Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)
  7. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
  8. Byrd, R.H., Chin, G.M., Nocedal, J., Oztoprak, F.: A family of second-order methods for convex \(\ell _1\)-regularized optimization. Math. Program. 159, 435–467 (2016)
  9. Candès, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inf. Theory 51, 4203–4215 (2005)
  10. Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89–97 (2004)
  11. Chen, X., Lu, Z., Pong, T.K.: Penalty methods for a class of non-Lipschitz optimization problems. SIAM J. Optim. 26, 1465–1492 (2016)
  12. Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306 (2006)
  13. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems Volume I/II. Springer, Berlin (2003)
  14. Fan, J., Li, R.: Variable selection via nonconvex penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
  15. Friedlander, M., Goh, G.: Efficient evaluation of scaled proximal operators. Preprint arXiv:1603.05719 (2016)
  16. Gong, P., Zhang, C., Lu, Z., Huang, J.Z., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. Proc. Int. Conf. Mach. Learn. 28, 37–45 (2013)
  17. Griesse, R., Lorenz, D.A.: A semismooth Newton method for Tikhonov functionals with sparsity constraints. Inverse Probl. 24, 035007 (2008)
  18. Kan, C., Song, W.: The Moreau envelope function and proximal mapping in the sense of the Bregman distance. Nonlinear Anal. Theory Methods Appl. 75, 1385–1399 (2012)
  19. Lee, J.D., Sun, Y., Saunders, M.A.: Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. 24, 1420–1443 (2014)
  20. Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Preprint arXiv:1602.02915 (2016)
  21. Lu, Z., Pong, T.K., Zhang, Y.: An alternating direction method for finding Dantzig selectors. Comput. Stat. Data Anal. 56, 4037–4046 (2012)
  22. Luo, Z.Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30, 408–425 (1992)
  23. Luo, Z.Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46, 157–178 (1993)
  24. Luo, Z.Q., Tseng, P.: On the convergence rate of dual ascent methods for linearly constrained convex minimization. Math. Oper. Res. 18, 846–867 (1993)
  25. Milzarek, A., Ulbrich, M.: A semismooth Newton method with multidimensional filter globalization for \(\ell _1\)-optimization. SIAM J. Optim. 24, 298–333 (2014)
  26. Nocedal, J., Wright, S.J.: Numerical Optimization, 1st edn. Springer, Berlin (1999)
  27. Noll, D., Rondepierre, A.: Convergence of linesearch and trust-region methods using the Kurdyka–Łojasiewicz inequality. In: Bailey, D.H., Bauschke, H.H., Borwein, P., Garvan, F., Théra, M., Vanderwerff, J.D., Wolkowicz, H. (eds.) Computational and Analytical Mathematics. Springer, Berlin (2013)
  28. Patrinos, P., Bemporad, A.: Proximal Newton methods for convex composite optimization. In: Proceedings of the IEEE Conference on Decision and Control, pp. 2358–2363 (2013)
  29. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
  30. Stella, L., Themelis, A., Patrinos, P.: Forward–backward quasi-Newton methods for nonsmooth optimization problems. Comput. Optim. Appl. (2017). doi: 10.1007/s10589-017-9912-y
  31. Tseng, P., Yun, S.: A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training. Comput. Optim. Appl. 47, 179–206 (2010)
  32. Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. Ser. B 117, 387–423 (2009)
  33. Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. Ser. B 125, 263–295 (2010)
  34. Wang, Y., Luo, Z., Zhang, X.: New improved penalty methods for sparse reconstruction based on difference of two norms. Preprint. doi: 10.13140/RG.2.1.3256.3369 (2015)
  35. Wright, S.J., Nowak, R., Figueiredo, M.A.T.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 57, 2479–2493 (2009)
  36. Xiao, X., Li, Y., Wen, Z., Zhang, L.: Semi-smooth second-order type methods for composite convex programs. Preprint arXiv:1603.07870 (2016)
  37. Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(\ell _{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37, A536–A563 (2015)
  38. Zhang, C.-H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)
  39. Zhou, Z., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Preprint arXiv:1512.03518 (2015)

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Department of Applied Mathematics, The Hong Kong Polytechnic University, Kowloon, Hong Kong