# Further properties of the forward–backward envelope with applications to difference-of-convex programming

## Abstract

In this paper, we further study the forward–backward envelope first introduced in Patrinos and Bemporad (Proceedings of the IEEE Conference on Decision and Control, pp 2358–2363, 2013) and Stella et al. (Comput Optim Appl, doi: 10.1007/s10589-017-9912-y, 2017) for problems whose objective is the sum of a proper closed convex function and a twice continuously differentiable possibly nonconvex function with Lipschitz continuous gradient. We derive sufficient conditions on the original problem for the corresponding forward–backward envelope to be a level-bounded and Kurdyka–Łojasiewicz function with an exponent of \(\frac{1}{2}\); these results are important for the efficient minimization of the forward–backward envelope by classical optimization algorithms. In addition, we demonstrate how to minimize some difference-of-convex regularized least squares problems by minimizing a suitably constructed forward–backward envelope. Our preliminary numerical results on randomly generated instances of large-scale \(\ell _{1-2}\) regularized least squares problems (Yin et al. in SIAM J Sci Comput 37:A536–A563, 2015) illustrate that an implementation of this approach with a limited-memory BFGS scheme usually outperforms standard first-order methods such as the nonmonotone proximal gradient method in Wright et al. (IEEE Trans Signal Process 57:2479–2493, 2009).
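To make the central object concrete, the following is a minimal sketch of the forward–backward envelope in its standard form, \(\varphi_\gamma(x) = \min_z \{ f(x) + \langle \nabla f(x), z - x\rangle + g(z) + \frac{1}{2\gamma}\Vert z - x\Vert^2 \}\), for the plain convex case \(f(x) = \frac{1}{2}\Vert Ax - b\Vert^2\) and \(g = \lambda \Vert \cdot \Vert_1\). This is an illustration only: it is not the envelope constructed in the paper for the \(\ell_{1-2}\) problem, and the function names (`fbe`, `objective`, `soft_threshold`) and the dense list-based matrix representation are assumptions made for this example.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(A: Matrix, x: Vector) -> Vector:
    """Return A x for a dense row-major matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A: Matrix, r: Vector) -> Vector:
    """Return A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft_threshold(v: Vector, t: float) -> Vector:
    """Proximal mapping of t * ||.||_1 (componentwise shrinkage)."""
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - t, 0.0) for vi in v]


def objective(A: Matrix, b: Vector, lam: float, x: Vector) -> float:
    """F(x) = 0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)


def fbe(A: Matrix, b: Vector, lam: float, gamma: float,
        x: Vector) -> Tuple[float, Vector]:
    """Evaluate the forward-backward envelope at x.

    Returns the envelope value and the forward-backward point
    z = prox_{gamma * g}(x - gamma * grad f(x)), which attains the minimum
    in the definition of the envelope.
    """
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    f_val = 0.5 * sum(ri * ri for ri in r)
    grad = rmatvec(A, r)                       # grad f(x) = A^T (A x - b)
    z = soft_threshold([xi - gamma * gi for xi, gi in zip(x, grad)],
                       gamma * lam)
    d = [zi - xi for zi, xi in zip(z, x)]
    value = (f_val
             + sum(gi * di for gi, di in zip(grad, d))   # linearization of f
             + lam * sum(abs(zi) for zi in z)            # g(z)
             + sum(di * di for di in d) / (2.0 * gamma)) # proximal term
    return value, z
```

Assuming the step size satisfies \(\gamma \le 1/L\), where \(L\) is the Lipschitz constant of \(\nabla f\) (here the largest eigenvalue of \(A^{\top}A\)), the envelope is sandwiched as \(F(z) \le \varphi_\gamma(x) \le F(x)\). This sandwich is what makes it plausible to minimize the envelope with a smooth method such as limited-memory BFGS in place of the original nonsmooth objective.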

## Keywords

Forward–backward envelope · Kurdyka–Łojasiewicz property · Difference-of-convex programming

## References

- 1. Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. **116**, 5–16 (2009)
- 2. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. **35**, 438–457 (2010)
- 3. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program. **137**, 91–129 (2013)
- 4. Auslender, A., Teboulle, M.: Asymptotic Cones and Functions in Optimization and Variational Inequalities. Springer, Berlin (2003)
- 5. Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. **17**, 1205–1223 (2006)
- 6. Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. **42**, 596–636 (2003)
- 7. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
- 8. Byrd, R.H., Chin, G.M., Nocedal, J., Oztoprak, F.: A family of second-order methods for convex \(\ell _1\)-regularized optimization. Math. Program. **159**, 435–467 (2016)
- 9. Candès, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inf. Theory **51**, 4203–4215 (2005)
- 10. Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. **20**, 89–97 (2004)
- 11. Chen, X., Lu, Z., Pong, T.K.: Penalty methods for a class of non-Lipschitz optimization problems. SIAM J. Optim. **26**, 1465–1492 (2016)
- 12. Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory **52**, 1289–1306 (2006)
- 13. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems Volume I/II. Springer, Berlin (2003)
- 14. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. **96**, 1348–1360 (2001)
- 15. Friedlander, M., Goh, G.: Efficient evaluation of scaled proximal operators. Preprint arXiv:1603.05719 (2016)
- 16. Gong, P., Zhang, C., Lu, Z., Huang, J.Z., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. Proc. Int. Conf. Mach. Learn. **28**, 37–45 (2013)
- 17. Griesse, R., Lorenz, D.A.: A semismooth Newton method for Tikhonov functionals with sparsity constraints. Inverse Probl. **24**, 035007 (2008)
- 18. Kan, C., Song, W.: The Moreau envelope function and proximal mapping in the sense of the Bregman distance. Nonlinear Anal. Theory Methods Appl. **75**, 1385–1399 (2012)
- 19. Lee, J.D., Sun, Y., Saunders, M.A.: Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. **24**, 1420–1443 (2014)
- 20. Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Preprint arXiv:1602.02915 (2016)
- 21. Lu, Z., Pong, T.K., Zhang, Y.: An alternating direction method for finding Dantzig selectors. Comput. Stat. Data Anal. **56**, 4037–4046 (2012)
- 22. Luo, Z.Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. **30**, 408–425 (1992)
- 23. Luo, Z.Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. **46**, 157–178 (1993)
- 24. Luo, Z.Q., Tseng, P.: On the convergence rate of dual ascent methods for linearly constrained convex minimization. Math. Oper. Res. **18**, 846–867 (1993)
- 25. Milzarek, A., Ulbrich, M.: A semismooth Newton method with multidimensional filter globalization for \(\ell _1\)-optimization. SIAM J. Optim. **24**, 298–333 (2014)
- 26. Nocedal, J., Wright, S.J.: Numerical Optimization, 1st edn. Springer, Berlin (1999)
- 27. Noll, D., Rondepierre, A.: Convergence of linesearch and trust-region methods using the Kurdyka–Łojasiewicz inequality. In: Bailey, D.H., Bauschke, H.H., Borwein, P., Garvan, F., Théra, M., Vanderwerff, J.D., Wolkowicz, H. (eds.) Computational and Analytical Mathematics. Springer, Berlin (2013)
- 28. Patrinos, P., Bemporad, A.: Proximal Newton methods for convex composite optimization. In: Proceedings of the IEEE Conference on Decision and Control, pp. 2358–2363 (2013)
- 29. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
- 30. Stella, L., Themelis, A., Patrinos, P.: Forward–backward quasi-Newton methods for nonsmooth optimization problems. Comput. Optim. Appl. (2017). doi: 10.1007/s10589-017-9912-y
- 31. Tseng, P., Yun, S.: A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training. Comput. Optim. Appl. **47**, 179–206 (2010)
- 32. Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. Ser. B **117**, 387–423 (2009)
- 33. Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. Ser. B **125**, 263–295 (2010)
- 34. Wang, Y., Luo, Z., Zhang, X.: New improved penalty methods for sparse reconstruction based on difference of two norms. Preprint. doi: 10.13140/RG.2.1.3256.3369 (2015)
- 35. Wright, S.J., Nowak, R., Figueiredo, M.A.T.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. **57**, 2479–2493 (2009)
- 36. Xiao, X., Li, Y., Wen, Z., Zhang, L.: Semi-smooth second-order type methods for composite convex programs. Preprint arXiv:1603.07870 (2016)
- 37. Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(\ell _{1-2}\) for compressed sensing. SIAM J. Sci. Comput. **37**, A536–A563 (2015)
- 38. Zhang, C.-H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. **38**, 894–942 (2010)
- 39. Zhou, Z., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Preprint arXiv:1512.03518 (2015)