Block coordinate proximal gradient methods with variable Bregman functions for nonsmooth separable optimization

Abstract

In this paper, we propose a class of block coordinate proximal gradient (BCPG) methods for solving large-scale nonsmooth separable optimization problems. The proposed BCPG methods are based on Bregman functions, which may vary from one iteration to the next. These methods subsume many well-known optimization methods, such as the quasi-Newton method, the block coordinate descent method, and the proximal point method. We establish global convergence of the proposed methods when the blocks are selected by the Gauss–Seidel rule, and, under some additional appropriate assumptions, we show that their convergence rate is R-linear. We also present numerical results for a new BCPG method with variable kernels applied to a convex problem with separable simplex constraints.
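
A typical step of such a method (stated here in a generic Bregman proximal gradient form, which may differ in detail from the authors' exact formulation) updates one block \(i\) of the objective \(f(x)+\sum_i g_i(x_i)\), with \(f\) smooth and each \(g_i\) nonsmooth, by solving \(x_i^{k+1} \in \arg\min_{y} \; \langle \nabla_i f(x^k), y\rangle + g_i(y) + \tfrac{1}{t_k} D_{\psi_k}(y, x_i^k)\), where \(D_{\psi_k}(y,x) = \psi_k(y) - \psi_k(x) - \langle \nabla \psi_k(x), y-x\rangle\) is the Bregman distance induced by a kernel \(\psi_k\) that may change with the iteration \(k\). The sketch below specializes this to the simplex-constrained setting mentioned in the abstract, assuming the entropy kernel \(\psi_k(y)=\sum_j y_j\log y_j\), for which the subproblem has a closed-form multiplicative (exponentiated-gradient) solution; the function names, step rule, and problem data are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def bcpg_entropy(grad_f, x_blocks, steps=lambda k: 1.0, n_iters=100):
    """Minimal sketch of a BCPG iteration with an entropy Bregman kernel,
    for a smooth convex f over a product of probability simplices
    (here each g_i is the indicator function of the i-th simplex).

    grad_f   : callable (blocks, i) -> partial gradient w.r.t. block i
    x_blocks : list of 1-D arrays, each strictly positive and summing to 1
    steps    : step-size rule t_k > 0 (a hypothetical constant rule by default)
    """
    x = [b.copy() for b in x_blocks]
    n_blocks = len(x)
    for k in range(n_iters):
        i = k % n_blocks                  # Gauss-Seidel (cyclic) block selection
        g = grad_f(x, i)                  # partial gradient for block i
        t = steps(k)                      # the kernel/step may vary with k
        # Entropy-kernel Bregman proximal step: closed-form multiplicative
        # (exponentiated-gradient) update, then renormalize onto the simplex.
        y = x[i] * np.exp(-t * (g - g.max()))   # shift avoids overflow; the
        x[i] = y / y.sum()                       # constant cancels on normalizing
    return x

# Hypothetical usage: f(x) = 0.5 * ||A @ x - b||^2 split into two simplex blocks.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((6, 8)), rng.standard_normal(6)

def grad_f(blocks, i):
    xcat = np.concatenate(blocks)
    g = A.T @ (A @ xcat - b)              # full gradient of the quadratic
    return g[:4] if i == 0 else g[4:]     # slice out block i

x0 = [np.full(4, 0.25), np.full(4, 0.25)]
sol = bcpg_entropy(grad_f, x0, steps=lambda k: 0.1, n_iters=200)
```

Choosing a quadratic kernel \(\psi_k(y)=\tfrac12 y^\top H_k y\) instead yields scaled (quasi-Newton-like) proximal gradient steps, which suggests how the special cases listed in the abstract fit the same template.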

Acknowledgments

We would like to thank the associate editor and the two anonymous reviewers for their constructive comments, which improved this paper significantly. In particular, they encouraged us to present the inexact block coordinate descent method in Sect. 6 and to propose a new method for the convex problem with separable simplex constraints in Sect. 7.

Author information

Corresponding author

Correspondence to Xiaoqin Hua.

About this article

Cite this article

Hua, X., Yamashita, N. Block coordinate proximal gradient methods with variable Bregman functions for nonsmooth separable optimization. Math. Program. 160, 1–32 (2016). https://doi.org/10.1007/s10107-015-0969-z

Keywords

  • Nonsmooth optimization
  • Block coordinate proximal gradient methods
  • Linear convergence
  • Error bounds

Mathematics Subject Classification

  • 49M27
  • 65K05
  • 90C06
  • 90C25
  • 90C26
  • 90C30