Mathematical Programming, Volume 160, Issue 1–2, pp 1–32

Block coordinate proximal gradient methods with variable Bregman functions for nonsmooth separable optimization

  • Xiaoqin Hua
  • Nobuo Yamashita
Full Length Paper


In this paper, we propose a class of block coordinate proximal gradient (BCPG) methods for solving large-scale nonsmooth separable optimization problems. The proposed BCPG methods are based on the Bregman functions, which may vary at each iteration. These methods include many well-known optimization methods, such as the quasi-Newton method, the block coordinate descent method, and the proximal point method. For the proposed methods, we establish their global convergence properties when the blocks are selected by the Gauss–Seidel rule. Further, under some additional appropriate assumptions, we show that the convergence rate of the proposed methods is R-linear. We also present numerical results for a new BCPG method with variable kernels for a convex problem with separable simplex constraints.
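To illustrate the kind of update the paper studies, here is a minimal sketch (not the authors' implementation) of a Gauss–Seidel BCPG iteration specialized to the Euclidean kernel \(\psi(u) = \|u\|^2/2\), in which case each block update reduces to a standard proximal gradient step. The example uses an \(l_1\)-regularized least-squares objective, whose proximal operator (soft-thresholding) has a closed form; all names and the step size choice are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (closed form for the l1 penalty).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bcpg_l1(A, b, lam, alpha, blocks, n_iters=200):
    """Gauss-Seidel BCPG sketch for 0.5*||A x - b||^2 + lam*||x||_1.

    With the Euclidean kernel psi(u) = ||u||^2 / 2, the Bregman distance
    is D(u, v) = ||u - v||^2 / 2, so each block subproblem is solved by
    a proximal gradient (soft-thresholding) step on that block.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        for idx in blocks:            # cyclic (Gauss-Seidel) block selection
            r = A @ x - b             # residual at the current iterate
            grad_i = A[:, idx].T @ r  # partial gradient for block idx
            x[idx] = soft_threshold(x[idx] - alpha * grad_i, alpha * lam)
    return x
```

A variable-kernel BCPG method, as proposed in the paper, would replace the Euclidean distance here with a block- and iteration-dependent Bregman distance (e.g. an entropy kernel for the simplex-constrained problem of Sect. 7), changing the form of each block subproblem accordingly.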


Keywords

Nonsmooth optimization · Block coordinate proximal gradient methods · Linear convergence · Error bounds

Mathematics Subject Classification

49M27 · 65K05 · 90C06 · 90C25 · 90C26 · 90C30



Acknowledgements

We would like to thank the associate editor and the two anonymous reviewers for their constructive comments, which improved this paper significantly. In particular, they encouraged us to present the inexact block coordinate descent method in Sect. 6 and to propose a new method for the convex problem with separable simplex constraints in Sect. 7.



Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2016

Authors and Affiliations

  1. School of Mathematics and Physics, Jiangsu University of Science and Technology, Zhenjiang, China
  2. Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan
