Block coordinate proximal gradient methods with variable Bregman functions for nonsmooth separable optimization
- 639 Downloads
In this paper, we propose a class of block coordinate proximal gradient (BCPG) methods for solving large-scale nonsmooth separable optimization problems. The proposed BCPG methods are based on the Bregman functions, which may vary at each iteration. These methods include many well-known optimization methods, such as the quasi-Newton method, the block coordinate descent method, and the proximal point method. For the proposed methods, we establish their global convergence properties when the blocks are selected by the Gauss–Seidel rule. Further, under some additional appropriate assumptions, we show that the convergence rate of the proposed methods is R-linear. We also present numerical results for a new BCPG method with variable kernels for a convex problem with separable simplex constraints.
KeywordsNonsmooth optimization Block coordinate proximal gradient methods Linear convergence Error bounds
Mathematics Subject Classification49M27 65K05 90C06 90C25 90C26 90C30
We would like to thank the associate editor and the two anonymous reviewers for their constructive comments, which improved this paper significantly. In particular, they encourage us to give the inexact block coordinate descent in Sect. 6 and propose a new method for the convex problem with separable simplex constraints in Sect. 7.
- 7.Hua, X.Q.: Studies on block coordinate gradient methods for nonlinear optimization problems with separate structures, Ph.D. thesis, Graduate school of informatics, Kyoto University, Japan (2015). http://www-optima.amp.i.kyoto-u.ac.jp/papers/doctor/2015_doctor_hua
- 9.Liu, H., Palatucci, M., Zhang, J.: Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery. In: ICML ’09 Proceedings of the 26th Annual International Conference on Machine Learning, pp. 649–656 (2009)Google Scholar
- 21.Xu, Y., Yin, W.: A block coordinate descent method for multi-convex optimization with applications to nonnegative tensor factorization and completion, Rice University CAAM Technical Report (2012)Google Scholar