Abstract
Our main goal in this paper is to show that one can skip gradient computations for gradient-descent-type methods applied to certain structured convex programming (CP) problems. To this end, we first present an accelerated gradient sliding (AGS) method for minimizing the summation of two smooth convex functions with different Lipschitz constants. We show that the AGS method can skip the gradient computation for one of these smooth components without slowing down the overall optimal rate of convergence. This result is much sharper than the classic black-box CP complexity results, especially when the difference between the two Lipschitz constants associated with these components is large. We then consider an important class of bilinear saddle point problems whose objective function is given by the summation of a smooth component and a nonsmooth one with a bilinear saddle point structure. Using the aforementioned AGS method for smooth composite optimization and Nesterov's smoothing technique, we show that one only needs \({{\mathcal{O}}}(1/\sqrt{\varepsilon })\) gradient computations for the smooth component while still preserving the optimal \({{\mathcal{O}}}(1/\varepsilon )\) overall iteration complexity for solving these saddle point problems. We demonstrate that even more significant savings on gradient computations can be obtained for strongly convex smooth and bilinear saddle point problems.
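To make the gradient-skipping idea concrete, the following is a minimal, non-accelerated sketch of the mechanism that gradient sliding exploits — it is not the AGS method of this paper. It minimizes \(\psi(x) = f(x) + h(x)\), evaluating \(\nabla f\) only once per outer iteration while the inner loop takes several gradient steps that reuse the cached \(\nabla f\). The toy quadratics `f`, `h`, the iteration counts, and all parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative sketch of skipping gradient computations (NOT the AGS method):
# minimize psi(x) = f(x) + h(x), calling grad_f once per outer iteration and
# letting the inner loop reuse the cached grad_f.  Toy data below is assumed.

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((5, n))      # f(x) = 0.5 ||A x||^2   ("expensive" gradient oracle)
B = rng.standard_normal((200, n))    # h(x) = 0.5 ||B x - b||^2 (cheap gradient, larger Lipschitz constant)
b = rng.standard_normal(200)

grad_f = lambda x: A.T @ (A @ x)
grad_h = lambda x: B.T @ (B @ x - b)
psi = lambda x: 0.5 * np.sum((A @ x) ** 2) + 0.5 * np.sum((B @ x - b) ** 2)

L_f = np.linalg.norm(A.T @ A, 2)     # Lipschitz constant of grad_f (spectral norm)
L_h = np.linalg.norm(B.T @ B, 2)     # Lipschitz constant of grad_h

x = np.zeros(n)
f_grad_evals = 0
for _ in range(30):
    g_f = grad_f(x)                  # the ONLY grad_f call in this outer iteration
    f_grad_evals += 1
    x_k = x.copy()
    for _ in range(10):
        # Inner loop: gradient descent on the prox-linearized surrogate
        #   <g_f, x - x_k> + h(x) + (L_f/2) ||x - x_k||^2,
        # which upper-bounds psi and needs no fresh grad_f evaluations.
        x = x - (g_f + grad_h(x) + L_f * (x - x_k)) / (L_f + L_h)

print(f_grad_evals, psi(x))          # 30 grad_f calls vs. 300 grad_h calls
```

Even this crude variant performs ten times fewer `grad_f` than `grad_h` evaluations; the AGS method of the paper replaces both loops with accelerated schemes to retain the optimal overall rate.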
Notes
See Sect. 7.1.3 in [8] for the settings on relaxation and inertial parameters; note that \(\rho =2\) and \(\alpha =1/3\) are not applicable for problem (4.1). The stepsize parameters in [8] are chosen as follows: \(\sigma =1/\Vert K\Vert\), and \(\tau\) is the largest value that satisfies convergence conditions (16), (23), or (26) in [8].
References
Arrow, K., Hurwicz, L., Uzawa, H.: Studies in Linear and Non-linear Programming. Stanford Mathematical Studies in the Social Sciences. Stanford University Press (1958). http://books.google.com/books?id=jWi4AAAAIAAJ
Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16(3), 697–725 (2006)
Becker, S., Bobin, J., Candès, E.: NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011)
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)
Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20(1), 89–97 (2004)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numerica 25, 161–319 (2016)
Chambolle, A., Pock, T.: On the ergodic convergence rates of a first-order primal-dual algorithm. Math. Program. 159(1), 253–287 (2016)
Chen, Y., Lan, G., Ouyang, Y.: Accelerated schemes for a class of variational inequalities. arXiv preprint arXiv:1403.4164 (2014)
Chen, Y., Lan, G., Ouyang, Y.: Optimal primal-dual methods for a class of saddle point problems. SIAM J. Optim. 24(4), 1779–1814 (2014)
d’Aspremont, A.: Smooth optimization with approximate gradient. SIAM J. Optim. 19(3), 1171–1183 (2008)
Eckstein, J., Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)
Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010)
Ghadimi, S., Lan, G.: Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, I: a generic algorithmic framework. SIAM J. Optim. 22(4), 1469–1492 (2012)
He, B., Yuan, X.: Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imaging Sci. 5(1), 119–149 (2012)
He, B., Yuan, X.: On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700–709 (2012)
He, N., Juditsky, A., Nemirovski, A.: Mirror prox algorithm for multi-term composite minimization and alternating directions. arXiv preprint arXiv:1311.1098 (2013)
He, Y., Monteiro, R.D.: Accelerating block-decomposition first-order methods for solving generalized saddle-point and Nash equilibrium problems. Optimization-online preprint (2013)
He, Y., Monteiro, R.D.: An accelerated HPE-type algorithm for a class of composite convex-concave saddle-point problems. Submitted to SIAM J. Optim. (2014)
Hoda, S., Gilpin, A., Pena, J., Sandholm, T.: Smoothing techniques for computing Nash equilibria of sequential games. Math. Oper. Res. 35(2), 494–512 (2010)
Juditsky, A., Nemirovski, A., Tauvel, C.: Solving variational inequalities with stochastic mirror-prox algorithm. Stoch. Syst. 1, 17–58 (2011)
Lan, G.: Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization. Math. Program. 149(1), 1–45 (2015)
Lan, G.: Gradient sliding for composite optimization. Math. Program. 159(1–2), 201–235 (2016)
Lan, G., Lu, Z., Monteiro, R.D.: Primal-dual first-order methods with \(\cal{O}(1/\epsilon )\) iteration-complexity for cone programming. Math. Program. 126(1), 1–29 (2011)
Lorenz, D.A., Pock, T.: An inertial forward-backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51(2), 311–325 (2015)
Monteiro, R.D., Svaiter, B.F.: Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers. SIAM J. Optim. 23(1), 475–507 (2013)
Nemirovski, A.: Prox-method with rate of convergence \({O}(1/t)\) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1), 229–251 (2004)
Nemirovski, A., Yudin, D.: Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics, Wiley, New York (1983)
Nesterov, Y.: Excessive gap technique in nonsmooth convex minimization. SIAM J. Optim. 16(1), 235–249 (2005)
Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)
Nesterov, Y.E.: A method for unconstrained convex minimization problem with the rate of convergence \(O(1/k^2)\). Doklady AN SSSR 269, 543–547 (1983)
Nesterov, Y.E.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, Norwell (2004)
Ouyang, H., He, N., Tran, L., Gray, A.G.: Stochastic alternating direction method of multipliers. In: Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 80–88 (2013)
Ouyang, Y., Chen, Y., Lan, G., Eduardo Pasiliao, J.: An accelerated linearized alternating direction method of multipliers. SIAM J. Imaging Sci. 8(1), 644–681 (2015)
Ouyang, Y., Xu, Y.: Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Math. Program. 185(1), 1–35 (2021)
Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. Submitted to SIAM J. Optim. (2008)
Zhu, M., Chan, T.: An efficient primal-dual hybrid gradient algorithm for total variation image restoration. UCLA CAM Report, pp. 08–34 (2008)
Guanghui Lan is partially supported by National Science Foundation Grants 1319050, 1637473 and 1637474, and Office of Naval Research Grant N00014-16-1-2802. Yuyuan Ouyang is partially supported by US Dept. of the Air Force Grant FA9453-19-1-0078 and Office of Naval Research Grant N00014-19-1-2295.
Lan, G., Ouyang, Y. Accelerated gradient sliding for structured convex optimization. Comput Optim Appl 82, 361–394 (2022). https://doi.org/10.1007/s10589-022-00365-z