Abstract
In this paper, we consider smooth convex optimization problems with simple constraints and with inexactness in the oracle information, such as the value, partial derivatives, or directional derivatives of the objective function. We introduce a unifying framework that allows one to construct different types of accelerated randomized methods for such problems and to prove convergence rate theorems for them. We focus on accelerated random block-coordinate descent, accelerated random directional search, and an accelerated random derivative-free method, and, using our framework, provide their versions for problems with inexact oracle information. Our contribution also includes an accelerated random block-coordinate descent with inexact oracle and entropy proximal setup, as well as a derivative-free version of this method. Moreover, we present an extension of our framework to strongly convex optimization problems. We also discuss an extension to the case of an inexact model of the objective function.
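To make the inexact-oracle setting concrete, the following minimal Python sketch (our illustration, not an algorithm from the paper; all names in it are ours) shows a forward-difference estimate of a directional derivative built from inexact function values. Assuming the objective is L-smooth and each function value is computed with error at most delta, the estimate carries a bias of at most L*tau/2 + 2*delta/tau, minimized by the smoothing parameter tau = 2*sqrt(delta/L); such a biased estimate is exactly the kind of inexact directional-derivative oracle the framework is designed to accommodate.

import numpy as np

def inexact_directional_derivative(f_noisy, x, e, tau):
    # Forward-difference estimate of <grad f(x), e> from inexact values.
    # For an L-smooth f with value error <= delta, the estimation error
    # is at most L*tau/2 + 2*delta/tau, minimized at tau = 2*sqrt(delta/L).
    return (f_noisy(x + tau * e) - f_noisy(x)) / tau

# Toy check on a quadratic objective with bounded value noise.
rng = np.random.default_rng(0)
L, delta = 2.0, 1e-6                            # smoothness constant and oracle error level
f = lambda x: 0.5 * L * (x @ x)                 # true objective, grad f(x) = L*x
f_noisy = lambda x: f(x) + delta * (2.0 * rng.random() - 1.0)

x = np.array([1.0, -0.5])
e = rng.normal(size=2)
e /= np.linalg.norm(e)                          # random search direction on the unit sphere
tau = 2.0 * np.sqrt(delta / L)                  # bias-optimal smoothing parameter
print(inexact_directional_derivative(f_noisy, x, e, tau), L * (x @ e))

With these parameters the printed estimate matches the exact directional derivative up to an error of order sqrt(L*delta), which is the accuracy level such a two-evaluation oracle can guarantee.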
References
Agarwal, A., Dekel, O., Xiao, L.: Optimal algorithms for online convex optimization with multi-point bandit feedback. In: COLT 2010—The 23rd Conference on Learning Theory (2010)
Allen-Zhu, Z.: Katyusha: the first direct acceleration of stochastic gradient methods. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, New York, NY, USA, pp. 1200–1205. ACM (2017). https://doi.org/10.1145/3055399.3055448, arXiv:1603.05953
Allen-Zhu, Z., Qu, Z., Richtárik, P., Yuan, Y.: Even faster accelerated coordinate descent using non-uniform sampling. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of the 33rd International Conference on Machine Learning, Proceedings of Machine Learning Research, New York, New York, USA, 20–22 Jun 2016, PMLR, vol. 48, pp. 1110–1119. http://proceedings.mlr.press/v48/allen-zhuc16.html. First appeared in arXiv:1512.09103
Bayandina, A., Gasnikov, A., Lagunovskaya, A.: Gradient-free two-points optimal method for non smooth stochastic convex optimization problem with additional small noise. Autom. Remote Control 79 (2018). https://doi.org/10.1134/S0005117918080039, arXiv:1701.03821
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. Lecture Notes (2015)
Berahas, A.S., Cao, L., Choromanski, K., Scheinberg, K.: A theoretical and empirical comparison of gradient approximations in derivative-free optimization. Found. Comput. Math. 1–54 (2021). https://doi.org/10.1007/s10208-021-09513-z
Berahas, A.S., Cao, L., Scheinberg, K.: Global convergence rate analysis of a generic line search algorithm with noise. SIAM J. Optim. 31, 1489–1518 (2021). https://doi.org/10.1137/19M1291832
Bogolubsky, L., Dvurechenskii, P., Gasnikov, A., Gusev, G., Nesterov, Y., Raigorodskii, A.M., Tikhonov, A., Zhukovskii, M.: Learning supervised pagerank with gradient-based and gradient-free optimization methods. In: Lee D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29, pp. 4914–4922. Curran Associates, Inc., (2016). http://papers.nips.cc/paper/6565-learning-supervised-pagerank-with-gradient-based-and-gradient-free-optimization-methods.pdf. arXiv:1603.00717
Cohen, M., Diakonikolas, J., Orecchia, L.: On acceleration with noise-corrupted gradients. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018, vol. 80, pp. 1019–1028. PMLR. http://proceedings.mlr.press/v80/cohen18a.html. arXiv:1805.12591
Conn, A., Scheinberg, K., Vicente, L.: Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics (2009). https://doi.org/10.1137/1.9780898718768, http://epubs.siam.org/doi/abs/10.1137/1.9780898718768
Dang, C.D., Lan, G.: Stochastic block mirror descent methods for nonsmooth and stochastic optimization. SIAM J. Optim. 25, 856–881 (2015). https://doi.org/10.1137/130936361
Danilova, M., Dvurechensky, P., Gasnikov, A., Gorbunov, E., Guminov, S., Kamzolov, D., Shibaev, I.: Recent theoretical advances in non-convex optimization, pp. 79–163. Springer International Publishing, Cham (2022). ISBN 978-3-031-00832-0. https://doi.org/10.1007/978-3-031-00832-0_3. arXiv:2012.06188
d’Aspremont, A.: Smooth optimization with approximate gradient. SIAM J. Optim. 19, 1171–1183 (2008). https://doi.org/10.1137/060676386
Devolder, O., Glineur, F., Nesterov, Y.: First-order methods of smooth convex optimization with inexact oracle. Math. Program. 146, 37–75 (2014). https://doi.org/10.1007/s10107-013-0677-5
Duchi, J.C., Jordan, M.I., Wainwright, M.J., Wibisono, A.: Optimal rates for zero-order convex optimization: the power of two function evaluations. IEEE Trans. Inf. Theory 61, 2788–2806 (2015). https://doi.org/10.1109/TIT.2015.2409256, arXiv:1312.2139
Dvinskikh, D., Ogaltsov, A., Gasnikov, A., Dvurechensky, P., Spokoiny, V.: On the line-search gradient methods for stochastic optimization. IFAC-PapersOnLine 53, 1715–1720 (2020). https://doi.org/10.1016/j.ifacol.2020.12.2284, https://www.sciencedirect.com/science/article/pii/S240589632032944X. 21st IFAC World Congress, arXiv:1911.08380
Dvinskikh, D.M., Turin, A.I., Gasnikov, A.V., Omelchenko, S.S.: Accelerated and non-accelerated stochastic gradient descent in model generality. Matematicheskie Zametki 108, 515–528 (2020). https://doi.org/10.1134/S0001434620090230
Dvurechensky, P., Dvinskikh, D., Gasnikov, A., Uribe, C.A., Nedić, A.: Decentralize and randomize: faster algorithm for Wasserstein barycenters. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., (eds.) Advances in Neural Information Processing Systems 31, NeurIPS 2018, Curran Associates, Inc., pp. 10783–10793 (2018). http://papers.nips.cc/paper/8274-decentralize-and-randomize-faster-algorithm-for-wasserstein-barycenters.pdf, arXiv:1806.03915
Dvurechensky, P., Gasnikov, A.: Stochastic intermediate gradient method for convex problems with stochastic inexact oracle. J. Optim. Theory Appl. 171, 121–145 (2016). https://doi.org/10.1007/s10957-016-0999-6
Dvurechensky, P., Gasnikov, A., Omelchenko, S., Tiurin, A.: A stable alternative to Sinkhorn’s algorithm for regularized optimal transport. In: Kononov, A., Khachay, M., Kalyagin, V.A., Pardalos, P., (eds.) Mathematical Optimization Theory and Operations Research, pp. 406–423. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-49988-4_28
Dvurechensky, P., Gorbunov, E., Gasnikov, A.: An accelerated directional derivative method for smooth stochastic convex optimization. Eur. J. Oper. Res. 290, 601–621 (2021). https://doi.org/10.1016/j.ejor.2020.08.027, http://www.sciencedirect.com/science/article/pii/S0377221720307402
Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25, 1997–2023 (2015). https://doi.org/10.1137/130949993, First appeared in arXiv:1312.5799
Frostig, R., Ge, R., Kakade, S., Sidford, A.: Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization. In: Bach, F., Blei, D., (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Lille, France, 07–09 Jul 2015, vol. 37, pp. 2540–2548. PMLR. http://proceedings.mlr.press/v37/frostig15.html
Gasnikov, A.: Universal gradient descent (2017). arXiv:1711.00394
Gasnikov, A., Dvurechensky, P., Nesterov, Y.: Stochastic gradient methods with inexact oracle. Proc. Mosc. Inst. Phys. Technol. 8, 41–91 (2016). In Russian, first appeared in arXiv:1411.4218
Gasnikov, A., Dvurechensky, P., Usmanova, I.: On accelerated randomized methods. Proc. Mosc. Inst. Phys. Technol. 8, 67–100 (2016). In Russian, first appeared in arXiv:1508.02182
Gasnikov, A., Tyurin, A.: Fast gradient descent for convex minimization problems with an oracle producing a (\(\delta \), \(L\))-model of function at the requested point. Comput. Math. Math. Phys. 59, 1085–1097 (2019). https://doi.org/10.1134/S0965542519070078
Gasnikov, A.V., Dvurechensky, P.E.: Stochastic intermediate gradient method for convex optimization problems. Dokl. Math. 93, 148–151 (2016). https://doi.org/10.1134/S1064562416020071
Gasnikov, A.V., Dvurechensky, P.E., Zhukovskii, M.E., Kim, S.V., Plaunov, S.S., Smirnov, D.A., Noskov, F.A.: About the power law of the pagerank vector component distribution. Part 2. The Buckley–Osthus model, verification of the power law for this model, and setup of real search engines. Numer. Anal. Appl. 11, 16–32 (2018). https://doi.org/10.1134/S1995423918010032
Gasnikov, A.V., Gasnikova, E.V., Dvurechensky, P.E., Mohammed, A.A.M., Chernousova, E.O.: About the power law of the pagerank vector component distribution. Part 1. Numerical methods for finding the pagerank vector. Numer. Anal. Appl. 10, 299–312 (2017). https://doi.org/10.1134/S1995423917040024
Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom. Remote Control 78, 224–234 (2017). https://doi.org/10.1134/S0005117917020035, arXiv:1509.01679
Gasnikov, A.V., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Gradient-free proximal methods with inexact oracle for convex stochastic nonsmooth optimization problems on the simplex. Autom. Remote Control 77, 2018–2034 (2016). https://doi.org/10.1134/S0005117916110114, arXiv:1412.3890
Gasnikov, A.V., Nesterov, Y.E.: Universal method for stochastic composite optimization problems. Comput. Math. Math. Phys. 58, 48–64 (2018). https://doi.org/10.7868/S0044466918010052
Ghadimi, S., Lan, G.: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23, 2341–2368 (2013). https://doi.org/10.1137/120880811, arXiv:1309.5549
Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155, 267–305 (2016). https://doi.org/10.1007/s10107-014-0846-1, arXiv:1308.6594
Gladin, E., Sadiev, A., Gasnikov, A., Dvurechensky, P., Beznosikov, A., Alkousa, M.: Solving smooth min-min and min-max problems by mixed oracle algorithms. In: Strekalovsky, A., Kochetov, Y., Gruzdeva, T., Orlov, A., (eds.) Mathematical Optimization Theory and Operations Research: Recent Trends. pp. 19–40. Springer International Publishing, Cham (2021). https://link.springer.com/chapter/10.1007/978-3-030-86433-0_2, arXiv:2103.00434
Gorbunov, E., Danilova, M., Shibaev, I., Dvurechensky, P., Gasnikov, A.: Near-optimal high probability complexity bounds for non-smooth stochastic optimization with heavy-tailed noise (2021). arXiv:2106.05958
Gorbunov, E., Dvurechensky, P., Gasnikov, A.: An accelerated method for derivative-free smooth stochastic convex optimization. SIAM J. Optim. 32(2), 1210–1238 (2022). arXiv:1802.09022
Ivanova, A., Gasnikov, A., Dvurechensky, P., Dvinskikh, D., Tyurin, A., Vorontsova, E., Pasechnyuk, D.: Oracle complexity separation in convex optimization. Optim. Methods Softw. 36(4), 720–754 (2021). https://doi.org/10.1080/10556788.2020.1712599. arXiv:2002.02706. WIAS Preprint No. 2711
Juditsky, A., Nesterov, Y.: Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization. Stoch. Syst. 4, 44–80 (2014). https://doi.org/10.1287/10-SSY010
Lan, G., Zhou, Y.: An optimal randomized incremental gradient method. Math. Program. (2017). https://doi.org/10.1007/s10107-017-1173-0
Larson, J., Menickelly, M., Wild, S.M.: Derivative-free optimization methods. Acta Numer. 28, 287–404 (2019). https://doi.org/10.1017/s0962492919000060
Lee, Y.T., Sidford, A.: Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems. In: Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, FOCS ’13, pp. 147–156. IEEE Computer Society, Washington, DC, USA (2013). https://doi.org/10.1109/FOCS.2013.24. First appeared in arXiv:1305.1922
Lin, H., Mairal, J., Harchaoui, Z.: A universal catalyst for first-order optimization. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 3384–3392. MIT Press, Cambridge, MA, USA (2015). http://dl.acm.org/citation.cfm?id=2969442.2969617
Lin, Q., Lu, Z., Xiao, L.: An accelerated proximal coordinate gradient method. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q., (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 3059–3067. Curran Associates, Inc., (2014). http://papers.nips.cc/paper/5356-an-accelerated-proximal-coordinate-gradient-method.pdf. First appeared in arXiv:1407.1296
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22, 341–362 (2012). https://doi.org/10.1137/100802001. First appeared in 2010 as CORE discussion paper 2010/2
Nesterov, Y.: Lectures on Convex Optimization, vol. 137. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91578-4
Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17, 527–566 (2017). https://doi.org/10.1007/s10208-015-9296-2. First appeared in 2011 as CORE discussion paper 2011/16
Nesterov, Y., Stich, S.U.: Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27, 110–123 (2017). https://doi.org/10.1137/16M1060182
Rogozin, A., Bochko, M., Dvurechensky, P., Gasnikov, A., Lukoshkin, V.: An accelerated method for decentralized distributed stochastic optimization over time-varying graphs. In: 2021 60th IEEE Conference on Decision and Control (CDC), pp. 3367–3373 (2021). https://doi.org/10.1109/CDC45484.2021.9683110. arXiv:2103.15598
Sadiev, A., Beznosikov, A., Dvurechensky, P., Gasnikov, A.: Zeroth-order algorithms for smooth saddle-point problems. In: Strekalovsky, A., Kochetov, Y., Gruzdeva, T., Orlov, A., (eds.) Mathematical Optimization Theory and Operations Research: Recent Trends, pp. 71–85. Springer International Publishing, Cham (2021). https://link.springer.com/chapter/10.1007/978-3-030-86433-0_5, arXiv:2009.09908
Shalev-Shwartz, S., Zhang, T.: Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization. In: Xing, E.P., Jebara, T., (eds.) Proceedings of the 31st International Conference on Machine Learning, Proceedings of Machine Learning Research, 22–24 Jun 2014, vol. 32, pp. 64–72. PMLR, Beijing, China (2014). http://proceedings.mlr.press/v32/shalev-shwartz14.html. First appeared in arXiv:1309.2375
Shamir, O.: An optimal algorithm for bandit and zero-order convex optimization with two-point feedback. J. Mach. Learn. Res. 18, 52:1–52:11 (2017). http://jmlr.org/papers/v18/16-632.html. First appeared in arXiv:1507.08752
Shibaev, I., Dvurechensky, P., Gasnikov, A.: Zeroth-order methods for noisy Hölder-gradient functions. Optim. Lett. 16(7), 2123–2143 (2022). https://doi.org/10.1007/s11590-021-01742-z. arXiv:2006.11857
Stonyakin, F., Tyurin, A., Gasnikov, A., Dvurechensky, P., Agafonov, A., Dvinskikh, D., Alkousa, M., Pasechnyuk, D., Artamonov, S., Piskunova, V.: Inexact model: a framework for optimization and variational inequalities. Optim. Methods Softw. 36(6), 1155–1201 (2021). https://doi.org/10.1080/10556788.2021.1924714. WIAS Preprint No. 2709, arXiv:2001.09013, arXiv:1902.00990
Stonyakin, F.S., Dvinskikh, D., Dvurechensky, P., Kroshnin, A., Kuznetsova, O., Agafonov, A., Gasnikov, A., Tyurin, A., Uribe, C.A., Pasechnyuk, D., Artamonov, S.: Gradient methods for problems with inexact model of the objective. In: Khachay, M., Kochetov, Y., Pardalos, P., (eds.) Mathematical Optimization Theory and Operations Research, pp. 97–114, Springer International Publishing, Cham (2019). arXiv:1902.09001
Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. J. Optim. Theory Appl. 170, 144–176 (2016). https://doi.org/10.1007/s10957-016-0867-4. First appeared in arXiv:1304.5530
Tyurin, A.: Mirror version of similar triangles method for constrained optimization problems (2017). arXiv:1705.09809
Vorontsova, E.A., Gasnikov, A.V., Gorbunov, E.A., Dvurechenskii, P.E.: Accelerated gradient-free optimization methods with a non-Euclidean proximal operator. Autom. Remote Control 80, 1487–1501 (2019). https://doi.org/10.1134/S0005117919080095
Zhang, Y., Lin, X.: Stochastic primal-dual coordinate method for regularized empirical risk minimization. In: Bach, F., Blei, D., (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, 07–09 July 2015, vol. 37, pp. 353–361. PMLR, Lille, France. http://proceedings.mlr.press/v37/zhanga15.html
Acknowledgements
The authors are very grateful to Yu. Nesterov and V. Spokoiny for fruitful discussions. Our interest in this field was initiated by the paper [48]. The research was supported by the Ministry of Science and Higher Education of the Russian Federation (Goszadaniye) No. 075-00337-20-03, project No. 0714-2020-0005.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dvurechensky, P., Gasnikov, A., Tyurin, A., Zholobov, V. (2023). Unifying Framework for Accelerated Randomized Methods in Convex Optimization. In: Belomestny, D., Butucea, C., Mammen, E., Moulines, E., Reiß, M., Ulyanov, V.V. (eds) Foundations of Modern Statistics. FMS 2019. Springer Proceedings in Mathematics & Statistics, vol 425. Springer, Cham. https://doi.org/10.1007/978-3-031-30114-8_15
DOI: https://doi.org/10.1007/978-3-031-30114-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30113-1
Online ISBN: 978-3-031-30114-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)