Abstract
Many convex optimization problems have structured objective functions written as a sum of functions with different oracle types (e.g., full gradient, coordinate derivative, stochastic gradient) and different arithmetic complexity of these oracles. In the strongly convex case, these functions also have different condition numbers, which eventually define the iteration complexity of first-order methods and the number of oracle calls required to achieve a given accuracy. Motivated by the desire to call more expensive oracles fewer times, we consider the problem of minimizing the sum of two functions and propose a generic algorithmic framework to separate oracle complexities for each function. The latter means that the oracle for each function is called a number of times that coincides with its oracle complexity in the case when the other function is absent. Our general accelerated framework covers the setting of (strongly) convex objectives, the setting when both parts are given through a full gradient oracle, as well as the settings when one of them is given by a coordinate derivative oracle or has a finite-sum structure and is available through a stochastic gradient oracle. In the latter two cases, we obtain accelerated random coordinate descent and accelerated variance-reduced methods with oracle complexity separation.
Notes
Here and below, for simplicity, we hide numerical constants and polylogarithmic factors using non-asymptotic \({\tilde{O}}\)-notation. More precisely, \(\psi _1(\varepsilon ,\delta ) = {\tilde{O}}(\psi _2(\varepsilon ,\delta ))\) if there exist constants \(C,a,b>0\) such that, for all \(\varepsilon >0\), \(\delta \in (0,1)\), \(\psi _1(\varepsilon ,\delta ) \le C\psi _2(\varepsilon ,\delta )\ln ^a\frac{1}{\varepsilon }\ln ^b\frac{1}{\delta }\).
Source code of these experiments is available at: https://github.com/dmivilensky/Sliding-for-Kernel-SVM.
In this case, an efficient way to recalculate the partial derivatives of h(x) is as follows. From the structure of the method, we know that \(x^{new} = \alpha x^{old} + \beta e_i\), where \(e_i\) is the ith coordinate vector. Thus, given \(\left\langle A_k, x^{old} \right\rangle \), recalculating \(\left\langle A_k, x^{new} \right\rangle = \alpha \left\langle A_k, x^{old} \right\rangle + \beta [A_k]_i\) requires only O(1) additional arithmetic operations, independently of n and s.
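The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the function name `update_inner_products` and the toy dimensions are our own. It maintains a cache of the values \(\left\langle A_k, x \right\rangle\) and refreshes it after a coordinate-sparse step \(x \leftarrow \alpha x + \beta e_i\) using one multiplication and one addition per row, with no dependence on the dimension n.

```python
import numpy as np

def update_inner_products(cached, A, alpha, beta, i):
    """Update cached[k] = <A_k, x> after the step x <- alpha * x + beta * e_i.

    cached : shape (m,), current values <A_k, x_old>
    A      : shape (m, n), rows A_k
    Cost is O(1) arithmetic per row, independent of n.
    """
    return alpha * cached + beta * A[:, i]

# Sanity check against a direct O(n) recomputation on random data.
rng = np.random.default_rng(0)
m, n = 4, 10
A = rng.standard_normal((m, n))
x_old = rng.standard_normal(n)
alpha, beta, i = 0.5, 2.0, 3

cached = A @ x_old                       # initial cache, computed once
x_new = alpha * x_old + beta * np.eye(n)[i]
assert np.allclose(update_inner_products(cached, A, alpha, beta, i), A @ x_new)
```

In a coordinate-descent loop this cache is updated once per iteration, so each iteration touches only a single column of A rather than the full matrix.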
Acknowledgements
This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy Agreement (Agreement Identifier 000000D730321P5Q0002) and the Agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021, No. 70-2021-00142. The work of A. Ivanova was prepared within the framework of the HSE University Basic Research Program.
Communicated by Boris S. Mordukhovich.
Ivanova, A., Dvurechensky, P., Vorontsova, E. et al. Oracle Complexity Separation in Convex Optimization. J Optim Theory Appl 193, 462–490 (2022). https://doi.org/10.1007/s10957-022-02038-7