Abstract
This paper investigates zeroth-order methods for non-smooth convex-concave saddle point problems whose duality gap satisfies an r-growth condition. We assume that a black-box gradient-free oracle returns an inexact function value corrupted by adversarial noise. We prove that the standard zeroth-order version of the mirror descent method is optimal both in the number of oracle calls and in the maximum admissible level of adversarial noise.
The research was supported by Russian Science Foundation (project No. 21-71-30005).
References
Bartlett, P., Dani, V., Hayes, T., Kakade, S., Rakhlin, A., Tewari, A.: High-probability regret bounds for bandit online linear optimization. In: Proceedings of the 21st Annual Conference on Learning Theory, COLT 2008, pp. 335–342. Omnipress (2008)
Bayandina, A.S., Gasnikov, A.V., Lagunovskaya, A.A.: Gradient-free two-point methods for solving stochastic nonsmooth convex optimization problems with small non-random noises. Autom. Remote. Control. 79(8), 1399–1408 (2018)
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. SIAM (2013)
Beznosikov, A., Sadiev, A., Gasnikov, A.: Gradient-free methods with inexact oracle for convex-concave stochastic saddle-point problem. In: Kochetov, Y., Bykadorov, I., Gruzdeva, T. (eds.) MOTOR 2020. CCIS, vol. 1275, pp. 105–119. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58657-7_11
Bubeck, S.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends® Mach. Learn. 5(1), 1–122 (2012). https://doi.org/10.1561/2200000024
Chen, P.Y., Zhang, H., Sharma, Y., Yi, J., Hsieh, C.J.: ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 15–26 (2017)
Choromanski, K., Rowland, M., Sindhwani, V., Turner, R., Weller, A.: Structured evolution with compact architectures for scalable policy optimization. In: International Conference on Machine Learning, pp. 970–978. PMLR (2018)
Conn, A., Scheinberg, K., Vicente, L.: Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics (2009). https://doi.org/10.1137/1.9780898718768. http://epubs.siam.org/doi/abs/10.1137/1.9780898718768
Duchi, J.C., Jordan, M.I., Wainwright, M.J., Wibisono, A.: Optimal rates for zero-order convex optimization: the power of two function evaluations. IEEE Trans. Inf. Theor. 61(5), 2788–2806 (2015). https://doi.org/10.1109/TIT.2015.2409256. arXiv:1312.2139
Flaxman, A.D., Kalai, A.T., McMahan, H.B.: Online convex optimization in the bandit setting: gradient descent without a gradient. arXiv preprint arXiv:cs/0408007 (2004)
Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom. Remote Control 78(2), 224–234 (2017). https://doi.org/10.1134/S0005117917020035. arXiv:1509.01679
Gasnikov, A., et al.: The power of first-order smooth optimization for black-box non-smooth problems. arXiv preprint arXiv:2201.12289 (2022)
Gasnikov, A.V., Nesterov, Y.E.: Universal method for stochastic composite optimization problems. Comput. Math. Math. Phys. 58(1), 48–64 (2018)
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems 27 (2014)
Gorbunov, É.A., Vorontsova, E.A., Gasnikov, A.V.: On the upper bound for the expectation of the norm of a vector uniformly distributed on the sphere and the phenomenon of concentration of uniform measure on the sphere. Math. Notes 106, 11–19 (2019)
Juditsky, A., Nesterov, Y.: Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization. Stoch. Syst. 4(1), 44–80 (2014). https://doi.org/10.1287/10-SSY010
Mania, H., Guy, A., Recht, B.: Simple random search of static linear policies is competitive for reinforcement learning. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 1805–1814 (2018)
Nemirovskij, A.S., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. Wiley, New York (1983)
Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17(2), 527–566 (2017). https://doi.org/10.1007/s10208-015-9296-2. First appeared in 2011 as CORE discussion paper 2011/16
von Neumann, J.: Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100(1), 295–320 (1928)
Polyak, B.: Introduction to Optimization. Optimization Software, New York (1987)
Risteski, A., Li, Y.: Algorithms and matching lower bounds for approximately-convex optimization. Adv. Neural. Inf. Process. Syst. 29, 4745–4753 (2016)
Sergeyev, Y.D., Candelieri, A., Kvasov, D.E., Perego, R.: Safe global optimization of expensive noisy black-box functions in the \(\delta \)-Lipschitz framework. Soft Comput. 24(23), 17715–17735 (2020)
Shamir, O.: An optimal algorithm for bandit and zero-order convex optimization with two-point feedback. J. Mach. Learn. Res. 18, 52:1–52:11 (2017). http://jmlr.org/papers/v18/16-632.html. First appeared in arXiv:1507.08752
Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM (2021)
Spall, J.C.: Introduction to Stochastic Search and Optimization, 1st edn. Wiley, New York (2003)
Vasin, A., Gasnikov, A., Spokoiny, V.: Stopping rules for accelerated gradient methods with additive noise in gradient (2021)
Vural, N.M., Yu, L., Balasubramanian, K., Volgushev, S., Erdogdu, M.A.: Mirror descent strikes again: optimal stochastic convex optimization under infinite noise variance (2022)
A Auxiliary Results
This appendix presents the auxiliary results used to prove Theorem 1 from Sect. 3.
Lemma 1
Let vector \(\boldsymbol{e}\) be a random unit vector from the Euclidean unit sphere \(\{\boldsymbol{e}:\Vert \boldsymbol{e}\Vert _2=1\}\). Then it holds for all \(r \in \mathbb R^d\) that
$$\begin{aligned} \mathbb E_{\boldsymbol{e}}\left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] \le \frac{\Vert r \Vert _2}{\sqrt{d}}. \end{aligned}$$
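To see the bound numerically, here is a minimal Monte Carlo sketch (not from the paper; the dimension and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 200_000

# Uniform direction on the Euclidean unit sphere: normalize a Gaussian vector.
e = rng.standard_normal((n, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)

r = rng.standard_normal(d)                 # arbitrary fixed vector
lhs = np.abs(e @ r).mean()                 # Monte Carlo estimate of E|<e, r>|
rhs = np.linalg.norm(r) / np.sqrt(d)       # the bound of Lemma 1

print(f"E|<e,r>| ~ {lhs:.4f} <= {rhs:.4f}")
```

The empirical value comes out a constant factor (about \(\sqrt{2/\pi }\)) below the bound, which is consistent with Jensen's inequality applied to \(\mathbb E\left[ \langle \boldsymbol{e},r\rangle ^2\right] = \Vert r\Vert _2^2/d\).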
Lemma 2
Let f(z) be \(M_2\)-Lipschitz continuous. Then for \(f^\tau (z)\) from (4), it holds for all z that
$$\begin{aligned} \left| f^\tau (z) - f(z)\right| \le \tau M_2. \end{aligned}$$
Lemma 3
Function \(f^\tau (z)\) is differentiable with the following gradient
$$\begin{aligned} \nabla f^\tau (z) = \frac{d}{\tau }\, \mathbb E_{\boldsymbol{e}}\left[ f(z+\tau \boldsymbol{e})\boldsymbol{e}\right] . \end{aligned}$$
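For intuition, Lemma 3 is what makes randomized zeroth-order gradient estimators work: \((d/\tau ) f(z+\tau \boldsymbol{e})\boldsymbol{e}\) is an unbiased estimate of \(\nabla f^\tau (z)\). The sketch below (illustrative names only; this is not the exact estimator \(g\) from (3), which also carries the stochastic argument \(\xi \)) uses the symmetrized two-point form, which has the same expectation by the symmetry \(\boldsymbol{e}\mapsto -\boldsymbol{e}\) but far lower variance:

```python
import numpy as np

def zo_grad_estimate(f, z, tau, rng):
    """Two-point estimate of grad f^tau(z); unbiased by Lemma 3 and the
    symmetry e -> -e: E[(d/(2 tau)) (f(z + tau e) - f(z - tau e)) e] = grad f^tau(z)."""
    d = z.shape[0]
    e = rng.standard_normal(d)
    e /= np.linalg.norm(e)                       # uniform direction on the sphere
    return (d / (2 * tau)) * (f(z + tau * e) - f(z - tau * e)) * e

rng = np.random.default_rng(1)
f = lambda x: np.abs(x).sum()                    # non-smooth test function (l1 norm)
z = np.ones(50)
g = np.mean([zo_grad_estimate(f, z, 0.1, rng) for _ in range(20_000)], axis=0)
# g is close to the all-ones vector, the (sub)gradient of the l1 norm at z.
```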
Lemma 4
For \(g(z,\xi ,\boldsymbol{e})\) from (3) and \(f^\tau (z)\) from (4), the following holds:

1. under Assumption 2,
$$\begin{aligned} \mathbb E_{\xi , \boldsymbol{e}}\left[ \langle g(z,\xi ,\boldsymbol{e}),r\rangle \right] \ge \langle \nabla f^\tau (z),r\rangle - \frac{d\varDelta }{\tau } \mathbb E_{\boldsymbol{e}} \left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] ; \end{aligned}$$

2. under Assumption 3,
$$\begin{aligned} \mathbb E_{\xi , \boldsymbol{e}}\left[ \langle g(z,\xi ,\boldsymbol{e}),r\rangle \right] \ge \langle \nabla f^\tau (z),r\rangle - d M_{2,\delta } \mathbb E_{\boldsymbol{e}} \left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] . \end{aligned}$$
Lemma 5
[24, Lemma 9]. For any function \(f(\boldsymbol{e})\) which is M-Lipschitz w.r.t. the \(\ell _2\)-norm, it holds that if \(\boldsymbol{e}\) is uniformly distributed on the Euclidean unit sphere, then
$$\begin{aligned} \mathbb E\left[ \left( f(\boldsymbol{e}) - \mathbb E\left[ f(\boldsymbol{e})\right] \right) ^2\right] \le \frac{c M^2}{d} \end{aligned}$$
for some numerical constant c.
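A quick illustration (a sketch, not from the paper): for the linear function \(f(\boldsymbol{e}) = \langle v, \boldsymbol{e}\rangle \), which is \(\Vert v\Vert _2\)-Lipschitz, the variance equals \(M^2/d\) exactly, so c = 1 for this choice of f:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 200, 100_000

e = rng.standard_normal((n, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)    # uniform on the unit sphere

M = 3.0
v = rng.standard_normal(d)
v *= M / np.linalg.norm(v)                       # f(e) = <v, e> is M-Lipschitz

vals = e @ v
print(vals.var(), M**2 / d)                      # both ~ 0.045
```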
Lemma 6
For \(g(z,\xi ,\boldsymbol{e})\) from (3), the following holds under Assumption 1

1. and Assumption 2,
$$\begin{aligned} \mathbb E_{\xi ,\boldsymbol{e}}\left[ \Vert g(z,\xi ,\boldsymbol{e})\Vert ^2_q\right] \le c a^2_q dM_2^2 + \frac{d^2 a_q^2\varDelta ^2}{\tau ^2}, \end{aligned}$$

2. and Assumption 3,
$$\begin{aligned} \mathbb E_{\xi ,\boldsymbol{e}}\left[ \Vert g(z,\xi ,\boldsymbol{e})\Vert ^2_q\right] \le c a^2_q d (M_2^2+M_{2,\delta }^2), \end{aligned}$$
where c is some numerical constant and \(\sqrt{\mathbb E\left[ \Vert \boldsymbol{e}\Vert _q^4\right] } \le a_q^2\).
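As a sanity check on the constant \(a_q\), \(\sqrt{\mathbb E\left[ \Vert \boldsymbol{e}\Vert _q^4\right] }\) can be estimated by simulation (an illustrative sketch; the dimension and the values of q are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 100, 100_000

e = rng.standard_normal((n, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)    # uniform on the unit sphere

for q in (2.0, np.inf):
    norms_q = np.linalg.norm(e, ord=q, axis=1)
    est = np.sqrt(np.mean(norms_q**4))           # estimate of sqrt(E ||e||_q^4)
    print(f"q = {q}: a_q^2 may be taken as {est:.3f}")   # exactly 1 for q = 2
```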