Noisy Zeroth-Order Optimization for Non-smooth Saddle Point Problems

  • Conference paper
  • Mathematical Optimization Theory and Operations Research (MOTOR 2022)

Abstract

This paper investigates zeroth-order methods for non-smooth convex-concave saddle point problems (with an r-growth condition on the duality gap). We assume that a black-box gradient-free oracle returns an inexact function value corrupted by adversarial noise. We prove that the standard zeroth-order version of the mirror descent method is optimal both in terms of oracle call complexity and in terms of the maximum admissible noise.

The research was supported by the Russian Science Foundation (project No. 21-71-30005).

References

  1. Bartlett, P., Dani, V., Hayes, T., Kakade, S., Rakhlin, A., Tewari, A.: High-probability regret bounds for bandit online linear optimization. In: Proceedings of the 21st Annual Conference on Learning Theory, COLT 2008, pp. 335–342. Omnipress (2008)

  2. Bayandina, A.S., Gasnikov, A.V., Lagunovskaya, A.A.: Gradient-free two-point methods for solving stochastic nonsmooth convex optimization problems with small non-random noises. Autom. Remote. Control. 79(8), 1399–1408 (2018)

  3. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. SIAM (2013)

  4. Beznosikov, A., Sadiev, A., Gasnikov, A.: Gradient-free methods with inexact oracle for convex-concave stochastic saddle-point problem. In: Kochetov, Y., Bykadorov, I., Gruzdeva, T. (eds.) MOTOR 2020. CCIS, vol. 1275, pp. 105–119. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58657-7_11

  5. Bubeck, S.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends® Mach. Learn. 5(1), 1–122 (2012). https://doi.org/10.1561/2200000024

  6. Chen, P.Y., Zhang, H., Sharma, Y., Yi, J., Hsieh, C.J.: ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 15–26 (2017)

  7. Choromanski, K., Rowland, M., Sindhwani, V., Turner, R., Weller, A.: Structured evolution with compact architectures for scalable policy optimization. In: International Conference on Machine Learning, pp. 970–978. PMLR (2018)

  8. Conn, A., Scheinberg, K., Vicente, L.: Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics (2009). https://doi.org/10.1137/1.9780898718768. http://epubs.siam.org/doi/abs/10.1137/1.9780898718768

  9. Duchi, J.C., Jordan, M.I., Wainwright, M.J., Wibisono, A.: Optimal rates for zero-order convex optimization: the power of two function evaluations. IEEE Trans. Inf. Theor. 61(5), 2788–2806 (2015). https://doi.org/10.1109/TIT.2015.2409256. arXiv:1312.2139

  10. Flaxman, A.D., Kalai, A.T., McMahan, H.B.: Online convex optimization in the bandit setting: gradient descent without a gradient. arXiv preprint arXiv:cs/0408007 (2004)

  11. Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom. Remote Control 78(2), 224–234 (2017). https://doi.org/10.1134/S0005117917020035. arXiv:1509.01679

  12. Gasnikov, A., et al.: The power of first-order smooth optimization for black-box non-smooth problems. arXiv preprint arXiv:2201.12289 (2022)

  13. Gasnikov, A.V., Nesterov, Y.E.: Universal method for stochastic composite optimization problems. Comput. Math. Math. Phys. 58(1), 48–64 (2018)

  14. Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems 27 (2014)

  15. Gorbunov, É.A., Vorontsova, E.A., Gasnikov, A.V.: On the upper bound for the expectation of the norm of a vector uniformly distributed on the sphere and the phenomenon of concentration of uniform measure on the sphere. Math. Notes 106, 11–19 (2019)

  16. Juditsky, A., Nesterov, Y.: Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization. Stoch. Syst. 4(1), 44–80 (2014). https://doi.org/10.1287/10-SSY010

  17. Mania, H., Guy, A., Recht, B.: Simple random search of static linear policies is competitive for reinforcement learning. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 1805–1814 (2018)

  18. Nemirovskij, A.S., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization (1983)

  19. Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17(2), 527–566 (2017). https://doi.org/10.1007/s10208-015-9296-2. First appeared in 2011 as CORE discussion paper 2011/16

  20. Neumann, J.: Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100(1), 295–320 (1928)

  21. Polyak, B.: Introduction to Optimization. Optimization Software, New York (1987)

  22. Risteski, A., Li, Y.: Algorithms and matching lower bounds for approximately-convex optimization. Adv. Neural. Inf. Process. Syst. 29, 4745–4753 (2016)

  23. Sergeyev, Y.D., Candelieri, A., Kvasov, D.E., Perego, R.: Safe global optimization of expensive noisy black-box functions in the \(\delta \)-Lipschitz framework. Soft. Comput. 24(23), 17715–17735 (2020)

  24. Shamir, O.: An optimal algorithm for bandit and zero-order convex optimization with two-point feedback. J. Mach. Learn. Res. 18, 52:1–52:11 (2017). http://jmlr.org/papers/v18/16-632.html. First appeared in arXiv:1507.08752

  25. Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM (2021)

  26. Spall, J.C.: Introduction to Stochastic Search and Optimization, 1st edn. Wiley, New York (2003)

  27. Vasin, A., Gasnikov, A., Spokoiny, V.: Stopping rules for accelerated gradient methods with additive noise in gradient (2021)

  28. Vural, N.M., Yu, L., Balasubramanian, K., Volgushev, S., Erdogdu, M.A.: Mirror descent strikes again: optimal stochastic convex optimization under infinite noise variance (2022)

Author information

Corresponding author

Correspondence to Darina Dvinskikh.

A Auxiliary Results

This appendix presents auxiliary results to prove Theorem 1 from Sect. 3.

Lemma 1

Let \(\boldsymbol{e}\) be a random vector uniformly distributed on the Euclidean unit sphere \(\{\boldsymbol{e}:\Vert \boldsymbol{e}\Vert _2=1\}\). Then, for all \(r \in \mathbb R^d\), it holds that

$$\begin{aligned} \mathbb E_{ \boldsymbol{e}}\left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] \le {\Vert r\Vert _2}/{\sqrt{d}}. \end{aligned}$$
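
A quick numerical sanity check of this bound (an illustrative sketch, not part of the paper): sample \(\boldsymbol{e}\) uniformly on the sphere by normalizing Gaussian vectors and compare the empirical mean of \(\left| \langle \boldsymbol{e}, r \rangle \right| \) with \({\Vert r\Vert _2}/{\sqrt{d}}\).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100
r = rng.normal(size=d)                     # arbitrary test vector r in R^d

# Uniform samples on the Euclidean unit sphere: normalize Gaussian vectors.
e = rng.normal(size=(100_000, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)

lhs = np.abs(e @ r).mean()                 # Monte Carlo estimate of E|<e, r>|
rhs = np.linalg.norm(r) / np.sqrt(d)       # right-hand side of Lemma 1
print(lhs <= rhs, lhs, rhs)                # empirically lhs is about 0.8 * rhs
```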

Lemma 2

Let f(z) be \(M_2\)-Lipschitz continuous. Then for \(f^\tau (z)\) from (4), it holds

$$ \sup _{z\in \mathcal {Z}}|f^\tau (z) - f(z)|\le \tau M_2. $$
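
The bound follows directly from Lipschitz continuity. A one-line proof sketch, assuming (4) is the usual ball smoothing \(f^\tau (z)=\mathbb E_{\tilde{\boldsymbol{e}}}\left[ f(z+\tau \tilde{\boldsymbol{e}})\right] \) with \(\tilde{\boldsymbol{e}}\) drawn from the Euclidean unit ball (the precise definition is given in the main text):

$$ |f^\tau (z) - f(z)| = \left| \mathbb E_{\tilde{\boldsymbol{e}}}\left[ f(z+\tau \tilde{\boldsymbol{e}}) - f(z)\right] \right| \le \mathbb E_{\tilde{\boldsymbol{e}}}\left[ M_2 \tau \Vert \tilde{\boldsymbol{e}}\Vert _2\right] \le \tau M_2, $$

and taking the supremum over \(z\in \mathcal {Z}\) gives the claim.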

Lemma 3

The function \(f^\tau (z)\) is differentiable, with gradient

$$ \nabla f^\tau (z) = \mathbb E_{{\boldsymbol{e}}}\left[ \frac{d}{\tau } f(z+\tau {\boldsymbol{e}})\boldsymbol{e}\right] . $$
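
To make the identity concrete, here is a small Monte Carlo check (an illustrative sketch; the quadratic test function and all parameters are placeholders, not taken from the paper). For \(f(z)=\frac{1}{2}\Vert z\Vert _2^2\) one has \(\nabla f^\tau (z)=z\), so averaging \(\frac{d}{\tau } f(z+\tau \boldsymbol{e})\boldsymbol{e}\) over random unit vectors should recover \(z\).

```python
import numpy as np

rng = np.random.default_rng(1)
d, tau, n = 50, 1e-2, 200_000

def f(x):
    """Placeholder smooth test function f(x) = 0.5 * ||x||_2^2 (gradient is x)."""
    return 0.5 * np.sum(x ** 2, axis=-1)

z = rng.normal(size=d)

# Random unit vectors on the Euclidean sphere.
e = rng.normal(size=(n, d))
e /= np.linalg.norm(e, axis=1, keepdims=True)

# Monte Carlo average of (d / tau) * f(z + tau * e) * e.  Subtracting the
# constant f(z) leaves the expectation unchanged (E[e] = 0) and only reduces
# the variance of this numerical check.
coef = (d / tau) * (f(z + tau * e) - f(z))
grad_est = (coef[:, None] * e).mean(axis=0)

print(np.linalg.norm(grad_est - z) / np.linalg.norm(z))   # small relative error
```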

Lemma 4

For \(g(z,\xi ,\boldsymbol{e})\) from (3) and \(f^\tau (z)\) from (4), the following holds

  1. under Assumption 2

     $$\begin{aligned} \mathbb E_{\xi , \boldsymbol{e}}\left[ \langle g(z,\xi ,\boldsymbol{e}),r\rangle \right] \ge \langle \nabla f^\tau (z),r\rangle - \frac{d\varDelta }{\tau } \mathbb E_{\boldsymbol{e}} \left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] , \end{aligned}$$

  2. under Assumption 3

     $$\begin{aligned} \mathbb E_{\xi , \boldsymbol{e}}\left[ \langle g(z,\xi ,\boldsymbol{e}),r\rangle \right] \ge \langle \nabla f^\tau (z),r\rangle - d M_{2,\delta } \mathbb E_{\boldsymbol{e}} \left[ \left| \langle \boldsymbol{e}, r \rangle \right| \right] . \end{aligned}$$
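
Combining the first bound with Lemma 1 makes the bias caused by the adversarial noise explicit (a worked step consistent with the statements above, not a quotation from the paper):

$$ \mathbb E_{\xi , \boldsymbol{e}}\left[ \langle g(z,\xi ,\boldsymbol{e}),r\rangle \right] \ge \langle \nabla f^\tau (z),r\rangle - \frac{d\varDelta }{\tau }\cdot \frac{\Vert r\Vert _2}{\sqrt{d}} = \langle \nabla f^\tau (z),r\rangle - \frac{\sqrt{d}\varDelta }{\tau }\Vert r\Vert _2, $$

so the noise level \(\varDelta \) enters the analysis through the term \(\sqrt{d}\varDelta /\tau \).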

Lemma 5

[24, Lemma 9]. For any function \(f(\boldsymbol{e})\) that is \(M_2\)-Lipschitz w.r.t. the \(\ell _2\)-norm, if \(\boldsymbol{e}\) is uniformly distributed on the Euclidean unit sphere, then

$$ \sqrt{\mathbb E\left[ (f(\boldsymbol{e}) - \mathbb Ef(\boldsymbol{e}))^4 \right] } \le cM_2^2/d $$

for some numerical constant c.

Lemma 6

For \(g(z,\xi ,\boldsymbol{e})\) from (3), the following holds under Assumption 1

  1. and Assumption 2

     $$\begin{aligned} \mathbb E_{\xi ,\boldsymbol{e}}\left[ \Vert g(z,\xi ,\boldsymbol{e})\Vert ^2_q\right] \le c a^2_q dM_2^2 + {d^2 a_q^2\varDelta ^2}/{\tau ^2}, \end{aligned}$$

  2. and Assumption 3

     $$\begin{aligned} \mathbb E_{\xi ,\boldsymbol{e}}\left[ \Vert g(z,\xi ,\boldsymbol{e})\Vert ^2_q\right] \le c a^2_q d (M_2^2+M_{2,\delta }^2), \end{aligned}$$

where c is some numerical constant and \(\sqrt{\mathbb E\left[ \Vert \boldsymbol{e}\Vert _q^4\right] } \le a_q^2\).
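
The second-moment bounds above are exactly what is needed to run mirror descent with a gradient-free estimator. Below is a minimal illustrative sketch, not the authors' scheme from (3) and (4): a Euclidean-prox mirror-descent step, i.e. projected gradient descent-ascent with iterate averaging, on a toy bilinear saddle point problem \(f(x,y)=\langle x,y\rangle \) over two unit balls, with a two-point randomized estimator and a small bounded oracle perturbation. The objective, oracle noise, step size, and estimator form are placeholders chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(3)
dx = dy = 5
d = dx + dy

def f(z):
    """Placeholder convex-concave objective f(x, y) = <x, y> on two unit balls."""
    return z[:dx] @ z[dx:]

def oracle(z, delta=1e-5):
    """Inexact zeroth-order oracle: exact value plus a small bounded perturbation."""
    return f(z) + delta * np.cos(100.0 * z.sum())

def project(v):
    """Euclidean projection onto the unit ball."""
    nrm = np.linalg.norm(v)
    return v if nrm <= 1.0 else v / nrm

tau, iters = 1e-2, 20_000
gamma = 1.0 / np.sqrt(d * iters)           # illustrative step-size choice
z = np.concatenate([project(rng.normal(size=dx)), project(rng.normal(size=dy))])
z_avg = np.zeros(d)

for _ in range(iters):
    e = rng.normal(size=d)
    e /= np.linalg.norm(e)
    # Two-point randomized estimate of (grad_x f, grad_y f) at z.
    g = (d / (2.0 * tau)) * (oracle(z + tau * e) - oracle(z - tau * e)) * e
    # Descend in x, ascend in y; the Euclidean prox step is a projection.
    z = np.concatenate([project(z[:dx] - gamma * g[:dx]),
                        project(z[dx:] + gamma * g[dx:])])
    z_avg += z / iters

# For f(x, y) = <x, y> on unit balls, the duality gap of a point (x, y) equals
# ||x||_2 + ||y||_2; for the averaged iterate it shrinks as iters grows.
print(np.linalg.norm(z_avg[:dx]) + np.linalg.norm(z_avg[dx:]))
```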

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dvinskikh, D., Tominin, V., Tominin, I., Gasnikov, A. (2022). Noisy Zeroth-Order Optimization for Non-smooth Saddle Point Problems. In: Pardalos, P., Khachay, M., Mazalov, V. (eds) Mathematical Optimization Theory and Operations Research. MOTOR 2022. Lecture Notes in Computer Science, vol 13367. Springer, Cham. https://doi.org/10.1007/978-3-031-09607-5_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-09607-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09606-8

  • Online ISBN: 978-3-031-09607-5

  • eBook Packages: Computer Science, Computer Science (R0)
