Abstract
This chapter surveys some recent advances in the design and analysis of two classes of stochastic approximation methods: stochastic first- and zeroth-order methods for stochastic optimization. We focus on the finite-time convergence properties (i.e., iteration complexity) of these algorithms by providing bounds on the number of iterations required to achieve a certain accuracy. We point out that many of these complexity bounds are theoretically optimal for solving different classes of stochastic optimization problems.
Notes
1. A subgradient of a function \(f\) at \(x_{0}\) is a vector \(y \in \mathbb{R}^{n}\) such that \(f(x) \geq f(x_{0}) + y^{T}(x - x_{0}),\ \ \forall x \in \varTheta\). The set of all such subgradients is called the subdifferential of \(f\) at the point \(x_{0}\).
2. This assumption can be relaxed, e.g., by simply setting \(\gamma _{k} = \frac{\sqrt{2}} {M\sqrt{N}}\).
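The subgradient inequality in note 1 can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the chapter: it assumes \(f(x) = \|x\|_{1}\) on \(\mathbb{R}^{2}\), takes \(x_{0} = 0\), and uses a finite grid as a stand-in for \(\varTheta\). The helper name `is_subgradient` is hypothetical. A finite grid can refute a candidate vector \(y\) but cannot prove that it is a subgradient.

```python
import numpy as np

def is_subgradient(f, x0, y, test_points, tol=1e-12):
    """Check f(x) >= f(x0) + y^T (x - x0) at every test point (hypothetical helper)."""
    x0 = np.asarray(x0, dtype=float)
    y = np.asarray(y, dtype=float)
    return all(
        f(x) >= f(x0) + y @ (np.asarray(x, dtype=float) - x0) - tol
        for x in test_points
    )

# Assumed example: f(x) = ||x||_1 is convex but not differentiable at the origin.
def f(x):
    return float(np.sum(np.abs(x)))

# Deterministic grid on [-1, 1]^2 standing in for the feasible set Theta.
axis = np.linspace(-1.0, 1.0, 21)
points = [np.array([a, b]) for a in axis for b in axis]

x0 = np.zeros(2)
# Every entry lies in [-1, 1], so the inequality holds on the whole grid.
inside = is_subgradient(f, x0, [0.5, -1.0], points)
# The first entry exceeds 1, so the inequality fails, e.g. at x = (1, 0).
outside = is_subgradient(f, x0, [1.5, 0.0], points)
```

For this choice of \(f\), the subdifferential at the origin is the box \([-1, 1]^{2}\), which is why the first candidate passes and the second does not.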
Acknowledgements
This work was supported in part by the National Science Foundation under Grants CMMI-1000347, CMMI-1254446, and DMS-1319050, and by the Office of Naval Research under Grant N00014-13-1-0036.
Copyright information
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Ghadimi, S., Lan, G. (2015). Stochastic Approximation Methods and Their Finite-Time Convergence Properties. In: Fu, M. (eds) Handbook of Simulation Optimization. International Series in Operations Research & Management Science, vol 216. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1384-8_7
DOI: https://doi.org/10.1007/978-1-4939-1384-8_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-1383-1
Online ISBN: 978-1-4939-1384-8
eBook Packages: Business and Economics, Business and Management (R0)