Stochastic Approximation Methods and Their Finite-Time Convergence Properties
This chapter surveys some recent advances in the design and analysis of two classes of stochastic approximation methods: stochastic first- and zeroth-order methods for stochastic optimization. We focus on the finite-time convergence properties (i.e., iteration complexity) of these algorithms by providing bounds on the number of iterations required to achieve a certain accuracy. We point out that many of these complexity bounds are theoretically optimal for solving different classes of stochastic optimization problems.
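As a rough illustration of the two classes of methods, the sketch below runs a stochastic first-order iteration and a stochastic zeroth-order iteration on a toy problem. Nothing in it is taken from the chapter: the objective f(x) = E[0.5 * ||x - xi||^2] with Gaussian xi, the step size 1/(k + 10), the Gaussian-smoothing finite-difference estimator, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def sample_loss(x, xi):
    # Hypothetical stochastic objective F(x, xi) = 0.5 * ||x - xi||^2.
    # With zero-mean xi, its expectation is minimized at x = 0.
    return 0.5 * float(np.dot(x - xi, x - xi))


def first_order_oracle(x, xi, rng):
    # Stochastic gradient of F(., xi) at x; rng is unused but kept so that
    # both oracles share one signature.
    return x - xi


def zeroth_order_oracle(x, xi, rng, mu=1e-4):
    # Gradient estimate built from two function values only:
    # a forward difference along a random Gaussian direction u,
    # evaluated with the same sample xi at both points.
    u = rng.standard_normal(x.shape)
    return (sample_loss(x + mu * u, xi) - sample_loss(x, xi)) / mu * u


def run_sa(oracle, x0, n_iters, rng, sigma=0.1):
    # Basic stochastic approximation loop: x_{k+1} = x_k - gamma_k * g_k,
    # with the assumed diminishing step size gamma_k = 1 / (k + 10).
    x = np.array(x0, dtype=float)
    for k in range(n_iters):
        xi = sigma * rng.standard_normal(x.shape)
        x = x - oracle(x, xi, rng) / (k + 10)
    return x


if __name__ == "__main__":
    x0 = np.ones(5)
    for name, oracle in [("first-order", first_order_oracle),
                         ("zeroth-order", zeroth_order_oracle)]:
        rng = np.random.default_rng(0)
        x_final = run_sa(oracle, x0, 5000, rng)
        print(name, "distance to minimizer:", np.linalg.norm(x_final))
```

Both loops share the same update rule and differ only in the oracle, which is the distinction the chapter draws between first- and zeroth-order methods. The zeroth-order estimator typically has larger variance, growing with the dimension, so its iterates tend to approach the minimizer more slowly for the same number of iterations.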
This work was supported in part by the National Science Foundation under Grants CMMI-1000347, CMMI-1254446, and DMS-1319050, and by the Office of Naval Research under Grant N00014-13-1-0036.