Abstract
This paper considers an important class of convex programming (CP) problems, namely stochastic composite optimization (SCO), whose objective function is given by the summation of general nonsmooth and smooth stochastic components. Since SCO covers nonsmooth, smooth, and stochastic CP as certain special cases, a valid lower bound on the rate of convergence for solving these problems is known from the classic complexity theory of convex programming. Note, however, that optimization algorithms achieving this lower bound had never been developed. In this paper, we show that the simple mirror-descent stochastic approximation method exhibits the best-known rate of convergence for solving these problems. Our major contribution is to introduce the accelerated stochastic approximation (AC-SA) algorithm based on Nesterov’s optimal method for smooth CP (Nesterov in Doklady AN SSSR 269:543–547, 1983; Nesterov in Math Program 103:127–152, 2005), and to show that the AC-SA algorithm can achieve the aforementioned lower bound on the rate of convergence for SCO. To the best of our knowledge, it is also the first universally optimal algorithm in the literature for solving nonsmooth, smooth, and stochastic CP problems. We illustrate the significant advantages of the AC-SA algorithm over existing methods in the context of solving a special but broad class of stochastic programming problems.
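The mirror-descent stochastic approximation method mentioned in the abstract can be illustrated with a minimal Euclidean sketch: projected stochastic subgradient steps with iterate averaging. This is not the paper's AC-SA algorithm; the ball-constrained feasible set, the toy ℓ1 objective, and all parameter values below are illustrative assumptions.

```python
import numpy as np

def mirror_descent_sa(grad_oracle, x0, radius, gamma, n_iters, rng):
    """Euclidean mirror-descent stochastic approximation:
    projected stochastic subgradient steps with a running
    average of the iterates (the averaged point is returned)."""
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(n_iters):
        g = grad_oracle(x, rng)        # stochastic subgradient G(x, xi)
        x = x - gamma * g              # subgradient step (Euclidean prox)
        norm = np.linalg.norm(x)
        if norm > radius:              # project back onto the ball X
            x *= radius / norm
        avg += (x - avg) / (t + 1)     # running average of iterates
    return avg

# Toy nonsmooth problem: minimize E[ ||x - xi||_1 ] with xi ~ N(mu, I);
# by symmetry the minimizer is the componentwise median, i.e. mu itself.
mu = np.array([1.0, -2.0, 0.5])

def oracle(x, rng):
    xi = mu + rng.standard_normal(3)
    return np.sign(x - xi)             # subgradient of ||x - xi||_1

rng = np.random.default_rng(0)
x_bar = mirror_descent_sa(oracle, np.zeros(3), radius=10.0,
                          gamma=0.05, n_iters=20000, rng=rng)
```

The iterate averaging is the essential ingredient: with a suitable constant stepsize it is what gives the robust-SA method its O(1/√N) guarantee for nonsmooth stochastic convex problems (cf. Nemirovski et al., SIAM J. Optim. 19, 2009, cited below).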
References
Auslender A., Teboulle M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)
Bauschke H.H., Borwein J.M., Combettes P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)
Becker S., Bobin J., Candès E.: NESTA: A Fast and Accurate First-Order Method for Sparse Recovery, Manuscript. California Institute of Technology, Pasadena (2009)
Ben-Tal A., Nemirovski A.: Non-euclidean restricted memory level method for large-scale convex optimization. Math. Program. 102, 407–456 (2005)
Benveniste, A., Métivier, M., Priouret, P.: Algorithmes adaptatifs et approximations stochastiques. Masson, 1987. English translation: Adaptive Algorithms and Stochastic Approximations. Springer (1993)
Bertsekas D.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
Bregman L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7, 200–217 (1967)
d’Aspremont A.: Smooth optimization with approximate gradient. SIAM J. Optim. 19, 1171–1183 (2008)
d’Aspremont A., Banerjee O., El Ghaoui L.: First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30, 56–66 (2008)
Ermoliev Y.: Stochastic quasigradient methods and their application to system optimization. Stochastics 9, 1–36 (1983)
Gaivoronski A.: Nonstationary stochastic programming problems. Kybernetika 4, 89–92 (1978)
Juditsky A., Nazin A., Tsybakov A.B., Vayatis N.: Recursive aggregation of estimators via the mirror descent algorithm with averaging. Probl. Inf. Transm. 41(4) (2005)
Juditsky, A., Nemirovski, A., Tauvel, C.: Solving Variational Inequalities with Stochastic Mirror-Prox Algorithm, Manuscript. Georgia Institute of Technology, Atlanta (2008). Submitted to SIAM J. Control Optim.
Juditsky A., Rigollet P., Tsybakov A.B.: Learning by mirror averaging. Ann. Stat. 36, 2183–2206 (2008)
Kiwiel K.C.: Proximal minimization methods with generalized Bregman functions. SIAM J. Control Optim. 35, 1142–1168 (1997)
Kleywegt A.J., Shapiro A., Homem-de-Mello T.: The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12, 479–502 (2001)
Kushner H.J., Yin G.: Stochastic Approximation and Recursive Algorithms and Applications, vol. 35 of Applications of Mathematics. Springer, New York (2003)
Lan, G., Lu, Z., Monteiro, R.D.C.: Primal-dual first-order methods with \({{\mathcal O}(1/\epsilon)}\) iteration-complexity for cone programming. Math. Program. (2009, to appear)
Lan, G., Monteiro, R.D.C.: Iteration-Complexity of First-Order Penalty Methods for Convex Programming, Manuscript. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (June, 2008)
Lan G., Monteiro, R.D.C.: Iteration-Complexity of First-Order Augmented Lagrangian Methods for Convex Programming, Manuscript. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (May, 2009)
Lan, G., Nemirovski, A., Shapiro, A.: Validation analysis of robust stochastic approximation method. Submitted to Math. Program. (2008). http://www.optimization-online.org
Lewis A.S., Wright S.J.: A Proximal Method for Composite Minimization, Manuscript. Cornell University, Ithaca (2009)
Linderoth J., Shapiro A., Wright S.: The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 142, 215–241 (2006)
Lu Z.: Smooth optimization approach for sparse covariance selection. SIAM J. Optim. 19, 1807–1827 (2009)
Lu, Z., Monteiro, R.D.C., Yuan, M.: Convex Optimization Methods for Dimension Reduction and Coefficient Estimation in Multivariate Linear Regression, Manuscript. School of ISyE, Georgia Tech, Atlanta (January, 2008)
Lu Z., Nemirovski A., Monteiro R.D.C.: Large-scale semidefinite programming via saddle point mirror-prox algorithm. Math. Program. 109, 211–237 (2007)
Mak W.K., Morton D.P., Wood R.K.: Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 24, 47–56 (1999)
Monteiro, R.D.C., Svaiter B.F.: On the Complexity of the Hybrid Proximal Extragradient Method for the Iterates and the Ergodic Mean, Manuscript. School of ISyE, Georgia Tech, Atlanta (March, 2009)
Nemirovski A.: Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15, 229–251 (2004)
Nemirovski A., Juditsky A., Lan G., Shapiro A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19, 1574–1609 (2009)
Nemirovski A., Yudin D.: Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics, vol. XV. Wiley, New York (1983)
Nesterov, Y.E.: A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). Doklady AN SSSR 269, 543–547 (1983). Translated as Soviet Math. Dokl.
Nesterov Y.E.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, Boston (2004)
Nesterov Y.E.: Smooth minimization of nonsmooth functions. Math. Program. 103, 127–152 (2005)
Nesterov Y.E.: Primal-dual subgradient methods for convex problems. Math. Program. 120, 221–259 (2009)
Nesterov, Y.E.: Gradient Methods for Minimizing Composite Objective Functions. Technical report, Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (2007, September)
Nesterov Y.E.: Smoothing technique and its applications in semidefinite optimization. Math. Program. 110, 245–259 (2007)
Peña J.: Nash equilibria computation via smoothing techniques. Optima 78, 12–13 (2008)
Pflug G.C.: Optimization of Stochastic Models: The Interface Between Simulation and Optimization. Kluwer, Boston (1996)
Polyak B.T.: New stochastic approximation type procedures. Avtomat. i Telemekh. 7, 98–107 (1990)
Polyak B.T., Juditsky A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30, 838–855 (1992)
Robbins H., Monro S.: A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951)
Rockafellar R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Ruszczyński A., Syski W.: A method of aggregate stochastic subgradients with on-line stepsize rules for convex stochastic programming problems. Math. Program. Study 28, 113–131 (1986)
Shapiro A.: Monte Carlo sampling methods. In: Ruszczyński, A., Shapiro, A. (eds) Stochastic Programming, North-Holland, Amsterdam (2003)
Shapiro A., Nemirovski A.: On complexity of stochastic programming problems. In: Jeyakumar, V., Rubinov, A.M. (eds) Continuous Optimization: Current Trends and Applications, pp. 111–144. Springer, Berlin (2005)
Spall J.C.: Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley, Hoboken (2003)
Strassen V.: The existence of probability measures with given marginals. Ann. Math. Stat. 36, 423–439 (1965)
Teboulle M.: Convergence of proximal-like algorithms. SIAM J. Optim. 7, 1069–1083 (1997)
Tseng P.: On Accelerated Proximal Gradient Methods for Convex-Concave Optimization, Manuscript. University of Washington, Seattle (May 2008)
Verweij B., Ahmed S., Kleywegt A.J., Nemhauser G., Shapiro A.: The sample average approximation method applied to stochastic routing problems: a computational study. Comput. Optim. Appl. 24, 289–333 (2003)
Additional information
The work of the author was partially supported by NSF Grants CMMI-1000347, CCF-0430644 and CCF-0808863, and ONR Grants N000140811104 and N00014-08-1-0033.
Cite this article
Lan, G. An optimal method for stochastic composite optimization. Math. Program. 133, 365–397 (2012). https://doi.org/10.1007/s10107-010-0434-y
Keywords
- Stochastic approximation
- Convex optimization
- Stochastic programming
- Complexity
- Optimal method
- Quadratic penalty method
- Large deviation