
An optimal method for stochastic composite optimization

  • Full Length Paper
  • Series A
  • Mathematical Programming

Abstract

This paper considers an important class of convex programming (CP) problems, namely stochastic composite optimization (SCO), in which the objective function is given by the sum of general nonsmooth and smooth stochastic components. Since SCO covers nonsmooth, smooth, and stochastic CP as special cases, a valid lower bound on the rate of convergence for solving these problems is known from the classic complexity theory of convex programming. However, optimization algorithms that achieve this lower bound had never been developed. In this paper, we first show that the simple mirror-descent stochastic approximation method exhibits the best-known rate of convergence for solving these problems. Our major contribution is to introduce the accelerated stochastic approximation (AC-SA) algorithm, based on Nesterov’s optimal method for smooth CP (Nesterov in Doklady AN SSSR 269:543–547, 1983; Nesterov in Math Program 103:127–152, 2005), and to show that the AC-SA algorithm achieves the aforementioned lower bound on the rate of convergence for SCO. To the best of our knowledge, it is also the first universally optimal algorithm in the literature for solving nonsmooth, smooth, and stochastic CP problems. We illustrate the significant advantages of the AC-SA algorithm over existing methods in the context of solving a special but broad class of stochastic programming problems.
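Accelerated stochastic approximation schemes of the kind the abstract describes maintain three sequences: a "middle" search point where the stochastic gradient is queried, a plain SA iterate, and an aggregated iterate that is returned as the output. The following one-dimensional sketch illustrates that structure; the function name `ac_sa`, the stepsize rule for `gamma`, and the noise cap are simplifications chosen for illustration, not the paper's exact AC-SA policy.

```python
import math
import random

def ac_sa(grad_oracle, x0, steps, L, sigma=0.0):
    """One-dimensional accelerated-SA sketch (simplified stepsizes).

    grad_oracle(x) returns a (possibly noisy) gradient estimate,
    L is a Lipschitz constant of the smooth gradient,
    sigma a noise-level estimate used to damp the stepsize.
    """
    x = x_ag = x0
    for t in range(1, steps + 1):
        alpha = 2.0 / (t + 1)                  # aggregation weight
        gamma = t / (2.0 * L)                  # smooth-case stepsize
        if sigma > 0:                          # cap the stepsize under noise
            gamma = min(gamma, 1.0 / (sigma * math.sqrt(steps)))
        x_md = (1 - alpha) * x_ag + alpha * x  # "middle" search point
        g = grad_oracle(x_md)                  # stochastic gradient at x_md
        x = x - gamma * g                      # plain SA (gradient) step
        x_ag = (1 - alpha) * x_ag + alpha * x  # aggregated output iterate
    return x_ag

# Deterministic smooth case: f(x) = x^2 / 2, exact gradient; the
# aggregated iterate converges to the minimizer 0 at an O(1/t^2) rate.
print(abs(ac_sa(lambda x: x, 5.0, 200, L=1.0)))

# Stochastic oracle: same gradient plus zero-mean Gaussian noise.
noisy = lambda x: x + random.gauss(0.0, 0.1)
print(ac_sa(noisy, 5.0, 2000, L=1.0, sigma=0.1))
```

When `grad_oracle` is exact and `sigma = 0`, this reduces to a Nesterov-type accelerated gradient method; the stochastic case is where the damped stepsize matters.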


References

  1. Auslender A., Teboulle M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)

  2. Bauschke H.H., Borwein J.M., Combettes P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)

3. Becker S., Bobin J., Candès E.: NESTA: A Fast and Accurate First-Order Method for Sparse Recovery, Manuscript. California Institute of Technology, Pasadena (2009)

  4. Ben-Tal A., Nemirovski A.: Non-euclidean restricted memory level method for large-scale convex optimization. Math. Program. 102, 407–456 (2005)

  5. Benveniste, A., Métivier, M., Priouret, P.: Algorithmes adaptatifs et approximations stochastiques. Masson, 1987. English translation: Adaptive Algorithms and Stochastic Approximations. Springer (1993)

  6. Bertsekas D.: Nonlinear Programming, 2nd edn. Athena Scientific, New York (1999)

7. Bregman L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Phys. 7, 200–217 (1967)

  8. d’Aspremont A.: Smooth optimization with approximate gradient. SIAM J. Optim. 19, 1171–1183 (2008)

9. d’Aspremont A., Banerjee O., El Ghaoui L.: First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30, 56–66 (2008)

  10. Ermoliev Y.: Stochastic quasigradient methods and their application to system optimization. Stochastics 9, 1–36 (1983)

  11. Gaivoronski A.: Nonstationary stochastic programming problems. Kybernetika 4, 89–92 (1978)

12. Juditsky A., Nazin A., Tsybakov A.B., Vayatis N.: Recursive aggregation of estimators via the mirror descent algorithm with averaging. Probl. Inf. Transm. 41(4) (2005)

  13. Juditsky, A., Nemirovski, A., Tauvel, C.: Solving Variational Inequalities with Stochastic Mirror-Prox Algorithm, Manuscript. Georgia Institute of Technology, Atlanta (2008). Submitted to SIAM J. Control Optim.

  14. Juditsky A., Rigollet P., Tsybakov A.B.: Learning by mirror averaging. Ann. Stat. 36, 2183–2206 (2008)

15. Kiwiel K.C.: Proximal minimization methods with generalized Bregman functions. SIAM J. Control Optim. 35, 1142–1168 (1997)

16. Kleywegt A.J., Shapiro A., Homem-de-Mello T.: The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12, 479–502 (2001)

17. Kushner H.J., Yin G.G.: Stochastic Approximation and Recursive Algorithms and Applications, vol. 35 of Applications of Mathematics. Springer, New York (2003)

  18. Lan, G., Lu, Z., Monteiro, R.D.C.: Primal-dual first-order methods with \({{\mathcal O}(1/\epsilon)}\) iteration-complexity for cone programming. Math. Program. (2009, to appear)

  19. Lan, G., Monteiro, R.D.C.: Iteration-Complexity of First-Order Penalty Methods for Convex Programming, Manuscript. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (June, 2008)

  20. Lan G., Monteiro, R.D.C.: Iteration-Complexity of First-Order Augmented Lagrangian Methods for Convex Programming, Manuscript. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (May, 2009)

  21. Lan, G., Nemirovski, A., Shapiro, A.: Validation analysis of robust stochastic approximation method. submitted to Math. Program. (2008). http://www.optimization-online.org

  22. Lewis A.S., Wright S.J.: A Proximal Method for Composite Minimization, Manuscript. Cornell University, Ithaca (2009)

  23. Linderoth J., Shapiro A., Wright S.: The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 142, 215–241 (2006)

  24. Lu Z.: Smooth optimization approach for sparse covariance selection. SIAM J. Optim. 19, 1807–1827 (2009)

  25. Lu, Z., Monteiro, R.D.C., Yuan, M.: Convex Optimization Methods for Dimension Reduction and Coefficient Estimation in Multivariate Linear Regression, Manuscript. School of ISyE, Georgia Tech, Atlanta (January, 2008)

  26. Lu Z., Nemirovski A., Monteiro R.D.C.: Large-scale semidefinite programming via saddle point mirror-prox algorithm. Math. Program. 109, 211–237 (2007)

27. Mak W.K., Morton D.P., Wood R.K.: Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 24, 47–56 (1999)

  28. Monteiro, R.D.C., Svaiter B.F.: On the Complexity of the Hybrid Proximal Extragradient Method for the Iterates and the Ergodic Mean, Manuscript. School of ISyE, Georgia Tech, Atlanta (March, 2009)

29. Nemirovski A.: Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15, 229–251 (2004)

  30. Nemirovski A., Juditsky A., Lan G., Shapiro A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19, 1574–1609 (2009)

  31. Nemirovski A., Yudin D.: Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics, vol. XV. Wiley, New York (1983)

  32. Nesterov, Y.E.: A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). Doklady AN SSSR 269, 543–547 (1983). Translated as Soviet Math. Dokl.

  33. Nesterov Y.E.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Massachusetts (2004)

  34. Nesterov Y.E.: Smooth minimization of nonsmooth functions. Math. Program. 103, 127–152 (2005)

35. Nesterov Y.E.: Primal-dual subgradient methods for convex problems. Math. Program. 120, 221–259 (2009)

  36. Nesterov, Y.E.: Gradient Methods for Minimizing Composite Objective Functions. Technical report, Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (2007, September)

  37. Nesterov Y.E.: Smoothing technique and its applications in semidefinite optimization. Math. Program. 110, 245–259 (2007)

  38. Peña J.: Nash equilibria computation via smoothing techniques. Optima 78, 12–13 (2008)

  39. Pflug, G.C.: Optimization of Stochastic Models. In: The Interface Between Simulation and Optimization. Kluwer, Boston (1996)

40. Polyak B.T.: New stochastic approximation type procedures. Avtomat. i Telemekh. 7, 98–107 (1990)

  41. Polyak B.T., Juditsky A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30, 838–855 (1992)

  42. Robbins H., Monro S.: A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951)

  43. Rockafellar R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

44. Ruszczyński A., Syski W.: A method of aggregate stochastic subgradients with on-line stepsize rules for convex stochastic programming problems. Math. Program. Study 28, 113–131 (1986)

45. Shapiro A.: Monte Carlo sampling methods. In: Ruszczyński, A., Shapiro, A. (eds) Stochastic Programming, North-Holland, Amsterdam (2003)

  46. Shapiro A., Nemirovski A.: On complexity of stochastic programming problems. In: Jeyakumar, V., Rubinov, A.M. (eds) Continuous Optimization: Current Trends and Applications, pp. 111–144. Springer, Berlin (2005)

  47. Spall J.C.: Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley, Hoboken (2003)

48. Strassen V.: The existence of probability measures with given marginals. Ann. Math. Stat. 36, 423–439 (1965)

  49. Teboulle M.: Convergence of proximal-like algorithms. SIAM J. Optim. 7, 1069–1083 (1997)

  50. Tseng P.: On Accelerated Proximal Gradient Methods for Convex-Concave Optimization, Manuscript. University of Washington, Seattle (May 2008)

  51. Verweij B., Ahmed S., Kleywegt J.A., Nemhauser G., Shapiro A.: The sample average approximation method applied to stochastic routing problems: a computational study. Comput. Optim. Appl. 24, 289–333 (2003)


Author information

Correspondence to Guanghui Lan.

Additional information

The work of the author was partially supported by NSF Grants CMMI-1000347, CCF-0430644 and CCF-0808863, and ONR Grants N000140811104 and N00014-08-1-0033.

About this article

Cite this article

Lan, G. An optimal method for stochastic composite optimization. Math. Program. 133, 365–397 (2012). https://doi.org/10.1007/s10107-010-0434-y

