Accelerated Randomized Mirror Descent Algorithms for Composite Non-strongly Convex Optimization

  • Le Thi Khanh Hien
  • Cuong V. Nguyen
  • Huan Xu
  • Canyi Lu
  • Jiashi Feng

Abstract

We consider the problem of minimizing the sum of two terms: the average of a large number of smooth convex component functions, and a general, possibly non-differentiable, convex function. Although many methods have been proposed to solve this problem under the assumption that the sum is strongly convex, few support the non-strongly convex case. Adding a small quadratic regularization is a common device for tackling non-strongly convex problems; however, it may destroy the sparsity of solutions or degrade the performance of the algorithms. Avoiding this device, we propose an accelerated randomized mirror descent method that solves the problem without the strong convexity assumption. Our method extends the deterministic accelerated proximal gradient methods of Paul Tseng and can be applied even when proximal points are computed inexactly. We also propose a scheme for solving the problem when the component functions are non-smooth.
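In the setting of the abstract, the objective has the composite finite-sum form \( \min_{x} F(x) := \frac{1}{n}\sum_{i=1}^{n} f_i(x) + g(x) \), where each \(f_i\) is smooth and convex, \(g\) is convex but possibly non-differentiable, and \(F\) need not be strongly convex.

As a minimal illustrative sketch of this problem class (not the paper's randomized mirror descent algorithm), the following Python code runs a deterministic, Tseng/FISTA-style accelerated proximal gradient method on a least-squares-plus-\(\ell_1\) instance, with the Euclidean proximal operator computed exactly in place of the general Bregman/mirror and inexact-proximal setting the paper studies. All function names and the choice of instance are assumptions made for illustration.

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t * ||.||_1 (exact here; the paper also allows
        # proximal points to be computed inexactly).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def accelerated_prox_grad(A, b, lam, n_iters=500):
        # Tseng/FISTA-style accelerated proximal gradient for
        #   min_x (1/(2n)) * ||A x - b||^2 + lam * ||x||_1,
        # i.e. f_i(x) = (a_i^T x - b_i)^2 / 2 and g(x) = lam * ||x||_1.
        n, d = A.shape
        L = np.linalg.norm(A, 2) ** 2 / n   # Lipschitz constant of the averaged gradient
        x = np.zeros(d)
        z = x.copy()                        # extrapolated point
        t = 1.0
        for _ in range(n_iters):
            grad = A.T @ (A @ z - b) / n    # gradient of the smooth average at z
            x_new = soft_threshold(z - grad / L, lam / L)
            t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
            x, t = x_new, t_new
        return x

    # Usage on a small synthetic sparse-recovery instance:
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 50))
    x_true = np.zeros(50)
    x_true[:5] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    x_hat = accelerated_prox_grad(A, b, lam=0.1)

Replacing the full gradient with a randomized variance-reduced estimator and the Euclidean distance with a general Bregman distance is the direction in which the paper's method generalizes this sketch.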

Keywords

Acceleration techniques · Mirror descent method · Inexact proximal point · Composite optimization

Mathematics Subject Classification

65K05 · 90C06 · 90C30

Acknowledgements

We are grateful to the anonymous reviewers and the Editor-in-Chief for their meticulous comments and insightful suggestions. Le Thi Khanh Hien would like to give special thanks to Prof. W. B. Haskell for his support. Le Thi Khanh Hien was supported by Grant A*STAR 1421200078.

References

  1. Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27(2), 543–547 (1983)
  2. Nesterov, Y.: On an approach to the construction of optimal methods of minimization of smooth convex functions. Ekonom. i Mat. Metody 24, 509–517 (1988)
  3. Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)
  4. Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)
  5. Becker, S., Bobin, J., Candès, E.J.: NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011)
  6. d'Aspremont, A., Banerjee, O., Ghaoui, L.E.: First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30(1), 56–66 (2008)
  7. Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16(3), 697–725 (2006)
  8. Tseng, P.: On accelerated proximal gradient methods for convex–concave optimization. Technical report (2008)
  9. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
  10. Roux, N.L., Schmidt, M., Bach, F.R.: A stochastic gradient method with an exponential convergence rate for finite training sets. In: Advances in Neural Information Processing Systems, pp. 2663–2671 (2012)
  11. Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. In: Advances in Neural Information Processing Systems, pp. 315–323 (2013)
  12. Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24, 2057–2075 (2014)
  13. Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: a novel method for machine learning problems using stochastic recursive gradient. In: International Conference on Machine Learning, pp. 2613–2621 (2017)
  14. Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25(4), 1997–2023 (2015)
  15. Lin, H., Mairal, J., Harchaoui, Z.: A universal catalyst for first-order optimization. In: Advances in Neural Information Processing Systems, pp. 3384–3392 (2015)
  16. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2014)
  17. Cai, J.F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
  18. Fadili, J.M., Peyre, G.: Total variation projection with first order schemes. IEEE Trans. Image Process. 20(3), 657–669 (2011)
  19. Ma, S., Goldfarb, D., Chen, L.: Fixed point and Bregman iterative methods for matrix rank minimization. Math. Program. 128(1), 321–353 (2011)
  20. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)
  21. Devolder, O., Glineur, F., Nesterov, Y.: First-order methods of smooth convex optimization with inexact oracle. Math. Program. 146(1), 37–75 (2014)
  22. Schmidt, M., Roux, N.L., Bach, F.R.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: Advances in Neural Information Processing Systems, pp. 1458–1466 (2011)
  23. Solodov, M., Svaiter, B.: Error bounds for proximal point subproblems and associated inexact proximal point algorithms. Math. Program. 88(2), 371–389 (2000)
  24. Villa, S., Salzo, S., Baldassarre, L., Verri, A.: Accelerated and inexact forward–backward algorithms. SIAM J. Optim. 23(3), 1607–1633 (2013)
  25. Allen-Zhu, Z.: Katyusha: the first direct acceleration of stochastic gradient methods. In: ACM SIGACT Symposium on Theory of Computing (2017)
  26. Bregman, L.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)
  27. Teboulle, M.: Convergence of proximal-like algorithms. SIAM J. Optim. 7(4), 1069–1083 (1997)
  28. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Dordrecht (2004)
  29. Auslender, A.: Numerical Methods for Nondifferentiable Convex Optimization, pp. 102–126. Springer, Berlin (1987)
  30. Lee, Y.J., Mangasarian, O.: SSVM: a smooth support vector machine for classification. Comput. Optim. Appl. 20(1), 5–22 (2001)
  31. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
  32. Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. In: Advances in Neural Information Processing Systems, pp. 1646–1654 (2014)
  33. Fan, R.E., Lin, C.J.: LIBSVM Data: Classification, Regression and Multi-Label. http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets (2011). Accessed 01 April 2018
  34. Jacob, L., Obozinski, G., Vert, J.P.: Group Lasso with overlap and graph Lasso. In: International Conference on Machine Learning, pp. 433–440 (2009)
  35. Mosci, S., Villa, S., Verri, A., Rosasco, L.: A primal–dual algorithm for group sparse regularization with overlapping groups. In: Advances in Neural Information Processing Systems, pp. 2604–2612 (2010)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. National University of Singapore, Singapore, Singapore
  2. University of Cambridge, Cambridge, UK
  3. Georgia Institute of Technology, Atlanta, USA
  4. Carnegie Mellon University, Pittsburgh, USA
