
Convergence rates of accelerated proximal gradient algorithms under independent noise

  • Original Paper

Abstract

We consider an accelerated proximal gradient algorithm for composite optimization with “independent errors” (errors that are largely unrelated to the historical information of the iterates), applied to linear inverse problems. We present a new inexact version of the FISTA algorithm that accounts for both deterministic and stochastic noise. We prove several convergence rates for the algorithm and connect it with the existing catalyst framework used to accelerate many machine-learning algorithms. In particular, we show that the catalyst can be regarded as a special case of the FISTA algorithm in which the smooth part of the objective vanishes. Our framework thus gives a more generic formulation that yields convergence results for the deterministic and stochastic noise cases as well as for the catalyst framework. Some of our results provide simpler alternative analyses of existing results in the literature, while others extend them to more general settings.
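The abstract does not spell out the iteration, so the sketch below is only an orientation aid: a standard FISTA-style update for a lasso model of a linear inverse problem in which both the gradient and the proximal step are perturbed by noise that is independent of the iterate history. The lasso objective, the soft-thresholding prox, the Gaussian perturbations, and the names inexact_fista, grad_noise, and prox_noise are illustrative assumptions, not the paper's exact algorithm or error model.

```python
import numpy as np

# Minimal sketch of FISTA with "independent" perturbations (illustrative only):
# the gradient and the proximal map are computed only approximately, modeled
# here by additive Gaussian errors.
# Problem: min_x 0.5*||A x - b||^2 + lam*||x||_1  (a standard linear inverse problem).

def soft_threshold(z, tau):
    """Proximal operator of tau*||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def inexact_fista(A, b, lam, n_iter=500, grad_noise=1e-3, prox_noise=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    step = 1.0 / L
    x = x_prev = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        grad += grad_noise * rng.standard_normal(grad.shape)   # noisy gradient oracle
        x = soft_threshold(y - step * grad, step * lam)
        x += prox_noise * rng.standard_normal(x.shape)         # inexact proximal step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0     # Nesterov momentum weight
        y = x + ((t - 1.0) / t_next) * (x - x_prev)
        x_prev, t = x, t_next
    return x

# Tiny synthetic usage example.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = A @ x_true
x_hat = inexact_fista(A, b, lam=0.1)
```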



Acknowledgments

We thank the anonymous referees for their suggestions, which improved the paper. H.J. and L.C. have been supported by the National Natural Science Foundation of China (No. 61402495) and the Natural Science Foundation of Hunan Province, China (2018JJ3616). R.B. has been supported by the Spanish Research Project MTM2015-64095-P. T.S. has been supported by the National Natural Science Foundation of China (No. 61571008).

Author information


Corresponding author

Correspondence to Tao Sun.


About this article


Cite this article

Sun, T., Barrio, R., Jiang, H. et al. Convergence rates of accelerated proximal gradient algorithms under independent noise. Numer Algor 81, 631–654 (2019). https://doi.org/10.1007/s11075-018-0565-4


