
Proximal Gradient Methods for Machine Learning and Imaging

Part of the Applied and Numerical Harmonic Analysis book series (ANHA)

Abstract

Convex optimization plays a key role in data science. The objective of this work is to provide the basic tools and methods at the core of modern nonlinear convex optimization. Starting from the gradient descent method, we focus on a comprehensive convergence analysis of the proximal gradient algorithm and its state-of-the-art variants, including accelerated, stochastic, and block-wise implementations, which are nowadays popular techniques for solving machine learning and inverse problems.
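To fix ideas, the sketch below illustrates the basic proximal gradient (forward-backward) iteration on an \(\ell _1\)-regularized least-squares problem. It is a minimal NumPy illustration under assumptions of our own (the problem data, the function names, and the constant step size \(1/L\) are chosen only for the example), not code taken from the chapter.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(A, b, lam, n_iter=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by forward-backward splitting.

    The smooth term has a Lipschitz-continuous gradient with constant
    L = ||A||_2^2, so the constant step size 1/L ensures convergence
    of the objective values.
    """
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)            # forward (gradient) step on the smooth term
        x = soft_threshold(x - step * grad, step * lam)  # backward (proximal) step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 100))
    x_true = np.zeros(100)
    x_true[:5] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(50)
    x_hat = proximal_gradient(A, b, lam=0.1)
    print("nonzero entries recovered:", int(np.sum(np.abs(x_hat) > 1e-3)))
```

Each iteration combines a gradient step on the smooth term with the proximity operator of the nonsmooth term; the accelerated, stochastic, and block-wise variants analyzed in the chapter modify how this basic step is extrapolated, sampled, or applied coordinate-wise.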

Notes

  1. Note that if \(\inf \Phi = -\infty \), it follows from (18) that \(\inf \Phi =\sup (-\Psi )= - \inf \Psi =-\infty \). In this case, \(\Psi \equiv +\infty \) and the sum \(\inf \Phi + \inf \Psi = -\infty + \infty \) is not defined. However, since there is no duality gap between \(\Phi \) and \(-\Psi \), we set \(\inf \Phi + \inf \Psi = 0\) by convention. The same situation occurs if \(\inf \Psi =-\infty \).
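     For concreteness, and assuming that (18) is the weak duality inequality between the primal objective \(\Phi \) and the dual \(-\Psi \) (an assumption on our part, since (18) is not reproduced here), the convention can be summarized as

     \[ \inf \Phi \;\ge \; \sup (-\Psi ) \;=\; -\inf \Psi , \qquad \text{duality gap} \;=\; \inf \Phi + \inf \Psi \;\ge \; 0, \]

     with the gap set to \(0\) by convention whenever \(\inf \Phi =-\infty \) or \(\inf \Psi =-\infty \).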

Acknowledgements

The work of S. Villa has been supported by the ITN-ETN project TraDE-OPT funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 861137 and by the project “Processi evolutivi con memoria descrivibili tramite equazioni integro-differenziali” funded by the Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM).

Author information

Corresponding author

Correspondence to Saverio Salzo.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Salzo, S., Villa, S. (2021). Proximal Gradient Methods for Machine Learning and Imaging. In: De Mari, F., De Vito, E. (eds) Harmonic and Applied Analysis. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-030-86664-8_4
