Abstract
In this work, we generalize and unify two recent and quite different lines of work, by Sohl-Dickstein et al. [10] and Lee et al. [2], by proposing the proximal stochastic Newton-type gradient (PROXTONE) method for minimizing the sum of two convex functions: the average of a large number of smooth convex functions, and a nonsmooth convex function. PROXTONE incorporates second-order information to obtain stronger convergence results, in that it achieves a linear convergence rate not only in the value of the objective function but also for the solution. The proofs are simple and intuitive, and the results and techniques can serve as a starting point for research on proximal stochastic methods that employ second-order information. The methods and principles proposed in this paper can be applied to logistic regression, the training of deep neural networks, and similar problems. Our numerical experiments show that PROXTONE achieves better computational performance than existing methods.
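To make the problem setting concrete, the following is a minimal, hypothetical sketch of a proximal stochastic Newton-type iteration for l1-regularized logistic regression, i.e. minimizing (1/n) Σ_i f_i(x) + λ‖x‖₁. It is not the exact PROXTONE update described in the paper body: it assumes a diagonal Hessian approximation so that the proximal subproblem has a closed-form soft-thresholding solution, and all function and variable names are illustrative.

```python
# Illustrative sketch (NOT the authors' exact PROXTONE algorithm): a proximal
# stochastic Newton-type method for l1-regularized logistic regression,
#   min_x (1/n) * sum_i f_i(x) + lam * ||x||_1,
# where f_i(x) = log(1 + exp(-b_i * a_i^T x)).
# A diagonal Hessian approximation is assumed so the proximal subproblem
# separates per coordinate and reduces to soft-thresholding.
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def prox_stochastic_newton(A, b, lam, n_iters=5000, seed=0):
    """A: (n, d) data matrix; b: labels in {-1, +1}; lam: l1 weight."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grads = np.zeros((n, d))          # stored per-component gradients
    hdiags = np.full((n, d), 1e-3)    # stored per-component diagonal Hessians
    for _ in range(n_iters):
        i = rng.integers(n)
        margin = b[i] * (A[i] @ x)
        sigma = 1.0 / (1.0 + np.exp(margin))        # logistic derivative term
        grads[i] = -b[i] * sigma * A[i]             # refresh gradient of f_i
        hdiags[i] = np.maximum(sigma * (1.0 - sigma) * A[i] ** 2, 1e-3)
        g = grads.mean(axis=0)                      # aggregated gradient
        H = hdiags.mean(axis=0)                     # aggregated diagonal Hessian
        # Proximal Newton-type step: minimize the quadratic model plus lam*||.||_1,
        #   argmin_y  g^T (y - x) + 0.5 (y - x)^T diag(H) (y - x) + lam * ||y||_1
        x = soft_threshold(x - g / H, lam / H)
    return x


if __name__ == "__main__":
    # Tiny synthetic demo on a sparse ground-truth model.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 20))
    x_true = np.zeros(20)
    x_true[:3] = [1.0, -2.0, 0.5]
    b = np.sign(A @ x_true + 0.1 * rng.standard_normal(200))
    print(prox_stochastic_newton(A, b, lam=0.01)[:5])
```

Because the surrogate Hessian is diagonal, each proximal subproblem is solved exactly in closed form; a full (non-diagonal) second-order model would instead require an inner solver for the regularized quadratic, as in proximal Newton-type methods [2].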
References
Bertsekas, D.P.: Incremental gradient, subgradient, and proximal methods for convex optimization: a survey. Optimization for Machine Learning 2010, 1–38 (2011)
Lee, J., Sun, Y., Saunders, M.: Proximal Newton-type methods for convex optimization. In: Advances in Neural Information Processing Systems, pp. 836–844 (2012)
Mairal, J.: Optimization with first-order surrogate functions. arXiv preprint arXiv:1305.3120 (2013)
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization 22(2), 341–362 (2012)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Boston (2004)
Parikh, N., Boyd, S.: Proximal algorithms. Foundations and Trends in Optimization 1(3), 123–231 (2013)
Schmidt, M., Roux, N.L., Bach, F.: Minimizing finite sums with the stochastic average gradient. arXiv preprint arXiv:1309.2388 (2013)
Shalev-Shwartz, S., Zhang, T.: Proximal stochastic dual coordinate ascent. arXiv preprint arXiv:1211.2717 (2012)
Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss. The Journal of Machine Learning Research 14(1), 567–599 (2013)
Sohl-Dickstein, J., Poole, B., Ganguli, S.: Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods. In: Proceedings of the 31st International Conference on Machine Learning (ICML 2014), pp. 604–612 (2014)
Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. arXiv preprint arXiv:1403.4699 (2014)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Shi, Z., Liu, R. (2015). Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science(), vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8
eBook Packages: Computer Science (R0)