Abstract
We describe a new, simplified, and general analysis of a fusion of Nesterov's accelerated gradient with parallel coordinate descent. The resulting algorithm, which we call BOOM, for boosting with momentum, enjoys the merits of both techniques: BOOM retains the momentum and convergence properties of the accelerated gradient method while taking into account the curvature of the objective function. We describe a distributed implementation of BOOM that is suitable for massive, high-dimensional datasets. We show experimentally that BOOM is especially effective in large-scale learning problems with rare yet informative features.
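The abstract describes the fusion only at a high level; the algorithm itself appears in the chapter body, which is not reproduced here. Purely as an illustration of the idea, the Python sketch below runs a Nesterov-style momentum sequence in which all coordinates are stepped in parallel, each with its own curvature-based scale. Everything concrete in it is an assumption: the least-squares objective, the crude diagonal curvature bound, and all function names are illustrative stand-ins, not BOOM's actual update.

    import numpy as np

    def accelerated_coordinate_sketch(A, b, num_iters=2000):
        """Hypothetical sketch: Nesterov-style acceleration with per-coordinate
        step sizes, applied to least squares f(w) = 0.5 * ||A w - b||^2."""
        n, d = A.shape
        # Crude but safe diagonal curvature bound: for any PSD matrix M,
        # x' M x <= d * sum_j M_jj * x_j**2, so d * diag(A'A) majorizes A'A.
        L = d * (A ** 2).sum(axis=0) + 1e-12
        w = np.zeros(d)  # primal iterate
        z = np.zeros(d)  # momentum (aggregated-gradient) iterate
        for t in range(num_iters):
            alpha = 2.0 / (t + 2)              # standard accelerated-gradient weights
            y = (1.0 - alpha) * w + alpha * z  # look-ahead point
            g = A.T @ (A @ y - b)              # full gradient at y
            z = z - g / (alpha * L)            # momentum step, scaled per coordinate
            w = (1.0 - alpha) * w + alpha * z  # algebraically equals y - g / L:
                                               # a fully parallel coordinate step
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 20))
        w_true = rng.standard_normal(20)
        b = A @ w_true
        w_hat = accelerated_coordinate_sketch(A, b)
        print("parameter error:", np.linalg.norm(w_hat - w_true))

The per-coordinate scales suggest how curvature-aware steps connect to the abstract's closing claim: a rare feature contributes to few rows of A, so its bound L_j is small and its coordinate takes correspondingly large steps, whereas a single uniform step size would be dictated by the densest feature.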