Distributed Newton Methods for Regularized Logistic Regression

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Regularized logistic regression is a very useful classification method, but for large-scale data, its distributed training has not been investigated much. In this work, we propose a distributed Newton method for training logistic regression. Many interesting techniques are discussed for reducing the communication cost and speeding up the computation. Experiments show that the proposed method is competitive with or even faster than state-of-the-art approaches such as Alternating Direction Method of Multipliers (ADMM) and Vowpal Wabbit (VW). We have released an MPI-based implementation for public use.
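
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the released MPI implementation. It assumes L2 regularization with objective f(w) = (1/2) w^T w + C * sum_i log(1 + exp(-y_i w^T x_i)), assumes the training instances are split across nodes, and replaces the MPI allreduce with an in-process sum over partitions. All function names are hypothetical. Each node contributes a local gradient and a local Hessian-vector product, and the Newton direction is obtained by conjugate gradient without ever forming the Hessian.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def allreduce_sum(vectors):
    # Stand-in for an MPI allreduce: element-wise sum of per-node vectors.
    return [sum(col) for col in zip(*vectors)]

def local_grad(part, w):
    # Loss-gradient contribution of one node's instances (x, y), y in {-1, +1}.
    g = [0.0] * len(w)
    for x, y in part:
        coef = (sigmoid(y * dot(w, x)) - 1.0) * y
        for j, xj in enumerate(x):
            g[j] += coef * xj
    return g

def local_hess_vec(part, w, v):
    # One node's share of X^T D X v, computed instance by instance.
    hv = [0.0] * len(w)
    for x, y in part:
        s = sigmoid(y * dot(w, x))
        coef = s * (1.0 - s) * dot(x, v)
        for j, xj in enumerate(x):
            hv[j] += coef * xj
    return hv

def gradient(parts, w, C):
    total = allreduce_sum([local_grad(p, w) for p in parts])
    return [wj + C * gj for wj, gj in zip(w, total)]

def hess_vec(parts, w, v, C):
    total = allreduce_sum([local_hess_vec(p, w, v) for p in parts])
    return [vj + C * hj for vj, hj in zip(v, total)]

def newton_direction(parts, w, g, C, rel_tol=1e-3, max_iter=50):
    # Conjugate gradient on H d = -g using only Hessian-vector products.
    d = [0.0] * len(w)
    r = [-gj for gj in g]
    p = list(r)
    rr = dot(r, r)
    stop = (rel_tol ** 2) * rr
    for _ in range(max_iter):
        if rr <= stop:
            break
        hp = hess_vec(parts, w, p, C)
        alpha = rr / dot(p, hp)
        d = [dj + alpha * pj for dj, pj in zip(d, p)]
        r = [rj - alpha * hj for rj, hj in zip(r, hp)]
        rr_new = dot(r, r)
        p = [rj + (rr_new / rr) * pj for rj, pj in zip(r, p)]
        rr = rr_new
    return d

def train(parts, dim, C=1.0, iters=20):
    # Plain Newton iterations; no line search or trust region in this sketch.
    w = [0.0] * dim
    for _ in range(iters):
        g = gradient(parts, w, C)
        if math.sqrt(dot(g, g)) < 1e-8:
            break
        d = newton_direction(parts, w, g, C)
        w = [wj + dj for wj, dj in zip(w, d)]
    return w

# Toy data: two "nodes", each holding two labeled instances.
parts = [
    [([1.0, 2.0], 1), ([2.0, 1.0], 1)],
    [([-1.0, -1.5], -1), ([0.5, -0.5], -1)],
]
w = train(parts, dim=2)
print(w)
```

In this sketch each conjugate gradient step needs one reduction of a vector whose length equals the number of features, which is where communication cost would arise in a real multi-node setting. A practical solver would also add a line search or trust region to safeguard the Newton step.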

References

  • Agarwal, A., Chapelle, O., Dudik, M., Langford, J.: A reliable effective terascale linear learning system. JMLR 15, 1111–1133 (2014)

  • Bian, Y., Li, X., Cao, M., Liu, Y.: Bundle CDN: A highly parallelized approach for large-scale l1-regularized logistic regression. In: ECML/PKDD

  • Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. and Trend. in ML 3(1), 1–122 (2011)

  • Bradley, J.K., Kyrola, A., Bickson, D., Guestrin, C.: Parallel coordinate descent for l1-regularized loss minimization. In: ICML

  • Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. JMLR 12, 2121–2159 (2011)

  • Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., Lin, C.-J.: LIBLINEAR: A library for large linear classification. JMLR 9, 1871–1874 (2008)

  • Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J.J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, concept, and design of a next generation MPI implementation. In: European PVM/MPI Users’ Group Meeting, pp. 97–104 (2004)

  • Keerthi, S.S., DeCoste, D.: A modified finite Newton method for fast solution of large scale linear SVMs. JMLR 6, 341–361 (2005)

  • Langford, J., Li, L., Strehl, A.: Vowpal Wabbit (2007). https://github.com/JohnLangford/vowpal_wabbit/wiki

  • Lin, C.-J., Moré, J.J.: Newton’s method for large-scale bound constrained problems. SIAM J. Optim. 9, 1100–1127 (1999)

  • Lin, C.-J., Weng, R.C., Keerthi, S.S.: Trust region Newton method for large-scale logistic regression. JMLR 9, 627–650 (2008)

  • Lin, C.-Y., Tsai, C.-H., Lee, C.-P., Lin, C.-J.: Large-scale logistic regression and linear support vector machines using Spark. In: IEEE BigData (2014)

  • Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1), 503–528 (1989)

  • Richtárik, P., Takáč, M.: Parallel coordinate descent methods for big data optimization. Math. Program. (2012) (Under revision)

  • Snir, M., Otto, S.: MPI-The Complete Reference: The MPI Core. MIT Press, Cambridge (1998)

  • Steihaug, T.: The conjugate gradient method and trust regions in large scale optimization. SIAM J. on Num. Ana. 20, 626–637 (1983)

  • Yu, H.-F., Huang, F.-L., Lin, C.-J.: Dual coordinate descent methods for logistic regression and maximum entropy models. MLJ 85, 41–75 (2011)

  • Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., Lin, C.-J.: A comparison of optimization methods and software for large-scale l1-regularized linear classification. JMLR 11, 3183–3234 (2010)

  • Zhang, C., Lee, H., Shin, K.G.: Efficient distributed linear classification algorithms via the alternating direction method of multipliers. In: AISTATS (2012)

  • Zinkevich, M., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS (2010)

Author information

Corresponding author

Correspondence to Chih-Jen Lin.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhuang, Y., Chin, WS., Juan, YC., Lin, CJ. (2015). Distributed Newton Methods for Regularized Logistic Regression. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_54

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
