Distributed Coordinate Descent for L1-regularized Logistic Regression

  • Ilya Trofimov
  • Alexander Genkin
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 542)

Abstract

Logistic regression is a widely used technique for solving classification and class probability estimation problems in text mining, biometrics, and clickstream data analysis. Solving logistic regression with L1-regularization in a distributed setting is an important problem that arises when the training dataset is too large to fit in the memory of a single machine. We present d-GLMNET, a new algorithm for solving L1-regularized logistic regression in a distributed setting. We empirically show that it outperforms distributed online learning via truncated gradient.
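
The distributed d-GLMNET algorithm itself is described in the paper; purely for orientation, the sketch below shows a minimal single-machine coordinate descent for the same objective, sum_i log(1 + exp(-y_i x_i^T w)) + lambda ||w||_1, using a one-dimensional Newton step followed by soft-thresholding on each coordinate. This is not the authors' implementation; the function name cd_l1_logreg and all parameter choices are illustrative assumptions.

    import numpy as np

    def soft_threshold(z, t):
        """Soft-thresholding operator: the proximal map of the L1 penalty."""
        return np.sign(z) * max(abs(z) - t, 0.0)

    def cd_l1_logreg(X, y, lam, n_epochs=50):
        """Cyclic coordinate descent for L1-regularized logistic regression.

        Minimizes  sum_i log(1 + exp(-y_i * (x_i @ w))) + lam * ||w||_1
        with labels y in {-1, +1}. Each coordinate update takes a
        one-dimensional Newton step and then applies the L1 prox.
        (Illustrative sketch only, not the paper's d-GLMNET.)
        """
        n, d = X.shape
        w = np.zeros(d)
        margins = np.zeros(n)  # cached values of x_i @ w for all i
        for _ in range(n_epochs):
            for j in range(d):
                # p_i = sigma(-y_i * margin_i), the per-example loss gradient factor
                p = 1.0 / (1.0 + np.exp(y * margins))
                grad = -np.dot(X[:, j], y * p)                 # d(loss)/dw_j
                hess = np.dot(X[:, j] ** 2, p * (1 - p)) + 1e-12
                # Newton step on coordinate j, then soft-threshold by lam/hess
                w_j_new = soft_threshold(w[j] - grad / hess, lam / hess)
                delta = w_j_new - w[j]
                if delta != 0.0:
                    margins += delta * X[:, j]  # keep the margin cache in sync
                    w[j] = w_j_new
        return w

Larger values of lam drive more coordinates exactly to zero, which is the sparsity effect the keywords below refer to.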

Keywords

Large-scale learning · Logistic regression · L1-regularization · Sparsity

Notes

Acknowledgments

We would like to thank John Langford for his advice on Vowpal Wabbit and Ilya Muchnik for his continuous support.

References

  1. Yuan, G.-X., Ho, C.-H., Lin, C.-J.: Recent advances of large-scale linear classification. Proc. IEEE 100(9), 2584–2603 (2012)
  2. Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., Lin, C.-J.: A comparison of optimization methods and software for large-scale L1-regularized linear classification. J. Mach. Learn. Res. 11, 3183–3234 (2010)
  3. Genkin, A., Lewis, D.D., Madigan, D.: Large-scale Bayesian logistic regression for text categorization. Technometrics 49(3), 291–304 (2007)
  4. Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
  5. Yuan, G.-X., Ho, C.-H., Hsieh, C.-J., Lin, C.-J.: An improved GLMNET for L1-regularized logistic regression. J. Mach. Learn. Res. 13, 1999–2030 (2012)
  6. Balakrishnan, S., Madigan, D.: Algorithms for sparse linear classifiers in the massive data setting. J. Mach. Learn. Res. 1, 1–26 (2007)
  7. Langford, J., Li, L., Zhang, T.: Sparse online learning via truncated gradient. J. Mach. Learn. Res. 10, 777–801 (2009)
  8. McMahan, H.B.: Follow-the-regularized-leader and mirror descent: equivalence theorems and L1 regularization. In: AISTATS 2011 (2011)
  9. Agarwal, A., Chapelle, O., Dudík, M., Langford, J.: A reliable effective terascale linear learning system. Technical report (2011). http://arxiv.org/abs/1110.4198
  10. Peng, Z., Yan, M., Yin, W.: Parallel and distributed sparse optimization. In: STATOS 2013 (2013)
  11. Bradley, J.K., Kyrola, A., Bickson, D., Guestrin, C.: Parallel coordinate descent for L1-regularized loss minimization. In: ICML 2011, Bellevue, WA, USA (2011)
  12. Ho, Q., Cipar, J., Cui, H., Kim, J.K., Lee, S., Gibbons, P.B., Gibson, G.A., Ganger, G.R., Xing, E.P.: More effective distributed ML via a stale synchronous parallel parameter server. In: NIPS 2013 (2013)
  13. Richtárik, P., Takáč, M.: Parallel coordinate descent methods for big data optimization. Technical report (2012). http://arxiv.org/abs/1212.0873
  14. Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009)
  15. Zinkevich, M., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS 2010 (2010)
  16. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI 2004, San Francisco (2004)
  17. Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: GraphLab: a new framework for parallel machine learning. In: UAI 2010, Catalina Island, California (2010)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Yandex, Moscow, Russia
  2. AVG Consulting, Brooklyn, USA