Computational Management Science, Volume 8, Issue 4, pp 415–428

Kernel logistic regression using truncated Newton method

  • Maher Maalouf
  • Theodore B. Trafalis
  • Indra Adrianto
Original Paper


Kernel logistic regression (KLR) is a powerful nonlinear classifier. Combining KLR with the truncated-regularized iteratively re-weighted least-squares (TR-IRLS) algorithm yields a powerful classification method for small-to-medium size data sets. This algorithm is called truncated-regularized kernel logistic regression (TR-KLR). Compared with support vector machines (SVM) and TR-IRLS on twelve publicly available benchmark data sets, the proposed TR-KLR algorithm is as accurate as, and much faster than, SVM, and more accurate than TR-IRLS. The TR-KLR algorithm also has the advantage of providing direct prediction probabilities.
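The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard regularized KLR objective with an RBF kernel, and it solves each Newton (IRLS) system only approximately with a capped number of conjugate-gradient iterations. The names `rbf_kernel`, `truncated_cg`, and `fit_klr`, and the parameters `gamma`, `lam`, `newton_iters`, and `cg_iters`, are hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def truncated_cg(matvec, b, max_iter=30, tol=1e-6):
    # Conjugate gradient for a symmetric PSD system, stopped early
    # ("truncated") after max_iter iterations or a relative residual of tol.
    x = np.zeros_like(b)
    r = b.copy()
    d = r.copy()
    rs = r @ r
    b_norm = np.sqrt(rs)
    if b_norm == 0.0:
        return x
    for _ in range(max_iter):
        Ad = matvec(d)
        dAd = d @ Ad
        if dAd <= 0.0:  # numerically flat direction: stop
            break
        step = rs / dAd
        x += step * d
        r -= step * Ad
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x

def fit_klr(K, y, lam=0.1, newton_iters=20, cg_iters=30):
    # Assumed objective: negative log-likelihood + (lam / 2) * alpha^T K alpha,
    # with labels y in {0, 1} and predicted probability sigmoid(K @ alpha).
    alpha = np.zeros(K.shape[0])
    for _ in range(newton_iters):
        f = np.clip(K @ alpha, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-f))
        v = p * (1.0 - p)                    # IRLS weights
        grad = K @ (p - y + lam * alpha)
        if np.linalg.norm(grad) < 1e-6:
            break
        # Hessian-vector product for H = K diag(v) K + lam * K.
        hess_mv = lambda d: K @ (v * (K @ d)) + lam * (K @ d)
        alpha += truncated_cg(hess_mv, -grad, max_iter=cg_iters)
    return alpha

# Toy data (synthetic, for illustration only): two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (40, 2)), rng.normal(2.0, 1.0, (40, 2))])
y = np.concatenate([np.zeros(40), np.ones(40)])

K = rbf_kernel(X, X, gamma=0.5)
alpha = fit_klr(K, y)
probs = 1.0 / (1.0 + np.exp(-np.clip(K @ alpha, -30.0, 30.0)))
```

The vector `probs` holds the fitted class-1 probabilities for the training points, which reflects the "direct prediction probabilities" property mentioned in the abstract. The sketch takes full Newton steps without a line search; a more careful implementation would likely add a safeguard for poorly conditioned problems.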


Keywords: Classification, Logistic regression, Kernel methods, Truncated Newton method





Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Maher Maalouf (1), corresponding author
  • Theodore B. Trafalis (1)
  • Indra Adrianto (1)
  1. School of Industrial Engineering, University of Oklahoma, Norman, USA
