Machine Learning, Volume 61, Issue 1–3, pp 151–165

A Fast Dual Algorithm for Kernel Logistic Regression

  • S. S. Keerthi
  • K. B. Duan
  • S. K. Shevade
  • A. N. Poo


This paper gives a new iterative algorithm for kernel logistic regression. It is based on the solution of a dual problem using ideas similar to those of the Sequential Minimal Optimization algorithm for Support Vector Machines. Asymptotic convergence of the algorithm is proved. Computational experiments show that the algorithm is robust and fast. The algorithmic ideas can also be used to give a fast dual algorithm for solving the optimization problem arising in the inner loop of Gaussian Process classifiers.
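To make the idea of an SMO-style dual method concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the standard dual of regularized kernel logistic regression: minimize 0.5 * sum_ij alpha_i alpha_j y_i y_j K_ij + sum_i [alpha_i log alpha_i + (C - alpha_i) log(C - alpha_i)] subject to sum_i y_i alpha_i = 0 and 0 < alpha_i < C. Several things are illustrative assumptions rather than details from the abstract: the names (`klr_dual_pairwise`, `h_values`, `rbf_kernel`), the Gaussian kernel, the starting point, the maximal-violating-pair selection, and the bisection line search used in place of whatever one-dimensional solver the paper uses.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X * X, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)

def dual_objective(alpha, y, K, C):
    """Assumed dual objective: quadratic term plus an entropy-like term."""
    v = alpha * y
    entropy = alpha * np.log(alpha) + (C - alpha) * np.log(C - alpha)
    return 0.5 * v @ K @ v + entropy.sum()

def h_values(alpha, y, K, C):
    """H_i = y_i * (d objective / d alpha_i); all H_i coincide at the optimum."""
    return K @ (alpha * y) + y * np.log(alpha / (C - alpha))

def t_interval(a, s, C):
    """Open interval of t that keeps a + s*t inside (0, C), for s = +1 or -1."""
    return (-a, C - a) if s > 0 else (a - C, a)

def klr_dual_pairwise(K, y, C=1.0, tol=1e-6, max_iter=20000):
    """Two-variable (SMO-style) descent; y holds +1/-1 labels as floats."""
    n = len(y)
    n_pos, n_neg = np.sum(y > 0), np.sum(y < 0)
    # Strictly interior start that satisfies sum_i y_i * alpha_i = 0.
    alpha = np.where(y > 0, C * n_neg / n, C * n_pos / n).astype(float)
    for _ in range(max_iter):
        F = K @ (alpha * y)
        H = F + y * np.log(alpha / (C - alpha))
        i, j = int(np.argmax(H)), int(np.argmin(H))  # most violating pair
        if H[i] - H[j] < tol:
            break
        eta = K[i, i] + K[j, j] - 2.0 * K[i, j]

        def slope(t):
            # Derivative of the objective along the move
            # alpha_i += t*y_i, alpha_j -= t*y_j (keeps the equality constraint).
            ai, aj = alpha[i] + t * y[i], alpha[j] - t * y[j]
            return (F[i] - F[j] + t * eta
                    + y[i] * np.log(ai / (C - ai))
                    - y[j] * np.log(aj / (C - aj)))

        lo_i, hi_i = t_interval(alpha[i], y[i], C)
        lo_j, hi_j = t_interval(alpha[j], -y[j], C)
        lo, hi = max(lo_i, lo_j), min(hi_i, hi_j)
        # The slope increases monotonically from -inf to +inf on (lo, hi),
        # so bisection finds the exact one-dimensional minimizer.
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if slope(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        t = 0.5 * (lo + hi)
        alpha[i] += t * y[i]
        alpha[j] -= t * y[j]
    H = h_values(alpha, y, K, C)
    b = 0.5 * (H.max() + H.min())  # common value of H_i at the optimum
    return alpha, b, H.max() - H.min()
```

Under these assumptions, the fitted decision value at a training point is `K @ (alpha * y) - b`, and passing it through the logistic function gives the estimated probability of the positive class. The sketch recomputes `K @ (alpha * y)` on every iteration for clarity, and it has no safeguards for variables that drift extremely close to 0 or C. A practical solver would cache and update these quantities incrementally.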


Keywords: classification, logistic regression, kernel methods, SMO algorithm



Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • S. S. Keerthi: Yahoo! Research Labs, Pasadena, USA
  • K. B. Duan: Control Division, Department of Mechanical Engineering, National University of Singapore, Singapore
  • S. K. Shevade: Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
  • A. N. Poo: Control Division, Department of Mechanical Engineering, National University of Singapore, Singapore
