Discriminative vs. Generative Classifiers for Cost Sensitive Learning

  • Chris Drummond
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4013)


This paper experimentally compares the performance of discriminative and generative classifiers for cost-sensitive learning. There is some evidence that learning a discriminative classifier is more effective for a traditional classification task. This paper explores the advantages and disadvantages of using a generative classifier when the misclassification costs and class frequencies are not fixed. The experiments are built around commonly used algorithms modified to be cost sensitive, which allows a clear comparison with the same algorithm used to produce a discriminative classifier. The paper compares the performance of these variants over multiple data sets and over the full range of misclassification costs and class frequencies. It concludes that although some of these variants are better than a single discriminative classifier, the right choice of training set distribution plus careful calibration is needed to make them competitive with multiple discriminative classifiers.
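To make the setting concrete, here is a minimal Python sketch of why probability estimates matter when costs and class frequencies change. It is not the paper's experimental code. It assumes a binary problem and uses the textbook minimum-expected-cost decision rule, a standard prior-shift correction, and the usual cost-curve quantities (probability cost and normalized expected cost). All function and parameter names are hypothetical.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """Cost-curve x-axis: combines the positive class frequency with
    the two misclassification costs into one value in [0, 1]."""
    pos = p_pos * cost_fn            # weight of errors on positives
    neg = (1.0 - p_pos) * cost_fp    # weight of errors on negatives
    return pos / (pos + neg)


def normalized_expected_cost(fnr, fpr, pc_pos):
    """Cost-curve y-axis for a classifier with the given error rates."""
    return fnr * pc_pos + fpr * (1.0 - pc_pos)


def adjust_posterior(p_post, train_prior, deploy_prior):
    """Re-weight an estimate of P(+|x) learned under train_prior so
    that it reflects a different deployment class frequency."""
    pos = p_post * deploy_prior / train_prior
    neg = (1.0 - p_post) * (1.0 - deploy_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


def decide(p_post, cost_fn, cost_fp):
    """Predict positive when the expected cost of predicting negative
    exceeds the expected cost of predicting positive."""
    return p_post * cost_fn > (1.0 - p_post) * cost_fp


if __name__ == "__main__":
    p = 0.3  # hypothetical estimate of P(+|x) from a probabilistic model
    print(decide(p, cost_fn=1.0, cost_fp=1.0))   # equal costs
    print(decide(p, cost_fn=5.0, cost_fp=1.0))   # missed positives cost more
    print(adjust_posterior(p, train_prior=0.5, deploy_prior=0.1))
```

A classifier that outputs only a label is fixed to one operating point, so covering a range of costs means training one classifier per setting. A well-calibrated probability estimate can instead be re-thresholded as above, which is why calibration quality is central to the comparison.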


Support Vector Machine, Cost Curve, Expected Cost, Class Frequency, Decision Tree Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chris Drummond (1)
  1. Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada
