CSDTM: A Cost Sensitive Decision Tree Based Method

  • Walid Erray
  • Hakim Hacid
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4243)


Making a decision often has many results and repercussions, and these results do not carry the same importance depending on the phenomenon under consideration. This situation can be described by introducing the concept of cost into the learning process. In this article, we propose a method able to integrate costs into the automatic learning process. We focus on the misclassification cost and use decision trees as the supervised learning technique. Promising results are obtained with the proposed method.
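
A minimal sketch of the general idea (not the authors' CSDTM procedure, which is detailed in the paper): a common way to exploit a misclassification cost matrix in a decision tree is to label each leaf with the class that minimises the expected misclassification cost of the examples reaching it, rather than with the majority class. The function name cost_sensitive_label and the toy cost values below are illustrative assumptions.

    # Illustrative only: cost-sensitive leaf labelling with a cost matrix C,
    # where cost_matrix[i][j] is the cost of predicting class j when the true
    # class is i. The chosen label minimises the expected misclassification cost.

    def cost_sensitive_label(class_counts, cost_matrix):
        """Return the class index with the lowest expected misclassification cost.

        class_counts[i]   -- number of leaf examples whose true class is i
        cost_matrix[i][j] -- cost of predicting j when the true class is i
        """
        total = sum(class_counts)
        best_class, best_cost = None, float("inf")
        for j in range(len(class_counts)):               # candidate prediction j
            expected = sum(
                (class_counts[i] / total) * cost_matrix[i][j]
                for i in range(len(class_counts))
            )
            if expected < best_cost:
                best_class, best_cost = j, expected
        return best_class

    # Example: 30 negatives and 10 positives reach a leaf; a false negative
    # costs 5, a false positive costs 1. Majority voting predicts class 0,
    # but the cost-sensitive rule predicts class 1 because 30*1 < 10*5.
    print(cost_sensitive_label([30, 10], [[0, 1], [5, 0]]))   # -> 1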


Keywords: Decision Tree, Learning Process, Cost Matrix, Misclassification Cost, Minimum Object





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Walid Erray (1)
  • Hakim Hacid (1)
  1. ERIC Laboratory, Lyon 2 University, Bron, France
