Abstract
We study cost-sensitive learning of decision trees that incorporate both test costs and misclassification costs. We first propose a lazy decision tree learning algorithm that minimizes the total cost of tests and misclassifications. Then, assuming that test examples may contain unknown attributes whose values can be obtained at a cost (the test cost), we design several novel test strategies that attempt to minimize the total cost of tests and misclassifications for each test example. We empirically evaluate our tree-building algorithm and the various test strategies, and show that they are very effective. Our results can be readily applied to real-world diagnosis tasks, such as medical diagnosis, where doctors must decide which tests (e.g., blood tests) to order for a patient so as to minimize the total cost of tests and misclassifications (misdiagnoses). A case study on heart disease is used throughout the paper.
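The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or any of their test strategies. It only shows, for a binary problem, how the expected total cost of ordering one test (its test cost plus the expected misclassification cost afterwards) might be compared with the expected misclassification cost of predicting immediately. All function names, parameters, and numbers below are hypothetical and chosen for illustration.

```python
def cheapest_prediction(p_positive, fp_cost, fn_cost):
    """Return (label, expected misclassification cost) of the cheaper prediction.

    p_positive: estimated probability that the example is positive.
    fp_cost / fn_cost: assumed costs of a false positive / false negative.
    """
    cost_if_positive = (1.0 - p_positive) * fp_cost  # wrong when truly negative
    cost_if_negative = p_positive * fn_cost          # wrong when truly positive
    if cost_if_positive <= cost_if_negative:
        return "positive", cost_if_positive
    return "negative", cost_if_negative


def compare_with_test(p_positive, test_cost, outcomes, fp_cost, fn_cost):
    """Compare predicting now against paying for one test first.

    outcomes: list of (probability of this test outcome,
                       probability of positive given this outcome).
    Returns (order_test, cost_without_test, cost_with_test).
    """
    _, cost_without_test = cheapest_prediction(p_positive, fp_cost, fn_cost)
    # Expected total cost with the test: pay the test cost, then make the
    # cheapest prediction for whichever outcome is observed.
    cost_with_test = test_cost + sum(
        p_outcome * cheapest_prediction(p_pos, fp_cost, fn_cost)[1]
        for p_outcome, p_pos in outcomes
    )
    return cost_with_test < cost_without_test, cost_without_test, cost_with_test


if __name__ == "__main__":
    # Hypothetical numbers: a test costing 20 with two equally likely outcomes.
    decision = compare_with_test(
        p_positive=0.3,
        test_cost=20.0,
        outcomes=[(0.5, 0.55), (0.5, 0.05)],
        fp_cost=200.0,
        fn_cost=600.0,
    )
    print(decision)
```

With these assumed numbers the test is worth ordering, because the expected total cost with the test is lower than the expected cost of predicting without it. A more expensive or less informative test would reverse the decision.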
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sheng, S., Ling, C.X., Yang, Q. (2005). Simple Test Strategies for Cost-Sensitive Decision Trees. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_36
DOI: https://doi.org/10.1007/11564096_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29243-2
Online ISBN: 978-3-540-31692-3
eBook Packages: Computer Science, Computer Science (R0)