Any-Cost Discovery: Learning Optimal Classification Rules
Taking full account of the information that may be hidden in missing data, this paper proposes a new criterion for selecting split attributes when building a decision tree for a given dataset. In our approach, obtaining an attribute value incurs a test cost, and an incorrect prediction incurs a misclassification cost. We measure the two kinds of cost on different scales, rather than on the single shared scale used in previous work. We propose a new algorithm that builds a decision tree with a null-branch strategy to minimize the misclassification cost. When the user provides finite resources, we can make the best use of those resources while obtaining the best results the tree can deliver. We also consider discounts in test costs when groups of attributes are tested together. In addition, we offer advice on whether increasing the resources is worthwhile. Our results can be readily applied to real-world diagnosis tasks, such as medical diagnosis, where doctors must decide which tests to perform for a patient in order to minimize the misclassification cost within the available resources.
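To make the setting concrete, the following is a minimal sketch of one way a split attribute could be chosen when test costs are charged against a resource budget and misclassification costs are scored on a separate scale. It is an illustration only, not the criterion proposed in the paper, which the abstract does not spell out. The function names, the binary labels (1 = positive, 0 = negative), the `fp_cost`/`fn_cost` parameters, and the use of `None` to mark a missing value routed to a null branch are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def leaf_cost(labels, fp_cost, fn_cost):
    """Misclassification cost of the cheapest single prediction at a leaf."""
    counts = Counter(labels)
    pos, neg = counts[1], counts[0]
    # Predicting positive turns every negative into a false positive;
    # predicting negative turns every positive into a false negative.
    return min(neg * fp_cost, pos * fn_cost)


def split_cost(rows, labels, attr, fp_cost, fn_cost):
    """Total misclassification cost after splitting on attr.

    Missing values (None) form their own group, which stands in for
    a null branch in this sketch.
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row.get(attr)].append(label)
    return sum(leaf_cost(g, fp_cost, fn_cost) for g in groups.values())


def best_attribute(rows, labels, test_costs, budget, fp_cost, fn_cost):
    """Pick the affordable attribute that most reduces misclassification cost.

    Test costs are compared only against the budget (resource scale);
    candidates are ranked only by misclassification cost (a separate scale).
    Returns (None, cost) when no affordable split beats predicting at once.
    """
    best, best_cost = None, leaf_cost(labels, fp_cost, fn_cost)
    for attr, cost in test_costs.items():
        if cost > budget:
            continue  # this test cannot be paid for with the remaining resources
        candidate = split_cost(rows, labels, attr, fp_cost, fn_cost)
        if candidate < best_cost:
            best, best_cost = attr, candidate
    return best, best_cost
```

With four hypothetical records in which attribute `a` separates the classes perfectly but costs 30, and attribute `b` costs 5 but separates nothing, a budget of 50 selects `a`, while a budget of 10 leaves no worthwhile split. Comparing the two outcomes is one simple way to judge whether adding resources would pay off.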