Abstract
Uncertainty is widespread in real-world applications. Research on uncertain data has recently attracted increasing attention, yet cost-sensitive algorithms for uncertain data remain understudied. In this paper, we propose a simple but effective method that extends the traditional cost-sensitive decision tree to uncertain data; the resulting algorithm handles both certain and uncertain data. In our experiments, we compare the proposed algorithm with DTU [18] on UCI datasets. The experimental results show that the proposed algorithm performs better than DTU, with lower computational complexity. It keeps cost low even at high levels of uncertainty, which makes it applicable to real-life applications involving uncertain data.
This work is supported by the National Natural Science Foundation of China (60873196) and Chinese Universities Scientific Fund (QN2009092).
References
Blake, C.L., Merz, C.J.: UCI Repository of machine learning databases (website). University of California, Department of Information and Computer Science, Irvine, CA (1998)
Turney, P.D.: Types of Cost in Inductive Concept Learning. Workshop on Cost-Sensitive Learning. In: ICML (2000)
Turney, P.D.: Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm. JAIR 2, 369–409 (1995)
Ling, C.X., Yang, Q., Wang, J., Zhang, S.: Decision Trees with Minimal Costs. In: ICML (2004)
Zhang, S., Qin, Z., Ling, C.X., Sheng, S.: Missing is Useful: Missing Values in Cost-sensitive Decision Trees. IEEE Transactions on Knowledge and Data Engineering (TKDE) (2005)
Ling, C.X., Sheng, S., Yang, Q.: Intelligent Test Strategies for Cost-sensitive Decision Trees. IEEE Transactions on Knowledge and Data Engineering (TKDE) (2005)
Sheng, S., Ling, C.X., Yang, Q.: Simple Test Strategies for Cost-Sensitive Decision Trees. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 365–376. Springer, Heidelberg (2005)
Sheng, S., Ling, C.X.: Hybrid Cost-Sensitive Decision Tree. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 274–284. Springer, Heidelberg (2005)
Ling, C.X., Sheng, S., Yang, Q.: Test Strategies for Cost-Sensitive Decision Trees. IEEE Transactions on Knowledge and Data Engineering (TKDE) 18(8) (2006)
Elkan, C.: The Foundations of Cost-Sensitive Learning. In: Proceedings of the 17th International Joint Conference of Artificial Intelligence, Seattle, pp. 973–978 (2001)
Domingos, P.: MetaCost: A General Method for Making Classifiers Cost-Sensitive. In: Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pp. 155–164 (1999)
Greiner, R., Grove, A., Roth, D.: Learning Cost-sensitive Active Classifiers. Artificial Intelligence 139(2), 137–174 (2002)
Tan, M.: Cost-sensitive Learning of Classification Knowledge and its Applications in Robotics. Machine Learning Journal 13, 7–33 (1993)
Qin, Z., Zhang, S., Zhang, C.: Cost-Sensitive Decision Trees with Multiple Cost Scales. In: Webb, G.I., Yu, X. (eds.) AI 2004. LNCS (LNAI), vol. 3339, pp. 380–390. Springer, Heidelberg (2004)
Chai, X., Deng, L., Yang, Q., et al.: Test-cost Sensitive Naive Bayes Classification. In: IEEE International Conference on Data Mining (ICDM) (2004)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)
Aggarwal, C.C., Yu, P.S.: A Survey of Uncertain Data Algorithms and Applications. IEEE Transactions on Knowledge and Data Engineering (TKDE) 21(5), 609–623 (2009)
Qin, B., Xia, Y., Li, F.: DTU: A Decision Tree for Uncertain Data. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 4–15. Springer, Heidelberg (2009)
Tsang, S., Kao, B., et al.: Decision Trees for Uncertain Data. IEEE Transactions on Knowledge and Data Engineering, August 11 (2009)
Qin, B., Xia, Y., Prabhakar, S., Tu, Y.: A Rule-based Classification Algorithm for Uncertain Data. In: IEEE International Conference on Data Engineering (2009)
Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1022–1027. Morgan Kaufmann (1993)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, M., Zhang, Y., Zhang, X., Wang, Y. (2011). Cost-Sensitive Decision Tree for Uncertain Data. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_19
DOI: https://doi.org/10.1007/978-3-642-25853-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25852-7
Online ISBN: 978-3-642-25853-4
eBook Packages: Computer Science (R0)