Abstract
In supervised machine learning, several attribute reduction methodologies exist for acquiring reducts with low test cost. They can handle symbolic data, or numeric data with error ranges. In many cases, however, they consider only one type of cost, so the problem is single-objective. This paper addresses the attribute reduction problem on data with multiple cost types and error ranges. First, we define the multi-objective attribute reduction problem in which multiple cost types are involved. Second, we propose three metrics to evaluate the quality of a reduct set. Third, we design a backtracking algorithm to compute the Pareto optimal set, and a heuristic algorithm to find a sub-optimal reduct set. Finally, we compare these algorithms on seven UCI (University of California, Irvine) datasets. Experimental results indicate that our heuristic algorithm tackles the proposed problem effectively.
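The core of the multi-objective formulation is that each candidate reduct carries a vector of costs (one entry per cost type), and the goal is the set of candidates not dominated in every objective. A minimal sketch of this Pareto filtering step, with purely illustrative attribute subsets and cost values (not the paper's datasets or algorithm), might look like:

```python
def dominates(a, b):
    """Cost vector a dominates b if it is no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_optimal(candidates):
    """Keep only candidates whose cost vectors are not dominated by any other."""
    return [c for c in candidates
            if not any(dominates(other[1], c[1])
                       for other in candidates if other is not c)]

# Hypothetical reducts: (attribute subset, (test cost, time cost))
reducts = [
    ({"a1", "a2"},       (10, 5)),
    ({"a1", "a3"},       (8, 7)),
    ({"a1", "a2", "a3"}, (12, 8)),  # dominated by (8, 7): worse in both costs
]
front = pareto_optimal(reducts)  # the first two candidates survive
```

A backtracking search over attribute subsets would enumerate candidates and prune dominated branches, while a heuristic would trade exactness of this front for speed on larger attribute sets.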
Acknowledgments
This work is supported in part by the Natural Science Foundation of Sichuan Province Ministry of Education under Grant Nos. 11ZB018 and 12ZA292, the National Science Foundation of China under Grant No. 61379089, and the Scientific Research Starting Project of SWPU under Grant No. 2014QHZ025.
Cite this article
Fang, Y., Liu, ZH. & Min, F. Multi-objective cost-sensitive attribute reduction on data with error ranges. Int. J. Mach. Learn. & Cyber. 7, 783–793 (2016). https://doi.org/10.1007/s13042-014-0296-3