Soft Computing, Volume 21, Issue 20, pp 6159–6173

A set-cover-based approach for the test-cost-sensitive attribute reduction problem

Methodologies and Application

Abstract

In data mining applications, test-cost-sensitive attribute reduction is an important task that aims to decrease the test cost of data. In operations research, the set cover problem is a classical optimization problem that has been studied for much longer than the attribute reduction problem. In this paper, we apply methods for the set cover problem to test-cost-sensitive attribute reduction. First, we use a constructive approach to transform the test-cost-sensitive reduction problem into an equivalent set cover problem. We show that computing a reduct of a decision system with minimal test cost is equivalent to computing an optimal solution of the corresponding set cover problem. Then, we introduce a set-cover-based heuristic algorithm to solve the test-cost-sensitive reduction problem. Finally, we conduct several numerical experiments on data sets from the UCI Machine Learning Repository. The experimental results indicate that the set-cover-based algorithm achieves superior performance in most cases and that it is efficient on data sets with many attributes.
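The transformation described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the standard discernibility-based construction, in which the universe to be covered is the set of object pairs with different decisions, and each attribute corresponds to the subset of pairs it distinguishes, weighted by its test cost. The selection rule is the classical greedy cost-per-newly-covered-element heuristic for weighted set cover. The function names and the toy decision table are hypothetical.

```python
from itertools import combinations


def discernibility_universe(rows, decisions):
    """Map each pair of objects with different decisions to the
    attributes (column indices) on which the two objects differ."""
    pairs = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            attrs = frozenset(
                a for a, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
            )
            if attrs:  # skip inconsistent pairs that no attribute separates
                pairs[(i, j)] = attrs
    return pairs


def greedy_min_cost_cover(rows, decisions, costs):
    """Greedy weighted set cover: repeatedly pick the attribute with the
    lowest test cost per newly covered pair. Returns a covering attribute
    set, which is not guaranteed to be optimal or free of redundancy."""
    pairs = discernibility_universe(rows, decisions)
    uncovered = set(pairs)
    chosen = []
    while uncovered:
        def gain(a):
            # number of still-uncovered pairs that attribute a distinguishes
            return sum(1 for p in uncovered if a in pairs[p])

        candidates = [a for a in range(len(costs)) if gain(a) > 0]
        best = min(candidates, key=lambda a: costs[a] / gain(a))
        chosen.append(best)
        uncovered = {p for p in uncovered if best not in pairs[p]}
    return sorted(chosen)


# Hypothetical toy decision table: three condition attributes, one decision.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]
costs = [5, 3, 2]  # assumed test cost of each attribute
print(greedy_min_cost_cover(rows, decisions, costs))  # → [2]
```

In this toy table, attribute 2 alone distinguishes every pair of objects with different decisions and is also the cheapest per covered pair, so the greedy rule selects it in one step. In general the greedy result can contain redundant attributes, so a pruning pass would be needed before calling the result a reduct.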

Keywords

Attribute reduction, Decision table, Rough set, Set cover problem, Test cost

Notes

Acknowledgments

This work is supported by grants from the National Natural Science Foundation of China (Nos. 61573321, 61272021, 61202206 and 61173181), the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F03002, LY14F030001), the Open Foundation from Marine Sciences in the Most Important Subjects of Zhejiang (No. 20130109), and the Scientific Research Start-up Fund of Zhejiang Ocean University (No. 21065014715).

Compliance with ethical standards

Conflict of interest

Author Anhui Tan declares that he has no conflict of interest. Author Weizhi Wu declares that he has no conflict of interest. Author Yuzhi Tao declares that she has no conflict of interest.

Ethical standard

This article does not contain any studies with human participants or animals performed by any of the authors.


Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. School of Mathematics, Physics and Information Science, Zhejiang Ocean University, Zhoushan, China
  2. Key Laboratory of Oceanographic Big Data Mining and Application of Zhejiang Province, Zhoushan, China
