Multi-level rough set reduction for decision rule mining
Most previous studies on rough sets focused on attribute reduction and decision rule mining at a single concept level. Data with attribute value taxonomies (AVTs), however, are common in real-world applications. In this paper, we extend Pawlak’s rough set model and propose a novel multi-level rough set model (MLRS) based on AVTs and a full-subtree generalization scheme. In parallel with Pawlak’s rough set model, we give several conclusions about the MLRS. We also present a novel concept of cut reduction based on the MLRS. A cut reduction induces the most abstract multi-level decision table that has the same classification ability as the raw decision table; no more abstract multi-level decision table with that ability exists. Furthermore, we discuss the relationships between attribute reduction in Pawlak’s rough set model and cut reduction in the MLRS. We prove that the problem of cut reduction generation is NP-hard, and develop a heuristic algorithm named CRTDR for computing a cut reduction. Finally, we provide an approach named RMTDR for mining multi-level decision rules, which can mine decision rules from different concept levels. Example analysis and comparative experiments show that the proposed methods are efficient and effective in handling problems where data is associated with AVTs.
Keywords: Rough set theory; Multi-level data mining; Attribute value taxonomy; Data generalization; Concept level
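The idea of generalizing a decision table along a cut of an AVT, as described in the abstract, can be illustrated with a small sketch. This is not the paper's CRTDR or RMTDR algorithm. The taxonomy `PARENT`, the table `RAW`, and all function names are hypothetical, the table has a single condition attribute, and "same classification ability" is assumed here to mean that the positive region is unchanged, which may differ from the paper's formal definition.

```python
from collections import defaultdict

# Hypothetical taxonomy for one attribute, stored as child -> parent.
PARENT = {
    "apple": "fruit", "pear": "fruit",
    "carrot": "vegetable", "leek": "vegetable",
    "fruit": "food", "vegetable": "food",
}

# Hypothetical raw decision table: (condition values, decision).
RAW = [
    (("apple",), "sweet"),
    (("pear",), "sweet"),
    (("carrot",), "savory"),
    (("leek",), "savory"),
]


def generalize(value, cut, parent=PARENT):
    """Climb from a leaf value to the node of `cut` that covers it."""
    node = value
    while node not in cut:
        node = parent[node]  # raises KeyError if the cut does not cover value
    return node


def apply_cut(rows, cut):
    """Replace every condition value by its covering node in the cut."""
    return [(tuple(generalize(v, cut) for v in cond), dec) for cond, dec in rows]


def positive_region(rows):
    """Indices of rows whose condition values determine a single decision."""
    decisions = defaultdict(set)
    for cond, dec in rows:
        decisions[cond].add(dec)
    return {i for i, (cond, _) in enumerate(rows) if len(decisions[cond]) == 1}


def preserves_classification(rows, cut):
    """Assumed criterion: the generalized table keeps the same positive region."""
    return positive_region(apply_cut(rows, cut)) == positive_region(rows)
```

In this toy table, the cut `{"fruit", "vegetable"}` keeps every object consistently classified, while the more abstract cut `{"food"}` merges objects with different decisions and so loses classification ability. Under the stated assumption, the first cut is therefore the most abstract one that is still acceptable.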
The authors would like to thank the anonymous referees for their valuable comments. This work is supported in part by the National High Technology Research and Development Program (863 Program) of China under Grant 2012AA011005, the National 973 Program of China under Grant 2013CB329604, the National Natural Science Foundation of China (NSFC) under Grants 60975034, 61272540 and 61229301, and the US National Science Foundation (NSF) under Grant CCF-0905337.