Applied Intelligence, Volume 39, Issue 3, pp 642–658

Multi-level rough set reduction for decision rule mining

  • Mingquan Ye
  • Xindong Wu
  • Xuegang Hu
  • Donghui Hu

Abstract

Most previous studies on rough sets have focused on attribute reduction and decision rule mining at a single concept level. Data with attribute value taxonomies (AVTs) are, however, common in real-world applications. In this paper, we extend Pawlak’s rough set model and propose a novel multi-level rough set model (MLRS) based on AVTs and a full-subtree generalization scheme. Several conclusions about MLRS are given that parallel those of Pawlak’s rough set model. We also present a novel concept of cut reduction based on MLRS. A cut reduction induces the most abstract multi-level decision table that has the same classification ability as the raw decision table; no more abstract multi-level decision table with this property exists. Furthermore, the relationships between attribute reduction in Pawlak’s rough set model and cut reduction in MLRS are discussed. We also prove that the problem of cut reduction generation is NP-hard, and develop a heuristic algorithm named CRTDR for computing the cut reduction. Finally, an approach named RMTDR for mining multi-level decision rules is provided; it can mine decision rules from different concept levels. Example analysis and comparative experiments show that the proposed methods are efficient and effective in handling problems where data are associated with AVTs.
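
To make the idea of generalizing a decision table along a cut of an AVT concrete, the following is a minimal sketch and not the CRTDR or RMTDR algorithm of the paper. It assumes a single condition attribute, a toy taxonomy, and toy data, all of which are hypothetical. A cut is read here as a set of taxonomy nodes such that every leaf value has exactly one ancestor (or itself) in the set, and classification ability is approximated by the size of the positive region, that is, the number of rows whose condition value determines the decision uniquely.

```python
from collections import defaultdict

# Hypothetical AVT for one attribute, stored as child -> parent (assumption).
AVT_PARENT = {
    "apple": "fruit", "pear": "fruit",
    "carrot": "vegetable", "leek": "vegetable",
    "fruit": "food", "vegetable": "food",
}


def ancestors(value):
    """Return the chain value, parent, grandparent, ... up to the root."""
    chain = [value]
    while chain[-1] in AVT_PARENT:
        chain.append(AVT_PARENT[chain[-1]])
    return chain


def generalize(value, cut):
    """Map a leaf value to the node of the cut on its path to the root."""
    for node in ancestors(value):
        if node in cut:
            return node
    raise ValueError(f"cut {cut} does not cover {value!r}")


def generalize_table(rows, cut):
    """Replace every condition value by its generalization under the cut."""
    return [(generalize(value, cut), decision) for value, decision in rows]


def positive_region_size(rows):
    """Count rows whose condition value determines the decision uniquely."""
    decisions = defaultdict(set)
    for condition, decision in rows:
        decisions[condition].add(decision)
    return sum(1 for condition, _ in rows if len(decisions[condition]) == 1)


# Hypothetical raw decision table: (condition value, decision).
RAW = [
    ("apple", "sweet"), ("pear", "sweet"),
    ("carrot", "savory"), ("leek", "savory"),
]

CUT_MID = {"fruit", "vegetable"}   # one level above the leaves
CUT_ROOT = {"food"}                # the most abstract cut

mid = generalize_table(RAW, CUT_MID)
root = generalize_table(RAW, CUT_ROOT)

print(positive_region_size(RAW), positive_region_size(mid), positive_region_size(root))
```

In this toy table the middle cut keeps every row consistent, while the root cut merges rows with different decisions and loses classification ability. Under the reading above, the middle cut is therefore the most abstract cut that preserves the positive region of the raw table.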

Keywords

Rough set theory; Multi-level data mining; Attribute value taxonomy; Data generalization; Concept level

Notes

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments. This work was supported in part by the National High Technology Research and Development Program (863 Program) of China under Grant 2012AA011005, the National 973 Program of China under Grant 2013CB329604, the National Natural Science Foundation of China (NSFC) under Grants 60975034, 61272540 and 61229301, and the US National Science Foundation (NSF) under Grant CCF-0905337.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Mingquan Ye (1, 3)
  • Xindong Wu (1, 2)
  • Xuegang Hu (1)
  • Donghui Hu (1)

  1. Department of Computer Science, Hefei University of Technology, Hefei, P.R. China
  2. Department of Computer Science, University of Vermont, Burlington, USA
  3. Department of Computer Science, Wannan Medical College, Wuhu, P.R. China
