
Complexity of Rule Sets Induced from Data Sets with Many Lost and Attribute-Concept Values

  • Patrick G. Clark
  • Cheng Gao
  • Jerzy W. Grzymala-Busse (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9693)

Abstract

In this paper we present experimental results on rule sets induced from 12 data sets with many missing attribute values. We use two interpretations of missing attribute values: lost values and attribute-concept values. Our main objective is to check which interpretation of missing attribute values is better from the viewpoint of the complexity of rule sets induced from data sets with many missing attribute values. The better interpretation is the attribute-concept value. Our secondary objective is to test which of the three probabilistic approximations used in the experiments (singleton, subset, or concept) provides the simplest rule sets. The subset probabilistic approximation is the best, at the 5% significance level.
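The abstract names the three probabilistic approximations without defining them. The sketch below illustrates one common formulation from the rough-set literature, assumed here rather than taken from this page: each case x has a characteristic set K(x), the conditional probability is Pr(X | K(x)) = |X ∩ K(x)| / |K(x)|, and a threshold alpha selects which cases or characteristic sets contribute. The function names, the toy characteristic sets, and the threshold are hypothetical and are not data from the paper.

```python
from fractions import Fraction


def conditional_probability(concept, k_set):
    """Pr(X | K(x)) = |X ∩ K(x)| / |K(x)|, kept exact with Fraction."""
    return Fraction(len(concept & k_set), len(k_set))


def singleton_approx(universe, concept, K, alpha):
    """Cases x in U whose own characteristic set meets the threshold."""
    return {x for x in universe
            if conditional_probability(concept, K[x]) >= alpha}


def subset_approx(universe, concept, K, alpha):
    """Union of K(x) over all x in U that meet the threshold."""
    result = set()
    for x in universe:
        if conditional_probability(concept, K[x]) >= alpha:
            result |= K[x]
    return result


def concept_approx(universe, concept, K, alpha):
    """Union of K(x) over x in the concept X that meet the threshold."""
    result = set()
    for x in concept:
        if conditional_probability(concept, K[x]) >= alpha:
            result |= K[x]
    return result


# Hypothetical toy data: four cases and their assumed characteristic sets.
universe = {1, 2, 3, 4}
K = {1: {1, 3, 4}, 2: {2}, 3: {1, 2, 3}, 4: {4}}
X = {1, 2}               # the concept being approximated
alpha = Fraction(1, 2)   # probability threshold

print(sorted(singleton_approx(universe, X, K, alpha)))
print(sorted(subset_approx(universe, X, K, alpha)))
print(sorted(concept_approx(universe, X, K, alpha)))
```

In this formulation the three approximations can all differ on the same data, which is why the choice among them can affect the rule sets induced from them. How the characteristic sets themselves are built under the lost-value and attribute-concept interpretations is not covered by this sketch.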

Keywords

Incomplete data; Lost values; Attribute-concept values; Probabilistic approximations; MLEM2 rule induction algorithm

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Patrick G. Clark (1)
  • Cheng Gao (1)
  • Jerzy W. Grzymala-Busse (1, 2), email author
  1. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, USA
  2. Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, Rzeszow, Poland
