Mining Incomplete Data—A Rough Set Approach

  • Jerzy W. Grzymała-Busse
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6954)

Abstract

This paper presents a rough set approach to mining incomplete data. The main tool is the attribute-value pair block. Such blocks may be used to compute a characteristic set, a generalization of the elementary set well known in rough set theory. For incomplete data sets, three types of global approximations are defined: singleton, subset, and concept. A local approximation is also defined for incomplete data sets.
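
The abstract names these constructions without defining them. The following minimal Python sketch shows how attribute-value pair blocks, characteristic sets, and the three global approximations might be computed. It assumes the definitions commonly used in the author's related work: a lost value, written "?", keeps a case out of every block of that attribute, and a "do not care" value, written "*", puts the case in every block of that attribute. The toy table DATA, the attribute names, and all function names are hypothetical and are not taken from the paper. Local approximations are not covered.

```python
from typing import Dict, Iterable, List, Set, Tuple

LOST = "?"       # assumed marker for a lost (erased) value
DONT_CARE = "*"  # assumed marker for a "do not care" condition

Case = Dict[str, str]

# Hypothetical toy table: case id -> attribute values (illustration only).
DATA: Dict[int, Case] = {
    1: {"Temperature": "high", "Headache": LOST},
    2: {"Temperature": DONT_CARE, "Headache": "yes"},
    3: {"Temperature": "normal", "Headache": "no"},
    4: {"Temperature": "high", "Headache": "yes"},
}
ATTRS: List[str] = ["Temperature", "Headache"]


def blocks(data: Dict[int, Case], attrs: List[str]) -> Dict[Tuple[str, str], Set[int]]:
    """Build the block [(a, v)] for every attribute a and every known value v."""
    result: Dict[Tuple[str, str], Set[int]] = {}
    for a in attrs:
        known = {row[a] for row in data.values()} - {LOST, DONT_CARE}
        for v in known:
            # "*" cases join every block of a; "?" cases join none.
            result[(a, v)] = {x for x, row in data.items() if row[a] in (v, DONT_CARE)}
    return result


def characteristic_sets(data: Dict[int, Case], attrs: List[str]) -> Dict[int, Set[int]]:
    """K(x): intersect the blocks [(a, a(x))] over attributes with a specified value."""
    pair_blocks = blocks(data, attrs)
    result: Dict[int, Set[int]] = {}
    for x, row in data.items():
        k = set(data)  # start from the whole universe
        for a in attrs:
            if row[a] not in (LOST, DONT_CARE):
                k &= pair_blocks[(a, row[a])]
        result[x] = k
    return result


def approximations(k: Dict[int, Set[int]], concept: Set[int]) -> Dict[str, Tuple[Set[int], Set[int]]]:
    """Return (lower, upper) pairs for singleton, subset and concept approximations."""
    inside = {x for x, ks in k.items() if ks <= concept}    # K(x) is a subset of X
    touching = {x for x, ks in k.items() if ks & concept}   # K(x) intersects X

    def union(cases: Iterable[int]) -> Set[int]:
        return set().union(*(k[x] for x in cases))

    return {
        "singleton": (inside, touching),
        "subset": (union(inside), union(touching)),
        "concept": (union(inside & concept), union(touching & concept)),
    }


if __name__ == "__main__":
    k_sets = characteristic_sets(DATA, ATTRS)
    for name, (lower, upper) in approximations(k_sets, {3, 4}).items():
        print(name, sorted(lower), sorted(upper))
```

In this sketch the singleton approximations are sets of individual cases, while the subset and concept approximations are unions of characteristic sets. The concept variant restricts the union to cases that belong to the concept, so its upper approximation can be smaller than the subset one.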

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jerzy W. Grzymała-Busse (1, 2)
  1. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, USA
  2. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
