Mining Incomplete Data with Many Lost and Attribute-Concept Values

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9436)

Abstract

This paper presents experimental results on twelve data sets with many missing attribute values, interpreted either as lost values or as attribute-concept values. Data mining was performed using three kinds of probabilistic approximations: singleton, subset and concept. We compared the best results, over all three kinds of probabilistic approximations, for six data sets with lost values and six data sets with attribute-concept values, where the missing attribute values were located in the same places. For five pairs of data sets, the error rate, evaluated by ten-fold cross-validation, was significantly smaller for lost values than for attribute-concept values (5% significance level). For the remaining pair, the two interpretations of missing attribute values did not differ significantly.
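The three kinds of probabilistic approximations named in the abstract can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in the rough-set literature on incomplete data (this page does not state them): each case x has a characteristic set K(x), the conditional probability is Pr(X | K(x)) = |X ∩ K(x)| / |K(x)|, and a threshold alpha selects which characteristic sets contribute. The function names, the toy characteristic sets and the concept below are hypothetical and are not taken from the paper. How K(x) is built depends on whether a missing value is read as a lost value or as an attribute-concept value, and that step is not shown here.

```python
from fractions import Fraction

def conditional_probability(concept, k_set):
    """Pr(X | K) = |X ∩ K| / |K| for a non-empty set K."""
    return Fraction(len(concept & k_set), len(k_set))

def singleton_approx(universe, char_sets, concept, alpha):
    """Cases x whose own characteristic set meets the threshold."""
    return {x for x in universe
            if conditional_probability(concept, char_sets[x]) >= alpha}

def subset_approx(universe, char_sets, concept, alpha):
    """Union of qualifying characteristic sets over all cases in the universe."""
    result = set()
    for x in universe:
        if conditional_probability(concept, char_sets[x]) >= alpha:
            result |= char_sets[x]
    return result

def concept_approx(universe, char_sets, concept, alpha):
    """Union of qualifying characteristic sets over cases in the concept only."""
    result = set()
    for x in concept:
        if conditional_probability(concept, char_sets[x]) >= alpha:
            result |= char_sets[x]
    return result

# Hypothetical toy input: five cases and their characteristic sets.
universe = {1, 2, 3, 4, 5}
char_sets = {
    1: {1, 2},
    2: {2, 3},
    3: {3, 4, 5},
    4: {4},
    5: {1, 5},
}
concept = {1, 2, 3}   # cases sharing one decision value (assumed)
alpha = Fraction(1, 2)

print(singleton_approx(universe, char_sets, concept, alpha))
print(subset_approx(universe, char_sets, concept, alpha))
print(concept_approx(universe, char_sets, concept, alpha))
```

On this toy input the three approximations differ at alpha = 1/2. The singleton approximation is a set of cases, while the subset and concept approximations are unions of characteristic sets. Using `Fraction` keeps the threshold comparison exact.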

Author information

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Clark, P.G., Grzymala-Busse, J.W. (2015). Mining Incomplete Data with Many Lost and Attribute-Concept Values. In: Ciucci, D., Wang, G., Mitra, S., Wu, WZ. (eds) Rough Sets and Knowledge Technology. RSKT 2015. Lecture Notes in Computer Science, vol 9436. Springer, Cham. https://doi.org/10.1007/978-3-319-25754-9_9

  • DOI: https://doi.org/10.1007/978-3-319-25754-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25753-2

  • Online ISBN: 978-3-319-25754-9

  • eBook Packages: Computer Science, Computer Science (R0)
