Mining Incomplete Data with Many Lost and Attribute-Concept Values
This paper presents experimental results on twelve data sets with many missing attribute values, interpreted as lost values and attribute-concept values. Data mining was accomplished using three kinds of probabilistic approximations: singleton, subset and concept. We compared the best results, using all three kinds of probabilistic approximations, for six data sets with lost values and six data sets with attribute-concept values, where missing attribute values were located in the same places. For five pairs of data sets the error rate, evaluated by ten-fold cross validation, was significantly smaller for lost values than for attribute-concept values (5 % significance level). For the remaining pair of data sets both interpretations of missing attribute values do not differ significantly.
KeywordsIncomplete data Lost values Attribute-concept values Probabilistic approximations MLEM2 rule induction algorithm
- 1.Clark, P.G., Grzymala-Busse, J.W.: Experiments on probabilistic approximations. In: Proceedings of the 2011 IEEE International Conference on Granular Computing, pp. 144–149 (2011)Google Scholar
- 2.Clark, P.G., Grzymala-Busse, J.W.: Rule induction using probabilistic approximations and data with missing attribute values. In: Proceedings of the 15-th IASTED International Conference on Artificial Intelligence and Soft Computing ASC 2012, pp. 235–242 (2012)Google Scholar
- 3.Clark, P.G., Grzymała-Busse, J.W.: An experimental comparison of three interpretations of missing attribute values using probabilistic approximations. In: Ciucci, D., Inuiguchi, M., Yao, Y., śȩzak, D. (eds.) RSFDGrC 2013. LNCS, vol. 8170, pp. 77–86. Springer, Heidelberg (2013) CrossRefGoogle Scholar
- 4.Clark, P.G., Grzymala-Busse, J.W.: Mining incomplete data with lost values and attribute-concept values. In: Proceedings of the 2014 IEEE International Conference on Granular Computing, pp. 49–54 (2014)Google Scholar
- 5.Grzymala-Busse, J.W.: LERS–a system for learning from examples based on rough sets. In: Slowinski, R. (ed.) Intelligent Decision Support. Handbook of Applications and Advances of the Rough Set Theory, pp. 3–18. Kluwer Academic Publishers, Dordrecht (1992)Google Scholar
- 6.Grzymala-Busse, J.W.: MLEM2: a new algorithm for rule induction from imperfect data. In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, pp. 243–250 (2002)Google Scholar
- 8.Grzymala-Busse, J.W., Wang, A.Y.: Modified algorithms LEM1 and LEM2 for rule induction from data with missing attribute values. In: Proceedings of the 5-th International Workshop on Rough Sets and Soft Computing in conjunction with the Third Joint Conference on Information Sciences, pp. 69–72 (1997)Google Scholar
- 14.Wang, G.: Extension of rough set under incomplete information systems. In: Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 1098–1103 (2002)Google Scholar
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.