A Study of Incomplete Data – A Review

  • S. S. Gantayat
  • Ashok Misra
  • B. S. Panda
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 247)

Abstract

Incomplete data are questions without answers or variables without observations. Even a small percentage of missing data can cause serious problems in analysis, leading to wrong conclusions and imperfect knowledge. There are many techniques for overcoming imperfect knowledge and managing data with incomplete items, but no single technique is absolutely better than the others.

Researchers have approached these problems from several directions and have proposed ways of handling incomplete information systems. Attribute values are important for information processing. In the field of databases, various efforts have been made to improve and enhance query processing over such data. Researchers have tried, and continue to try, to handle imprecision and uncertainty in databases. The approaches draw on methodologies such as fuzzy sets, rough sets, Boolean logic, possibility theory, and statistical similarity.

Keywords

Data Uncertainty Incomplete Information Missing Data Expert Systems 

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. GMRIT, Rajam, India
  2. CUTM, Parlakhemundi, India
  3. MITS Engineering College, Rayagada, India
