Abstract
Many data mining and data analysis techniques operate on large datasets. Such datasets often contain missing values, which lead to biased estimates, imprecise statistical results, or invalid conclusions, and most data mining and data analysis techniques cannot be applied directly to data with missing values. For this reason, a range of imputation techniques has been proposed for both categorical and continuous variables. Existing imputation techniques have several limitations: (a) methods such as conditional mean imputation result in biased parameter estimates; (b) random draw imputation introduces too much variation into the inference of any single value or of the distance between particular samples; (c) for multiple imputation, it is not easy to determine the posterior distribution to draw samples from. In this paper, we present an unsupervised learning technique based on a Kohonen self-organizing map that handles both categorical and numerical data values. Our aim is to achieve the highest accuracy. To this end, we split the data to train the learning model and then use the trained model to estimate accuracy. By adjusting its weights, the proposed algorithm maps missing values close to their original values and improves accuracy in a comparison against classification on data without and with missing values.
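The abstract does not give the algorithm's details, so the following is only a minimal sketch of one common way to impute with a self-organizing map, not the authors' implementation. It assumes numeric features (categorical variables would need an encoding such as one-hot, which is an assumption here), trains the map on complete rows, and fills each missing entry with the corresponding weight of the best-matching unit found from the observed features. The function names `train_som` and `impute_with_som`, the grid size, and the decay schedules are all hypothetical choices.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM on complete rows; returns weights (n_units, n_features)."""
    rng = np.random.default_rng(seed)
    n_units = grid[0] * grid[1]
    # Grid coordinates of each unit, used by the neighbourhood function.
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])],
                      dtype=float)
    # Initialise weights from randomly chosen training rows.
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood radius
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the BMU on the grid.
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull the BMU and its neighbours towards x.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def impute_with_som(rows, weights):
    """Fill NaNs in each row with the matching weights of its best-matching unit."""
    filled = rows.astype(float).copy()
    for r in filled:  # each r is a view into `filled`
        missing = np.isnan(r)
        if not missing.any():
            continue
        obs = ~missing
        # Distance is computed over the observed features only.
        bmu = np.argmin(((weights[:, obs] - r[obs]) ** 2).sum(axis=1))
        r[missing] = weights[bmu, missing]
    return filled
```

Calling `impute_with_som(rows_with_nans, train_som(complete_rows))` returns a copy of the rows with every `NaN` replaced and the observed entries left unchanged. Training only on complete rows is the simplest design; a variant that also trains on incomplete rows by masking the missing features is possible but is not shown here.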
© 2015 Springer India
About this paper
Cite this paper
Singh, N., Javeed, A., Chhabra, S., Kumar, P. (2015). Missing Value Imputation with Unsupervised Kohonen Self Organizing Map. In: Shetty, N., Prasad, N., Nalini, N. (eds) Emerging Research in Computing, Information, Communication and Applications. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2550-8_7
DOI: https://doi.org/10.1007/978-81-322-2550-8_7
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2549-2
Online ISBN: 978-81-322-2550-8
eBook Packages: Engineering, Engineering (R0)