Abstract
Clustering algorithms are used to identify groups of similar data objects within large data sets. Because traditional clustering methods were developed to analyse complete data sets, they cannot be applied to many practical problems, e.g. those involving incomplete data. Approaches that adapt clustering algorithms to handle missing values work well on uniformly distributed data sets. In real-world applications, however, clusters generally differ in size. In this paper we present an extension of existing fuzzy c-means clustering algorithms for incomplete data that uses information about the dispersion of clusters. In experiments on artificial and real data sets, we show that our approach outperforms other clustering methods for incomplete data.
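To make concrete what adapting fuzzy c-means to missing values can look like, the following is a minimal sketch of one widely used baseline, the partial distance strategy: distances are computed over the observed features only and rescaled by the fraction of features present. It is not the dispersion-based extension summarised above, whose details the abstract does not give. The function names `partial_distances` and `fcm_pds`, the NaN encoding of missing values, and the default parameters are assumptions made for illustration.

```python
import numpy as np


def partial_distances(X, centers):
    """Squared partial distances between rows of X and cluster centers.

    Assumption: missing values are encoded as NaN and every row has at
    least one observed feature. Each distance sums over observed features
    only and is rescaled by d / (number of observed features).
    """
    observed = ~np.isnan(X)                              # (n, d) mask
    X_filled = np.where(observed, X, 0.0)                # NaN -> 0 placeholder
    diff = X_filled[:, None, :] - centers[None, :, :]    # (n, c, d)
    diff = diff * observed[:, None, :]                   # ignore missing features
    n_obs = observed.sum(axis=1)                         # observed count per row
    d = X.shape[1]
    return (diff ** 2).sum(axis=2) * (d / n_obs)[:, None]


def fcm_pds(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means on incomplete data using partial distances (sketch).

    Assumption: every feature is observed in at least one row, so the
    center update never divides by zero.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    observed = ~np.isnan(X)
    X_filled = np.where(observed, X, 0.0)
    U = rng.dirichlet(np.ones(c), size=n)                # rows sum to 1

    for _ in range(n_iter):
        W = U ** m                                       # fuzzified memberships
        # Each center coordinate averages only the rows where it is observed.
        centers = (W.T @ X_filled) / (W.T @ observed.astype(float))
        D = np.maximum(partial_distances(X, centers), 1e-12)
        inv = D ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)     # standard FCM update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

Calling `fcm_pds(X, c=2)` on an array with NaN entries returns the cluster centers and the membership matrix. Because every cluster is treated identically in the distance, a baseline like this has no notion of cluster size or spread, which is the gap the dispersion-based extension is meant to address.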
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Himmelspach, L., Conrad, S. (2010). Fuzzy Clustering of Incomplete Data Based on Cluster Dispersion. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Computational Intelligence for Knowledge-Based Systems Design. IPMU 2010. Lecture Notes in Computer Science, vol 6178. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14049-5_7
DOI: https://doi.org/10.1007/978-3-642-14049-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14048-8
Online ISBN: 978-3-642-14049-5
eBook Packages: Computer Science, Computer Science (R0)