Fuzzy Clustering of Incomplete Data Based on Cluster Dispersion

Himmelspach, Ludmila; Conrad, Stefan

doi:10.1007/978-3-642-14049-5_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6178)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems

Abstract

Clustering algorithms identify groups of similar data objects within large data sets. Because traditional clustering methods were developed to analyse complete data sets, they cannot be applied directly to many practical problems, for example those involving incomplete data. Approaches proposed for adapting clustering algorithms to missing values work well on uniformly distributed data sets, but in real-world applications clusters are generally of different sizes. In this paper we present an extension of existing fuzzy c-means clustering algorithms for incomplete data that uses information about the dispersion of clusters. In experiments on artificial and real data sets we show that our approach outperforms other clustering methods for incomplete data.
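
The abstract does not spell out the dispersion-based extension, so the sketch below is not the authors' method. It only illustrates the kind of existing algorithm such an extension would build on: fuzzy c-means in which distances are computed over the observed features only and rescaled (a partial distance strategy). The function name `fcm_partial_distance`, its parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np

def _memberships(X0, observed, V, m):
    """Fuzzy memberships from partial squared distances (observed features only)."""
    d = X0.shape[1]
    # Differences on missing features are masked out, so they contribute zero.
    diff = (X0[:, None, :] - V[None, :, :]) * observed[:, None, :]
    # Rescale by d / (number of observed features) so objects stay comparable.
    D = (diff ** 2).sum(axis=2) * (d / observed.sum(axis=1))[:, None]
    D = np.maximum(D, 1e-12)  # guard against division by zero
    inv = D ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_partial_distance(X, c, m=2.0, n_iter=100, tol=1e-6, init=None, seed=0):
    """Illustrative fuzzy c-means for data with NaNs (hypothetical helper).

    Returns (U, V): memberships of shape (n, c) and prototypes of shape (c, d).
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    if not observed.any(axis=1).all() or not observed.any(axis=0).all():
        raise ValueError("each object and each feature needs an observed value")
    X0 = np.where(observed, X, 0.0)  # zero-fill so NaNs never enter the sums
    obs = observed.astype(float)

    if init is not None:
        V = np.array(init, dtype=float)
    else:
        # Assumed initialisation: random objects, gaps filled with feature means.
        rng = np.random.default_rng(seed)
        col_mean = X0.sum(axis=0) / obs.sum(axis=0)
        idx = rng.choice(X.shape[0], size=c, replace=False)
        V = np.where(observed[idx], X0[idx], col_mean)

    for _ in range(n_iter):
        W = _memberships(X0, observed, V, m) ** m
        # Each prototype coordinate is a weighted mean over objects that
        # actually have that feature.
        V_new = (W.T @ X0) / np.maximum(W.T @ obs, 1e-12)
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return _memberships(X0, observed, V, m), V

# Toy data: two tight groups, some values missing (NaN).
X = np.array([
    [0.0, 0.1], [0.2, np.nan], [np.nan, -0.1], [0.1, 0.0],
    [10.0, 9.9], [np.nan, 10.1], [9.8, np.nan], [10.1, 10.0],
])
U, V = fcm_partial_distance(X, c=2, init=[[1.0, 1.0], [9.0, 9.0]])
print(U.argmax(axis=1))
print(V.round(2))
```

Note that this baseline treats all clusters alike when comparing distances, which is the situation the abstract describes as problematic when clusters have different sizes; how the paper incorporates cluster dispersion is not recoverable from the abstract and is deliberately left out here.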

Author information

Authors and Affiliations

Institute of Computer Science, Heinrich-Heine-Universität Düsseldorf, D – 40225, Düsseldorf, Germany
Ludmila Himmelspach & Stefan Conrad

Editor information

Editors and Affiliations

Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Hans-Meerwein-Str., 35032, Marburg, Germany
Eyke Hüllermeier
Fakultät Informatik, Otto-von-Guericke-Universität Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Rudolf Kruse
Fakultät für Elektrotechnik und Informationstechnik, Technische Universität Dortmund, Otto-Hahn-Str. 4, 44227, Dortmund, Germany
Frank Hoffmann

About this paper

Cite this paper

Himmelspach, L., Conrad, S. (2010). Fuzzy Clustering of Incomplete Data Based on Cluster Dispersion. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Computational Intelligence for Knowledge-Based Systems Design. IPMU 2010. Lecture Notes in Computer Science, vol 6178. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14049-5_7

DOI: https://doi.org/10.1007/978-3-642-14049-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14048-8
Online ISBN: 978-3-642-14049-5
eBook Packages: Computer Science, Computer Science (R0)
