Summary
Ensemble clustering is a recent research field that extends to unsupervised learning the ensemble approach originally developed for classification and other supervised learning problems. In particular, ensemble clustering methods have been developed to improve the robustness and accuracy of clustering algorithms, as well as their ability to capture the structure of complex data. In many clustering applications an example may belong to multiple clusters, and concepts from fuzzy set theory can provide the flexibility needed to model the uncertainty underlying real data in several application domains. In this paper, we propose an unsupervised fuzzy ensemble clustering approach that combines the flexibility of fuzzy sets with the robustness of ensemble methods. Our algorithmic scheme can generate different ensemble clustering algorithms that produce the final consensus clustering in either crisp or fuzzy format.
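To make the idea of a fuzzy consensus more concrete, the following is a minimal sketch of one generic way to combine several fuzzy base clusterings. It is not the algorithm proposed in this chapter, whose details the summary does not give. The function names, the choice of the algebraic product as t-norm, the 0.5 threshold, and the toy membership matrices are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_coassociation(memberships):
    """Average pairwise fuzzy similarity over several base clusterings.

    Each item in `memberships` is an (n_examples, n_clusters) array whose
    rows sum to 1. Within one clustering, the similarity of two examples is
    the algebraic product of their memberships, summed over clusters
    (assumed choice of t-norm), which is simply U @ U.T.
    """
    sims = [U @ U.T for U in memberships]
    return np.mean(sims, axis=0)


def crisp_consensus(similarity, threshold=0.5):
    """Crisp labels: connected components of the graph similarity > threshold."""
    n = similarity.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(similarity[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Two hypothetical fuzzy base clusterings of four examples.
# The cluster labels are swapped in U2; co-association does not need alignment.
U1 = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
U2 = np.array([[0.1, 0.9], [0.2, 0.8], [0.7, 0.3], [0.9, 0.1]])

M = fuzzy_coassociation([U1, U2])  # fuzzy consensus as a similarity relation
labels = crisp_consensus(M)        # crisp consensus
print(labels)  # [0 0 1 1]
```

In this sketch the averaged matrix `M` plays the role of a fuzzy consensus, since it keeps graded pairwise similarities, while `labels` is a crisp consensus derived from it. Working with pairwise similarities avoids having to match cluster labels across base clusterings.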
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Avogadri, R., Valentini, G. (2008). Ensemble Clustering with a Fuzzy Approach. In: Okun, O., Valentini, G. (eds) Supervised and Unsupervised Ensemble Methods and their Applications. Studies in Computational Intelligence, vol 126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78981-9_3
DOI: https://doi.org/10.1007/978-3-540-78981-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78980-2
Online ISBN: 978-3-540-78981-9
eBook Packages: Engineering, Engineering (R0)