A Quality-Driven Ensemble Approach to Automatic Model Selection in Clustering

  • Raffaella Rosasco
  • Hassan Mahmoud
  • Stefano Rovetta
  • Francesco Masulli
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 26)

Abstract

A fundamental limitation of data clustering is its inherent, ill-defined model selection problem: choosing a clustering technique also implies an a priori decision on cluster geometry. In this work we explore two different clustering paradigms and their combination by means of an ensemble technique. Mixing coefficients are computed from partition quality, so the ensemble automatically gives more weight to the clustering method that performs best according to the selected quality indices.
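
To make the idea of quality-driven mixing coefficients concrete, the following is a minimal sketch and not the authors' method, since the abstract does not specify the combination rule. It assumes that each clustering method yields a hard partition, that partitions are merged through weighted co-association matrices, and that each quality score is non-negative with higher meaning better. The function names and the example labels and scores are hypothetical.

```python
from typing import List, Sequence


def co_association(labels: Sequence[int]) -> List[List[float]]:
    """Return an n x n matrix with 1.0 where two points share a cluster."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def mixing_weights(qualities: Sequence[float]) -> List[float]:
    """Normalize quality scores to sum to 1 (assumption: higher is better)."""
    if any(q < 0 for q in qualities):
        raise ValueError("quality scores are assumed to be non-negative")
    total = sum(qualities)
    if total == 0:
        # No method stands out: fall back to uniform weights.
        return [1.0 / len(qualities)] * len(qualities)
    return [q / total for q in qualities]


def weighted_consensus(
    partitions: Sequence[Sequence[int]], qualities: Sequence[float]
) -> List[List[float]]:
    """Combine partitions into one co-association matrix, weighted by quality."""
    weights = mixing_weights(qualities)
    n = len(partitions[0])
    consensus = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        matrix = co_association(labels)
        for i in range(n):
            for j in range(n):
                consensus[i][j] += w * matrix[i][j]
    return consensus


# Hypothetical labels from a central and a spectral clustering of 4 points,
# with made-up quality scores of 3.0 and 1.0 respectively.
central_labels = [0, 0, 1, 1]
spectral_labels = [0, 0, 0, 1]
weights = mixing_weights([3.0, 1.0])
consensus = weighted_consensus([central_labels, spectral_labels], [3.0, 1.0])
```

With these made-up scores the central partition receives three times the weight of the spectral one, so pairs grouped together only by the central method end up with a higher consensus value than pairs grouped together only by the spectral method. A final partition could then be obtained by clustering the consensus matrix, which this sketch leaves out.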

Keywords

Central clustering; Fuzzy clustering; Possibilistic c-Means; Spectral clustering; Clustering quality; Ensemble clustering

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Raffaella Rosasco (1)
  • Hassan Mahmoud (1)
  • Stefano Rovetta (1)
  • Francesco Masulli (1, 2)
  1. DIBRIS, University of Genoa, Genoa, Italy
  2. Center for Biotechnology, Temple University, Philadelphia, USA
