Abstract

Interest in the analysis and study of clustering techniques has grown since the introduction of new algorithms based on the continuity of the data, where problems such as image segmentation and tracking, amongst others, make it difficult to classify data correctly into their appropriate groups, or clusters. Some of these techniques, such as Spectral Clustering (SC), use graph theory to generate the clusters from the spectrum of the graph created by applying a similarity function to the elements of the database. The approach taken by SC makes it possible to handle the problem of data continuity through the graph representation. Based on this idea, this study uses genetic algorithms to select the groups using the same similarity graph built by the Spectral Clustering method. The main contribution is a new algorithm that improves the robustness of Spectral Clustering by reducing its dependency on the similarity metric parameters, which currently affects the performance of SC approaches. This algorithm, named Genetic Graph-based Clustering (GGC), has been tested on different synthetic and real-world datasets, and the experimental results have been compared against classical clustering algorithms such as K-Means, EM and SC.
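
The spectral step described above can be illustrated with a minimal sketch. It is not the GGC algorithm, and not necessarily the SC variant used in the study: it assumes a Gaussian (RBF) similarity function with a hypothetical bandwidth parameter `sigma`, the symmetric normalized Laplacian, and a simple two-way split on the sign of the second eigenvector. The function names are illustrative only.

```python
import numpy as np


def rbf_similarity(points, sigma):
    """Dense similarity graph (assumed form): w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights


def spectral_bisection(points, sigma=1.0):
    """Split the points into two groups using the spectrum of the similarity graph."""
    weights = rbf_similarity(points, sigma)
    degrees = weights.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(points)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(laplacian)
    second = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (second > 0).astype(int)


# Two small, well-separated groups of 2-D points (illustrative data only)
points = np.array([
    [0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],
    [3.0, 0.0], [3.2, 0.0], [3.0, 0.2], [3.2, 0.2],
])
labels = spectral_bisection(points, sigma=1.0)
```

In this sketch the result depends on `sigma`: a very small value lets the edge weights between distant points underflow towards zero, while a very large value makes all weights nearly equal. This is the kind of parameter sensitivity the abstract refers to.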

Keywords

Machine Learning, Clustering, Spectral Clustering, Genetic Algorithms

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Héctor Menéndez (1)
  • David Camacho (1)
  1. Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain
