Cluster Ensembles Based on Vector Space Embeddings of Graphs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5519)

Abstract

Cluster ensembles provide a versatile alternative to individual clustering algorithms. In structural pattern recognition, however, cluster ensembles have rarely been studied. In the present paper, a general methodology for creating structural cluster ensembles is proposed. Our representation formalism is based on graphs and includes strings and trees as special cases. The basic idea of our approach is to view the dissimilarities of an input graph g to a number of prototype graphs as a vectorial description of g. Randomized prototype selection offers a convenient way to generate m different vector sets from the same graph set. Applying any available clustering algorithm to these vector sets yields a cluster ensemble of m clusterings, which can then be combined with an appropriate consensus function. In several experiments conducted on different graph sets, the cluster ensemble outperforms two single clustering procedures.
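
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: strings (a special case of graphs, per the abstract) stand in for graphs, Levenshtein distance stands in for the graph dissimilarity, a plain k-means with farthest-first initialisation plays the role of "any available clustering algorithm", and a thresholded co-association matrix is used as one possible consensus function. All function names (`edit_distance`, `embed`, `kmeans`, `build_ensemble`, `consensus`) and parameter values are hypothetical.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance; an assumed stand-in for a graph dissimilarity."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def embed(items, prototypes, dist):
    """Describe each item by the vector of its dissimilarities to the prototypes."""
    return [[float(dist(x, p)) for p in prototypes] for x in items]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with farthest-first initialisation (an assumed base clusterer)."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the previous centre if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_ensemble(items, m, n_prototypes, k, dist, seed=0):
    """Draw m random prototype sets, embed the items each time, and cluster."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(m):
        prototypes = rng.sample(items, n_prototypes)
        ensemble.append(kmeans(embed(items, prototypes, dist), k))
    return ensemble


def consensus(ensemble, threshold=0.5):
    """Link items that share a cluster in more than `threshold` of the
    clusterings, then return the connected components as final labels."""
    n = len(ensemble[0])
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1:
                    agree = sum(c[i] == c[j] for c in ensemble) / len(ensemble)
                    if agree > threshold:
                        labels[j] = next_label
                        stack.append(j)
        next_label += 1
    return labels


# Toy data: two well-separated groups of strings (illustrative only).
items = ["aaaa", "aaab", "aaba", "abaa",
         "zzzzzzzz", "zzzzzzzy", "zzzzzyzz", "zzyzzzzz"]
ensemble = build_ensemble(items, m=5, n_prototypes=3, k=2, dist=edit_distance)
final_labels = consensus(ensemble)
print(final_labels)
```

On this toy data the two string groups are far apart under the chosen distance, so every member clustering agrees and the consensus step simply confirms it. The sketch is only meant to show how the embedding, the randomized prototype selection, and the consensus function fit together; real graph data would need a graph dissimilarity in place of `edit_distance`.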


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Riesen, K., Bunke, H. (2009). Cluster Ensembles Based on Vector Space Embeddings of Graphs. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2009. Lecture Notes in Computer Science, vol 5519. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02326-2_22

  • DOI: https://doi.org/10.1007/978-3-642-02326-2_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02325-5

  • Online ISBN: 978-3-642-02326-2

  • eBook Packages: Computer Science, Computer Science (R0)
