Recovery Rate of Clustering Algorithms

  • Fajie Li
  • Reinhard Klette
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5414)


This article provides a simple and general way of defining the recovery rate of a clustering algorithm: a given family of old clusters serves as the reference for evaluating how well the algorithm performs when it calculates a family of new clusters.

Assuming simulated data (i.e., the old clusters are known), the recovery rate is calculated with one of two proposed algorithms: an exact but slow algorithm, or an approximate algorithm with feasible run time.
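The abstract does not state the definition itself, so the following is only a minimal sketch under an assumed definition, not the authors' method. It assumes that the recovery rate is the largest fraction of data points that can be recovered by a one-to-one matching between old and new clusters. The function names, the brute-force search, and the greedy heuristic are all hypothetical stand-ins chosen to illustrate the trade-off between an exact (but slow) and an approximate (but fast) calculation.

```python
from itertools import permutations


def overlap_matrix(old, new):
    """m[i][j] = number of points shared by old cluster i and new cluster j."""
    return [[len(o & n) for n in new] for o in old]


def _orient(m):
    """Transpose if needed so that rows <= columns (overlap is symmetric)."""
    if len(m) > len(m[0]):
        return [list(col) for col in zip(*m)]
    return m


def recovery_rate_exact(old, new):
    """Assumed definition: best one-to-one matching, found by brute force.

    Tries every injective assignment of the smaller family into the larger
    one, so the run time grows factorially with the number of clusters.
    """
    total = sum(len(o) for o in old)
    m = _orient(overlap_matrix(old, new))
    rows, cols = len(m), len(m[0])
    best = max(
        sum(m[i][p[i]] for i in range(rows))
        for p in permutations(range(cols), rows)
    )
    return best / total


def recovery_rate_greedy(old, new):
    """Approximation: repeatedly match the pair with the largest overlap.

    Runs in time dominated by sorting the overlap matrix, but may return a
    value below the exact rate.
    """
    total = sum(len(o) for o in old)
    m = overlap_matrix(old, new)
    cells = sorted(
        ((m[i][j], i, j) for i in range(len(m)) for j in range(len(m[0]))),
        reverse=True,
    )
    used_old, used_new, matched = set(), set(), 0
    for value, i, j in cells:
        if i not in used_old and j not in used_new:
            used_old.add(i)
            used_new.add(j)
            matched += value
    return matched / total


# Simulated data: the old clusters are known, the new ones come from
# some clustering algorithm under evaluation.
old_clusters = [{0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
new_clusters = [{0, 1, 2, 4}, {3, 5, 6}, {7, 8, 9}]
exact = recovery_rate_exact(old_clusters, new_clusters)
approx = recovery_rate_greedy(old_clusters, new_clusters)
```

Under this assumed definition the greedy value never exceeds the exact one, because any matching it produces is also a candidate in the brute-force search. A polynomial-time exact solution of the same matching problem exists (the Hungarian algorithm), so the factorial search here is purely illustrative.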


Keywords (machine-generated, not supplied by the authors): Cluster Algorithm; Recovery Rate; Approximate Algorithm; Video Retrieval; Multiple Kernel Learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Fajie Li (1)
  • Reinhard Klette (2)
  1. Institute for Mathematics and Computing Science, University of Groningen, Groningen, The Netherlands
  2. Computer Science Department, The University of Auckland, Auckland, New Zealand
