Knowledge and Information Systems, Volume 52, Issue 2, pp 341–378

The (black) art of runtime evaluation: Are we comparing algorithms or implementations?

  • Hans-Peter Kriegel
  • Erich Schubert
  • Arthur Zimek
Survey Paper


Any paper proposing a new algorithm should come with an evaluation of efficiency and scalability (particularly when we are designing methods for “big data”). However, there are several (more or less serious) pitfalls in such evaluations. We would like to draw the community's attention to these pitfalls. We substantiate our points with extensive experiments, using clustering and outlier detection methods with and without index acceleration. We discuss what we can learn from evaluations, whether experiments are properly designed, and what kinds of conclusions we should avoid. We close with some general recommendations but maintain that the design of fair and conclusive experiments will always remain a challenge for researchers and an integral part of the scientific endeavor.


Keywords: Methodology, Efficiency evaluation, Runtime experiments, Implementation matters



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
  2. Department of Mathematics and Computer Science, University of Southern Denmark, Odense M, Denmark
