Knowledge and Information Systems, Volume 14, Issue 1, pp 1–37

Top 10 algorithms in data mining

  • Xindong Wu
  • Vipin Kumar
  • J. Ross Quinlan
  • Joydeep Ghosh
  • Qiang Yang
  • Hiroshi Motoda
  • Geoffrey J. McLachlan
  • Angus Ng
  • Bing Liu
  • Philip S. Yu
  • Zhi-Hua Zhou
  • Michael Steinbach
  • David J. Hand
  • Dan Steinberg
Survey Paper

DOI: 10.1007/s10115-007-0114-2

Cite this article as:
Wu, X., Kumar, V., Quinlan, J.R. et al. Knowl Inf Syst (2008) 14: 1. doi:10.1007/s10115-007-0114-2

Abstract

This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. For each algorithm, we provide a description, discuss its impact, and review current and further research on it. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.

Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  • Xindong Wu (1)
  • Vipin Kumar (2)
  • J. Ross Quinlan (3)
  • Joydeep Ghosh (4)
  • Qiang Yang (5)
  • Hiroshi Motoda (6)
  • Geoffrey J. McLachlan (7)
  • Angus Ng (8)
  • Bing Liu (9)
  • Philip S. Yu (10)
  • Zhi-Hua Zhou (11)
  • Michael Steinbach (12)
  • David J. Hand (13)
  • Dan Steinberg (14)
  1. Department of Computer Science, University of Vermont, Burlington, USA
  2. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, USA
  3. Rulequest Research Pty Ltd, St Ives, Australia
  4. Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, USA
  5. Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, China
  6. AFOSR/AOARD and Osaka University, Minato-ku, Tokyo, Japan
  7. Department of Mathematics, The University of Queensland, Brisbane, Australia
  8. School of Medicine, Griffith University, Brisbane, Australia
  9. Department of Computer Science, University of Illinois at Chicago, Chicago, USA
  10. IBM T. J. Watson Research Center, Hawthorne, USA
  11. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  12. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, USA
  13. Department of Mathematics, Imperial College, London, UK
  14. Salford Systems, San Diego, USA