
Pattern Analysis and Applications, Volume 17, Issue 1, pp 83–96

Automatic classifier selection for non-experts

  • Matthias Reif
  • Faisal Shafait
  • Markus Goldstein
  • Thomas Breuel
  • Andreas Dengel
Theoretical Advances

Abstract

Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Since a large variety of classification algorithms has been proposed in the literature, non-experts often do not know which method to use to obtain good classification results on their data. Meta-learning addresses this problem by recommending promising classifiers based on meta-features computed from a given dataset. In this paper, we empirically evaluate five categories of state-of-the-art meta-features for their suitability in predicting the classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open-source meta-learning system capable of accurately predicting the accuracies of target classifiers. The user provides a dataset as input and obtains an automatically created, high-performance, ready-to-use pattern recognition system in a few simple steps. A user study with non-experts showed that, using our system, participants developed more accurate pattern recognition systems in significantly less development time than with a state-of-the-art data mining tool.
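The abstract describes a two-stage idea: characterize a dataset by meta-features, then use those meta-features to predict how accurate each candidate classifier would be. The sketch below illustrates that idea only; the meta-features shown are a small hypothetical subset (the paper evaluates five full categories, including landmarking), the historical data is synthetic, and k-NN regression stands in for whichever regressor the system actually uses.

```python
import numpy as np

def meta_features(X, y):
    """Compute a few simple statistical meta-features of a labeled dataset.
    This is an illustrative subset, not the paper's full feature set."""
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    p = counts / n
    class_entropy = -np.sum(p * np.log2(p))            # class-balance measure
    corr = np.abs(np.corrcoef(X, rowvar=False))        # feature redundancy
    mean_abs_corr = corr[np.triu_indices(d, 1)].mean()
    return np.array([
        np.log(n),        # dataset size
        np.log(d),        # dimensionality
        len(classes),     # number of classes
        class_entropy,
        mean_abs_corr,
    ])

def predict_accuracy(mf, train_mf, train_acc, k=3):
    """k-NN regression in meta-feature space: average the accuracies a
    classifier achieved on the k most similar historical datasets."""
    dist = np.linalg.norm(train_mf - mf, axis=1)
    nearest = np.argsort(dist)[:k]
    return train_acc[nearest].mean()

# Hypothetical history: meta-feature vectors of 20 previously seen datasets
# and the accuracy one target classifier achieved on each (synthetic here).
rng = np.random.default_rng(0)
train_mf = rng.random((20, 5))
train_acc = rng.random(20)

# New dataset for which we want a recommendation (synthetic as well).
X = rng.standard_normal((100, 4))
y = rng.integers(0, 3, 100)
est = predict_accuracy(meta_features(X, y), train_mf, train_acc)
```

In a full system this prediction would be made once per candidate classifier, and the classifier with the highest predicted accuracy would be recommended to the user.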

Keywords

Meta-learning · Meta-features · Landmarking · Regression · Classifier selection · Classifier recommendation


Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Matthias Reif (1)
  • Faisal Shafait (1)
  • Markus Goldstein (1)
  • Thomas Breuel (2)
  • Andreas Dengel (1)
  1. German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany
  2. Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany
