## Abstract

An evaluation and choice of learning algorithms is a current research area in data mining, artificial intelligence and pattern recognition, etc. Supervised learning is one of the tasks most frequently used in data mining. There are several learning algorithms available in machine learning field and new algorithms are being added in machine learning literature. There is a need for selecting the best suitable learning algorithm for a given data. With the information explosion of different learning algorithms and the changing data scenarios, there is a need of smart learning system. The paper shows one approach where past experiences learned are used to suggest the best suitable learner using 3 meta-features namely simple, statistical and information theoretic features. The system tests 38 UCI benchmark datasets from various domains using nine classifiers from various categories. It is observed that for 29 datasets, i.e., 76 % of datasets, both the predicted and actual accuracies directly match. The proposed approach is found to be correct for algorithm selection of these datasets. New proposed equation of finding classifier accuracy based on meta-features is determined and validated. The study compares various supervised learning algorithms by performing tenfold cross-validation paired *t* test. The work helps in a critical step in data mining for selecting the suitable data mining algorithm.

## Keyword

Machine learning Data mining techniques Classification Data characteristics Learning algorithms Intelligent data analysis## Notes

### Acknowledgments

The Authors wish to thank the Editors and the anonymous Reviewers for their detailed comments and suggestions which significantly contributed to the improvement of the manuscript. The authors acknowledge support and help by Suhas Gore, P.G. student during the work.

## References

- Alexandros K, Melanie H (2001) Model selection via meta learning: a comparitive study. Int J Artif Intell Tools 10(4):525–554CrossRefGoogle Scholar
- Alpaydin E (2010) Introduction to machine learning. PHI learning, New DelhizbMATHGoogle Scholar
- Bouckaert R (2003) Choosing between two learning algorithms on calibrated tests. In: Proceedings of 20th international conference on machine learning. Morgan Kaufmann, pp 51–58Google Scholar
- Brazdil P, Soares C (2000) A comparison of ranking methods for classification algorithm selection. In: de Mantaras R, Plaza E (eds) Machine learning: proceedings of the 11th European conference on machine learning ECML2000. Springer, Berlin, pp 63–74Google Scholar
- Brazdil P, Soares C, Da Costa J (2003) Ranking learning algorithms: using ibl and meta-learning on accuracy and time results. Mach Learn 50(3):251–277CrossRefzbMATHGoogle Scholar
- Brazdil P, Giraud Carrier C, Soares C, Vilalta R (2008) Metalearning: applications to data mining. Springer, BerlinzbMATHGoogle Scholar
- Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140zbMATHGoogle Scholar
- Cai Q, He H, Man H (2014) Imbalanced evolving self-organizing learning. Neurocomputing 133:258–270CrossRefGoogle Scholar
- Caruana R, Niculescu-Mizil A (2006) An Empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd International conference on machine learning (ICML2006), pp 161–168Google Scholar
- Chapelle O, Scholkopf B, Zien A (2006) Semi-Supervised Learning. MIT Press, CambridgeCrossRefGoogle Scholar
- Chawla N, Bowyer K, Hall L, Kegelmeyer W (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357zbMATHGoogle Scholar
- Cleveland W, Devlin S (1988) Locally weighted regression: an approach to regression analysis by local fitting. J Am Stat Assoc 403:596–610CrossRefzbMATHGoogle Scholar
- Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27CrossRefzbMATHGoogle Scholar
- Curran K, Yuan P, Coyle D (2011) Using acoustic sensors to discriminate between nasal and mouth breathing. Int J Bioinform Res Appl 7(4):382–396Google Scholar
- de Tiago PF, da Silva AJ, Ludermir TB, de Oliveira WR (2014) An automatic methodology for construction of multi-classifier systems based on the combination of selection and fusion. Prog Artif Intell 2:205–215CrossRefGoogle Scholar
- Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1924CrossRefGoogle Scholar
- Dzeroski S, Zenko B (2004) Is combining classifiers with stacking better than selecting the best one? Mach Learn 54:255–273CrossRefzbMATHGoogle Scholar
- EI-Hefnawy N (2014) Solving bi-level problems using modified particle swarm optimization algorithm. Int J Artif Intell 12(2):88–101Google Scholar
- Fan L, Lei M (2006) Reducing cognitive overload by meta-learning assisted algorithm selection. In: Proceedings of 5th IEEE international conference on cognitive informatics, pp 120–125Google Scholar
- Frank A, Asuncion A (2010) UCI machine learning repository (online). http://archive.ics.uci.edu/ml. Accessed 4 Aug 2012
- Friedman J, Hastie T, Tibshirani R (1998) Additive logistic regression: a statistical view of boosting. Ann Stat 28(2):337–407CrossRefzbMATHMathSciNetGoogle Scholar
- Hall P, Racine J, LI QL (2004) Cross-validation and the estimation of conditional probability densities. J Am Stat Assoc 99(468):1015–1026CrossRefzbMATHMathSciNetGoogle Scholar
- Han J, Kamber M (2011) Data mining concepts and techniques. Morgan Kaufman Publishers, San FranciscozbMATHGoogle Scholar
- Hormozi H, Hormozi E, Nohooji HR (2012) The classification of the applicable machine learning methods in robots manipulators. Int J Machine Learn and Comput 2(5):560–563CrossRefGoogle Scholar
- Joachims T (1999) Making large-scale svm learning practical advances in kernel methods. In: Schölkopf B, Burges C, Smola A (eds) Support vector learning. MIT Press, CambridgeGoogle Scholar
- Kohonen T (2001) Self-organizing maps. Springer, BerlinCrossRefzbMATHGoogle Scholar
- Kotsiantis S, Zaharakis I, Pintelas P (2006) Machine learning: a review of classification and combining techniques. Artif Intell Rev 26:159–190CrossRefGoogle Scholar
- Kou G, Wu W (2014) An analytic hierarchy model for classification algorithms selection in credit risk analysis. Math probl Eng 2014:1–7. doi: 10.1155/2014/297563 CrossRefGoogle Scholar
- Kulkarni P (2012) Reinforcement and systemic machine learning for decision making, IEEE press series on systems science and engineering. Wiley, New JerseyGoogle Scholar
- Kwon O, Sim JM (2013) Effects of data set features on the performances of classification algorithms. Expert Syst Appl 40:1847–1857CrossRefGoogle Scholar
- Leo B (2001) Random forests. Machine Learn 45(1):5–32CrossRefGoogle Scholar
- Liu Q, Cao J (2010) A recurrent neural network based on projection operator for extended general variational inqualities. IEEE Trans Syst Man Cybern-Part B Cybern 40(3):928–938CrossRefGoogle Scholar
- Liu Q, Dang C, Cao J (2010a) A novel recurrent neural network with one neuron and finite-time convergence for kwinners-take-all operation. IEEE Transactions on neural networks 21(7):1140–1148CrossRefGoogle Scholar
- Liu Q, Cao J, Chen G (2010b) A novel recurrent neural network with finite-time convergence for linear programming Neural Comput. 22(11):2962–2978Google Scholar
- Mark H, Eibe F, Geoffrey H, Bernhard P, Peter R, Ian H (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1):10–18CrossRefGoogle Scholar
- Michie D, Spiegelhalter DJ, Taylor CC (1994) Machine learning, neural and statistical classification. Ellis Horwood Series in Artifcial Intelligence. Ellis Horwood, ChichesterGoogle Scholar
- Mitchell T (1997) Machine learning. Burr Ridge, Mcgraw HillzbMATHGoogle Scholar
- Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52:239–281CrossRefzbMATHGoogle Scholar
- Nakamura M, Otsuka A, Kimura H (2014) Automatic selection of classification algorithms for non-experts using meta-features. China-USA Business Review. 13(3):199–205Google Scholar
- Oduguwa V, Tiwari A, Roy R (2005) Evolutionary computing in manufacturing industry: an overview of recent applications. Applied soft computing 5(3):281–299CrossRefGoogle Scholar
- Peng W, Flach PA, Soares C, Brazdil P (2002) Improved data set characterisation for meta-learning. In: proceedings of the fifth international confernce on discovery science, LNAI 2534, pp 141–152Google Scholar
- Pfahringer B, Bensusan H, Giraud-Carrier C (2000) Tell me who can learn you and i can tell you who you are: Landmarking various learning algorithms. In: Proceedings of the 17th international conference on machine learning, 743–750Google Scholar
- Pinto F, Soares C, Mendes-Moreira (2014) A framework to decompose and develop meta features. In: Proceedings of Meta-learning and algorithm selection workshop at 21st European conference on artificial intelligence, Prague, Czech Republic, 32–36Google Scholar
- Pise N, Kulkarni P (2008) A survey of semi-supervised learning methods. In: Proceedings of international conference on computational intelligence and security, Suzhou, China, pp 30–34Google Scholar
- Polikar R (2006) Ensemble based system in decision making. IEEE Circuit Syst Mag 6(3):21–45CrossRefGoogle Scholar
- Preitl S, Precup R, Fodor J, Bede B (2006) Iterative feedback tuning in fuzzy control systems. Theory Appl Acta Polytech Hung 3(3):81–96Google Scholar
- Quinlan J (1993) C45 programs for machine learning. Morgan Kaufmann Publishers, San FranciscoGoogle Scholar
- Romero C, Olmo JL, Ventura S (2013) A meta-learning approach for recommending a subset of white-box classification algorithms for Moodle datasets. In: Proceedings of 6th international conference on educational data mining, Memphis, TN, USA, 268–271Google Scholar
- Rosales-Pérez A, Gonzalez JA, Coello CAC, Escalante HJ, Reyes-Garcia CA (2014) Multi-objective model type selection. Neurocomputing 146:83–94. doi: 10.1016/j.neucom.2014.05.077 CrossRefGoogle Scholar
- Saitta L, Neri F (1998) Learning in the ‘Real World’. Mach Learn 30(2–3):133–163CrossRefGoogle Scholar
- Sewell M (2009) Machine Learning, http://machinelearningmartinsewell.com/machine-learning.pdf. Accessed 18 Sept 2014
- Sleenman D, Rissakis M (1995) Consulatant-2: pre and post-processing of machine learning applications. Int J Hum Comput Stud 43(1):43–63CrossRefGoogle Scholar
- Smith-Miles K (2008) Cross-disciplinary perspectives on meta-learning for algorithm selection. ACM Comput Surv 4(1):6–25Google Scholar
- Sun Y (2007) Cost-sensitive boosting for classification of imbalanced data. PhD thesis, department of electrical and computer engineering, University of Waterloo, Ontario, CanadaGoogle Scholar
- Sutton R, Barto A (1998) Reinforcement learning: an introduction. MIT Press, CambridgeGoogle Scholar
- Tan P, Steinbach M, Kumar V (2013) Introduction to data mining, 2nd edn. Addison-Wesley, pp 792Google Scholar
- Valiant LG (1984) A theory of the learnable. Commun ACM 27(11):1134–1142CrossRefzbMATHGoogle Scholar
- Vilalta R, Drissi Y (2002) A perspective view and survey of meta-learning. J Artif Intell Rev 18(2):77–95CrossRefGoogle Scholar
- Witten IH, Frank E, Hall M (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann series in data management systems, Morgan Kaufmann Publishers, CAGoogle Scholar
- Wolpert D, Macready W (1997) No free lunch theorems for optimization. IEEE Trans Evolut Comput 1(1):67–82CrossRefGoogle Scholar
- Yegnanarayana B (2005) Artificial neural networks. New Delhi, PHIGoogle Scholar