Applied Intelligence, Volume 43, Issue 4, pp 892–912

An improved data characterization method and its application in classification algorithm recommendation

Abstract

Selecting an appropriate classification algorithm for a given data set is important and useful in practice. One of the most challenging issues in algorithm selection is how to characterize different data sets. In recent work, we characterized a data set by extracting its structural information. Although such characteristics work well for identifying similar data sets and recommending appropriate classification algorithms, the extraction method applies only to binary data sets and its performance is limited. In this paper, an improved data set characterization method is therefore proposed to address these problems. To evaluate the effectiveness of the improved method for algorithm recommendation, the unsupervised learning method EM (expectation-maximization) is employed to build the algorithm recommendation model. Extensive experiments with 17 different types of classification algorithms are conducted on 84 public UCI data sets; the results demonstrate the effectiveness of the proposed method.
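To make the pipeline described above concrete, the following is a minimal sketch of an EM-based recommendation model: characterize each data set by a vector of meta-features, cluster the data sets with EM (here via a Gaussian mixture), and recommend the algorithms that perform best on the other data sets in the same cluster. All variable names, the toy data, and the cluster count are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only: cluster data sets by meta-features with EM,
    # then recommend algorithms that do well on similar data sets.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    # meta_features[i]: characterization vector of data set i (random stand-ins here)
    n_datasets, n_features, n_algorithms = 84, 10, 17
    meta_features = rng.normal(size=(n_datasets, n_features))
    # accuracy[i, j]: estimated accuracy of algorithm j on data set i (toy values)
    accuracy = rng.uniform(0.5, 1.0, size=(n_datasets, n_algorithms))

    # Fit a Gaussian mixture via EM and assign each data set to a cluster.
    em = GaussianMixture(n_components=5, random_state=0).fit(meta_features)
    clusters = em.predict(meta_features)

    def recommend(new_features, top_k=3):
        """Recommend algorithms for a new data set from its meta-features."""
        c = em.predict(new_features.reshape(1, -1))[0]
        members = clusters == c
        # Average each algorithm's performance over data sets in the same cluster
        # and return the indices of the top_k best-performing algorithms.
        mean_acc = accuracy[members].mean(axis=0)
        return np.argsort(mean_acc)[::-1][:top_k]

    print(recommend(rng.normal(size=n_features)))

With real meta-features and measured algorithm accuracies in place of the random stand-ins, the same structure yields a ranked recommendation for an unseen data set.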

Keywords

Classification algorithm recommendation · Classification · Data set characteristics extraction

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

1. Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
