Applied Intelligence, Volume 43, Issue 4, pp 892–912

An improved data characterization method and its application in classification algorithm recommendation


DOI: 10.1007/s10489-015-0689-3

Cite this article as:
Wang, G., Song, Q. & Zhu, X. Appl Intell (2015) 43: 892. doi:10.1007/s10489-015-0689-3


Selecting appropriate classification algorithms for a given data set is important and useful in practice. One of the most challenging issues in algorithm selection is how to characterize different data sets. In earlier work, we characterized a data set by extracting its structural information. Although these characteristics work well for identifying similar data sets and recommending appropriate classification algorithms, the extraction method can only be applied to binary data sets and its performance is limited. This paper therefore proposes an improved data set characterization method that addresses both problems. To evaluate the effectiveness of the improved method for algorithm recommendation, the unsupervised learning method EM is employed to build the algorithm recommendation model. Extensive experiments with 17 different types of classification algorithms are conducted on 84 public UCI data sets; the results demonstrate the effectiveness of the proposed method.
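To make the recommendation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each data set has already been reduced to a numeric characteristic vector, clusters those vectors with EM on a spherical Gaussian mixture (a simplifying assumption), and recommends the algorithm that most often performed best within the cluster nearest to a new data set. The function names `em_cluster` and `recommend`, the toy vectors, and the algorithm labels are all hypothetical; the abstract does not specify the features or the model details.

```python
import math
from collections import Counter

def em_cluster(points, k, iters=50):
    """Fit a spherical Gaussian mixture with EM (assumed model, for illustration).
    Returns the component means and a hard cluster label for each point."""
    dim, n = len(points[0]), len(points)
    # Deterministic initialisation: pick initial means spread across the input order.
    means = [list(points[(i * (n - 1)) // max(k - 1, 1)]) for i in range(k)]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each point (log-space for stability).
        resp = []
        for p in points:
            logs = []
            for j in range(k):
                sq = sum((a - b) ** 2 for a, b in zip(p, means[j]))
                logs.append(math.log(weights[j])
                            - 0.5 * dim * math.log(variances[j])
                            - sq / (2.0 * variances[j]))
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            means[j] = [sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                        for d in range(dim)]
            sq = sum(r[j] * sum((a - b) ** 2 for a, b in zip(p, means[j]))
                     for r, p in zip(resp, points))
            variances[j] = max(sq / (nj * dim), 1e-6)  # floor avoids zero variance
            weights[j] = nj / n
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels

def recommend(new_point, means, labels, best_algorithms):
    """Recommend the most frequent best algorithm in the cluster nearest to new_point."""
    nearest = min(range(len(means)),
                  key=lambda j: sum((a - b) ** 2 for a, b in zip(new_point, means[j])))
    votes = Counter(alg for alg, lab in zip(best_algorithms, labels) if lab == nearest)
    if not votes:  # empty cluster: fall back to the overall most frequent algorithm
        votes = Counter(best_algorithms)
    return votes.most_common(1)[0][0]

# Toy characteristic vectors for six historical data sets (made-up values).
vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
# Hypothetical best-performing algorithm recorded for each historical data set.
best = ["naive_bayes", "naive_bayes", "knn", "svm", "svm", "svm"]

means, labels = em_cluster(vectors, k=2)
suggestion = recommend((0.1, 0.1), means, labels, best)
```

In this sketch a new data set inherits the recommendation of its nearest cluster, so the quality of the recommendation depends entirely on how well the characteristic vectors separate data sets that favour different algorithms.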


Classification algorithm recommendation; Classification; Data set characteristics extraction

Funding information

Funder Name: China Postdoctoral Science Foundation
Grant Number: 2014M562417

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
