Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation
Many comparative studies of classification methods use real datasets as a gauge of the methods' relative performance. Because these comparisons often yield inconclusive or limited results, it is often believed that a broader approach combining these studies would shed some light on the difficult question of how the methods perform. This paper describes such an attempt: we sampled the available literature and created a dataset of 5807 classification results. We show that one possible way to analyze the resulting data is an overall assessment of the classification methods, and we present methods for that particular aim. The merits and demerits of such an approach are discussed, and conclusions are drawn which may assist future research: we argue that the current state of the literature hardly allows large-scale investigations.
Keywords: Classification rules; Supervised classification; Neural networks; Tree classifiers; Logistic regression; Nearest neighbor method; Bradley-Terry; Meta-analysis; Data mining