Machine Learning, Volume 54, Issue 3, pp 275–312

On Data and Algorithms: Understanding Inductive Performance

  • Alexandros Kalousis
  • João Gama
  • Melanie Hilario

DOI: 10.1023/B:MACH.0000015882.38031.85

Cite this article as:
Kalousis, A., Gama, J. & Hilario, M. Machine Learning (2004) 54: 275. doi:10.1023/B:MACH.0000015882.38031.85

Abstract

In this paper we address two symmetrical issues: the discovery of similarities among classification algorithms and among datasets. Both are approached on the basis of error measures, which we use to define the error correlation between two algorithms and to determine the relative performance of a list of algorithms. We use the first to discover similarities between learners, and both to discover similarities between datasets. The dataset similarities sketch maps of the dataset space. Regions within each map exhibit specific patterns of error correlation or relative performance. To understand the factors that determine these regions, we describe them using simple characteristics of the datasets: each region is described in terms of the distributions of dataset characteristics within it.

Keywords

classification, meta-learning, error correlation, classifier ranking, clustering datasets, clustering classifiers

Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Alexandros Kalousis (1)
  • João Gama (2)
  • Melanie Hilario (1)

  1. University of Geneva, Computer Science Department, Geneva 4, Switzerland
  2. LIACC, FEP, University of Porto, Porto, Portugal