Automated Data Pre-processing via Meta-learning

  • Besim Bilalli
  • Alberto Abelló
  • Tomàs Aluja-Banet
  • Robert Wrembel
Conference paper

DOI: 10.1007/978-3-319-45547-1_16

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9893)
Cite this paper as:
Bilalli B., Abelló A., Aluja-Banet T., Wrembel R. (2016) Automated Data Pre-processing via Meta-learning. In: Bellatreche L., Pastor Ó., Almendros Jiménez J., Aït-Ameur Y. (eds) Model and Data Engineering. MEDI 2016. Lecture Notes in Computer Science, vol 9893. Springer, Cham

Abstract

A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. In practice, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, the number of alternatives is staggeringly large, and inexperienced users become overwhelmed. We show that this problem can be addressed by an automated approach that leverages ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on the respective dataset. Our approach will help non-expert users to identify more effectively the transformations appropriate to their applications, and hence to achieve improved results.
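
The abstract does not say which meta-features or meta-learner the authors use, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes two hypothetical meta-features (fraction of continuous attributes, fraction of missing values), a hand-written meta-knowledge base whose entries are invented placeholders rather than results, and a 1-nearest-neighbour lookup as the predictor. The names `meta_features`, `recommend`, `META_KNOWLEDGE`, and the algorithm and transformation labels are all illustrative assumptions.

```python
from math import dist


def meta_features(rows):
    """Describe a dataset (a list of equal-length rows) by two toy meta-features.

    Assumption: a column counts as continuous when every non-missing value
    is a float; None marks a missing value.
    """
    n_cols = len(rows[0])
    n_cells = len(rows) * n_cols
    missing = sum(v is None for r in rows for v in r)
    continuous = sum(
        all(r[j] is None or isinstance(r[j], float) for r in rows)
        for j in range(n_cols)
    )
    return (continuous / n_cols, missing / n_cells)


# Invented placeholder entries, NOT results from the paper.
# For each (hypothetical) algorithm: (meta-features of a past dataset,
# transformation applied, whether the algorithm's result improved).
META_KNOWLEDGE = {
    "naive_bayes": [
        ((1.0, 0.0), "discretize", True),
        ((0.0, 0.0), "discretize", False),
        ((0.9, 0.3), "impute_missing", True),
        ((0.8, 0.0), "impute_missing", False),
    ],
}


def recommend(algorithm, rows, knowledge=META_KNOWLEDGE):
    """Return the transformations predicted to help `algorithm` on `rows`.

    For each transformation, the prediction is copied from the past dataset
    whose meta-features are closest (Euclidean distance) to the new dataset.
    """
    target = meta_features(rows)
    cases = knowledge[algorithm]
    recommended = []
    for transformation in sorted({t for _, t, _ in cases}):
        candidates = [(mf, ok) for mf, t, ok in cases if t == transformation]
        _, improved = min(candidates, key=lambda c: dist(c[0], target))
        if improved:
            recommended.append(transformation)
    return recommended


numeric_with_gaps = [[1.0, 2.0], [3.0, None]]
categorical = [["a", "b"], ["c", "d"]]
print(recommend("naive_bayes", numeric_with_gaps))
print(recommend("naive_bayes", categorical))
```

In this toy setting the numeric dataset with a missing value is closest to past cases where both transformations helped, while the purely categorical dataset is closest to cases where neither did. A realistic system would need far richer meta-features and a meta-knowledge base built from actual experiments.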

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Besim Bilalli
    • 1
  • Alberto Abelló
    • 1
  • Tomàs Aluja-Banet
    • 1
  • Robert Wrembel
    • 2
  1. Universitat Politécnica de Catalunya, Barcelona, Spain
  2. Poznan University of Technology, Poznan, Poland
