Data Mining and Knowledge Discovery, Volume 28, Issue 5, pp 1503–1529

A peek into the black box: exploring classifiers by randomization

  • Andreas Henelius
  • Kai Puolamäki
  • Henrik Boström
  • Lars Asker
  • Panagiotis Papapetrou

DOI: 10.1007/s10618-014-0368-8

Cite this article as:
Henelius, A., Puolamäki, K., Boström, H. et al. Data Min Knowl Disc (2014) 28: 1503. doi:10.1007/s10618-014-0368-8

Abstract

Classifiers are often opaque and cannot easily be inspected to understand which factors are important. We propose an efficient iterative algorithm to find the attributes and dependencies used by any classifier when making predictions. The performance and utility of the algorithm are demonstrated on two synthetic and 26 real-world datasets, using 15 commonly used learning algorithms to generate the classifiers. The empirical investigation shows that the novel algorithm is indeed able to find groupings of interacting attributes exploited by the different classifiers. These groupings make it possible to find similarities among classifiers for a single dataset and to determine the extent to which different classifiers exploit such interactions in general.
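
To make the idea of exploring a classifier by randomization concrete, the following is a minimal sketch and not the algorithm from the article. It assumes a within-class permutation scheme and a "fidelity" score (the fraction of predictions left unchanged after shuffling); both are illustrative choices that the abstract does not specify. The toy `classifier`, the helper names, and the data are all hypothetical.

```python
import random

def classifier(row):
    # Hypothetical black box: XOR of attributes 0 and 1; attribute 2 is ignored.
    return row[0] ^ row[1]

def permute_within_class(data, labels, groups, rng):
    """Shuffle each attribute group among rows that share the same label.

    Attributes in one group move together; separate groups are shuffled
    independently, which breaks any dependency between them.
    """
    out = [list(r) for r in data]
    for label in set(labels):
        idx = [i for i, y in enumerate(labels) if y == label]
        for group in groups:
            src = idx[:]
            rng.shuffle(src)
            for dst_i, src_i in zip(idx, src):
                for a in group:
                    out[dst_i][a] = data[src_i][a]
    return out

def fidelity(data, groups, rng):
    """Fraction of rows whose prediction survives the randomization."""
    base = [classifier(r) for r in data]
    shuffled = permute_within_class(data, base, groups, rng)
    same = sum(classifier(r) == b for r, b in zip(shuffled, base))
    return same / len(data)

rng = random.Random(0)
data = [[rng.randint(0, 1) for _ in range(3)] for _ in range(2000)]

# Attributes 0 and 1 shuffled together: their interaction is preserved.
fid_joint = fidelity(data, [[0, 1], [2]], rng)
# Every attribute shuffled on its own: the interaction is destroyed.
fid_split = fidelity(data, [[0], [1], [2]], rng)
print(fid_joint, fid_split)
```

In this toy setting, keeping attributes 0 and 1 in one group leaves every prediction unchanged, while splitting them lowers fidelity sharply. A drop of that kind is the sort of signal a randomization-based method could use to decide that two attributes belong in the same grouping.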

Keywords

Classifiers, Randomization

Copyright information

© The Author(s) 2014

Authors and Affiliations

  • Andreas Henelius (1)
  • Kai Puolamäki (1)
  • Henrik Boström (2)
  • Lars Asker (2)
  • Panagiotis Papapetrou (2)
  1. Finnish Institute of Occupational Health, Helsinki, Finland
  2. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden
