Data Mining and Knowledge Discovery, Volume 21, Issue 2, pp 259–276

Maximal exceptions with minimal descriptions

Open Access
Article

Abstract

We introduce a new approach to Exceptional Model Mining. Our algorithm, called EMDM, is an iterative method that alternates between Exception Maximisation and Description Minimisation. As a result, it finds maximally exceptional models with minimal descriptions. Exceptional Model Mining was recently introduced by Leman et al. (Exceptional model mining 1–16, 2008) as a generalisation of Subgroup Discovery. Instead of considering a single target attribute, it allows for multiple ‘model’ attributes on which models are fitted. If the model for a subgroup is substantially different from the model for the complete database, it is regarded as an exceptional model. To measure exceptionality, we propose two information-theoretic measures. One is based on the Kullback–Leibler divergence, the other on Krimp. We show how compression can be used for exception maximisation with these measures, and how classification can be used for description minimisation. Experiments show that our approach efficiently identifies subgroups that are both exceptional and interesting.
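As a rough illustration of the Kullback–Leibler-based idea mentioned above, the following minimal sketch scores a subgroup by how far the distribution of a single discrete model attribute inside the subgroup diverges from its distribution over the complete database. It is not the measure defined in the paper: the function names (`empirical_distribution`, `kl_divergence`, `exceptionality`), the restriction to one discrete attribute, and the Laplace smoothing parameter `alpha` are all assumptions made for this example.

```python
import math
from collections import Counter


def empirical_distribution(values, support, alpha=1.0):
    """Laplace-smoothed relative frequencies of `values` over a fixed support.

    `alpha` is an illustrative smoothing constant (an assumption, not taken
    from the paper); it keeps every probability strictly positive.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(support)
    return {v: (counts[v] + alpha) / total for v in support}


def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q[v] > 0 wherever p[v] > 0."""
    return sum(p[v] * math.log2(p[v] / q[v]) for v in p if p[v] > 0)


def exceptionality(database, subgroup_mask):
    """Hypothetical score: divergence of the subgroup from the full database.

    `database` is a list of values of one discrete model attribute;
    `subgroup_mask` is a parallel list of booleans selecting the subgroup.
    """
    support = sorted(set(database))
    subgroup = [x for x, keep in zip(database, subgroup_mask) if keep]
    p = empirical_distribution(subgroup, support)
    q = empirical_distribution(database, support)
    return kl_divergence(p, q)


if __name__ == "__main__":
    db = ["a", "a", "b", "b"]
    print(exceptionality(db, [True, True, True, True]))    # whole database
    print(exceptionality(db, [True, True, False, False]))  # only the "a" rows
```

A subgroup equal to the complete database scores zero under this sketch, and the score grows as the subgroup's distribution moves away from the overall one. An iterative scheme in the spirit of EMDM would alternate between choosing the records that maximise such a score and finding a short description that selects them.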

Keywords

Exceptional Model Mining, Subgroup Discovery, Information theory

Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. Department of Information and Computing Sciences, Universiteit Utrecht, Utrecht, The Netherlands
