Lecture Notes in Computer Science, Volume 7250, 2012, pp. 390-406
Open Access: This content is freely available online to anyone, anywhere, at any time.

Contrast Mining from Interesting Subgroups

Abstract

Subgroup discovery methods find interesting subsets of objects of a given class. We propose to extend subgroup discovery with a second subgroup discovery step that finds interesting subgroups of objects specific to a class in one or more contrast classes. First, a subgroup discovery method is applied. Then, contrast classes of objects are defined by applying set-theoretic functions to the discovered subgroups of objects. Finally, subgroup discovery is performed again to find interesting subgroups within the two contrast classes, pointing out differences between the characteristics of the two. The approach has various application areas, one being biology, where finding interesting subgroups has been widely addressed for gene-expression data. There, our method finds enriched gene sets that are common to samples in a class (e.g., differential expression in virus-infected versus non-infected plants) and at the same time specific to one or more class attributes (e.g., time points or genotypes). We report experimental results on a time-series data set for virus-infected potato plants. The results present a comprehensive overview of the potato’s response to virus infection and reveal new research hypotheses for plant biologists.
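
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, annotations, and class memberships are invented toy data, subgroups are restricted to single descriptors, weighted relative accuracy (WRAcc) is assumed as the quality measure, and set difference is assumed as the set-theoretic function that defines the contrast classes.

```python
"""Toy sketch of two-step contrast mining; all data and names are illustrative."""

# Hypothetical annotations: object (gene) -> set of descriptors (gene sets).
ANNOTATIONS = {
    "g1": {"defense", "signaling"},
    "g2": {"defense"},
    "g3": {"defense", "photosynthesis"},
    "g4": {"photosynthesis"},
    "g5": {"photosynthesis"},
    "g6": {"signaling"},
    "g7": {"transport"},
    "g8": {"transport"},
}

# Hypothetical classes: genes differentially expressed at two time points.
CLASS_T1 = {"g1", "g2", "g3", "g6"}
CLASS_T2 = {"g3", "g4", "g5"}


def wracc(cover, positives, universe):
    """Weighted relative accuracy: coverage * (precision - base rate)."""
    if not cover:
        return 0.0
    n = len(universe)
    coverage = len(cover) / n
    precision = len(cover & positives) / len(cover)
    base_rate = len(positives) / n
    return coverage * (precision - base_rate)


def discover(annotations, positives, top_k=2):
    """Return up to top_k single-descriptor subgroups with positive WRAcc."""
    universe = set(annotations)
    descriptors = set().union(*annotations.values())
    scored = []
    for d in sorted(descriptors):
        cover = {obj for obj, ds in annotations.items() if d in ds}
        score = wracc(cover, positives, universe)
        if score > 0:
            scored.append((score, d, cover))
    # Best score first; ties are broken alphabetically by descriptor.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:top_k]


def covered(subgroups):
    """Union of the objects covered by a list of discovered subgroups."""
    return set().union(*(cover for _, _, cover in subgroups))


def contrast_mining(annotations, class_a, class_b, top_k=2):
    # Step 1: subgroup discovery for each class.
    sg_a = discover(annotations, class_a, top_k)
    sg_b = discover(annotations, class_b, top_k)
    # Step 2: contrast classes via set difference on the covered objects
    # (an assumed choice; other set-theoretic functions are possible).
    only_a = covered(sg_a) - covered(sg_b)
    only_b = covered(sg_b) - covered(sg_a)
    # Step 3: second subgroup discovery, now targeting each contrast class.
    return {
        "contrast_a": only_a,
        "contrast_b": only_b,
        "subgroups_a": [d for _, d, _ in discover(annotations, only_a, top_k)],
        "subgroups_b": [d for _, d, _ in discover(annotations, only_b, top_k)],
    }


result = contrast_mining(ANNOTATIONS, CLASS_T1, CLASS_T2)
```

On this toy data, the contrast class for the first time point is {g1, g2, g6} and the one for the second is {g4, g5}; the second discovery step then ranks the descriptors that characterize each contrast class against all objects.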