Discovering Subgroups by Means of Genetic Programming
This paper addresses the problem of discovering subgroups in data by means of a grammar-guided genetic programming algorithm, where each subgroup comprises a set of related patterns. The proposed algorithm combines the requirement of discovering comprehensible rules with the ability to mine expressive and flexible solutions, thanks to the use of a context-free grammar. A major characteristic of this algorithm is the small number of parameters it requires, which makes the mining process easy for end users.
The proposed algorithm is compared with existing evolutionary algorithms for subgroup discovery. The experimental results show that it discovers comprehensible subgroups and behaves better than the other algorithms. These conclusions are reinforced by a series of non-parametric tests.
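As an illustration of how a context-free grammar can constrain the rules that a genetic programming algorithm explores, the following minimal sketch derives a candidate rule from a toy grammar and scores it against a target class. It is not the algorithm proposed in the paper: the dataset, the grammar, the names `derive` and `evaluate`, and the choice of support and confidence as quality measures are all assumptions made for this example.

```python
import random

# Hypothetical toy dataset: each record maps attribute -> value.
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]
TARGET = ("play", "yes")  # assumed target class for the subgroups

# Hypothetical context-free grammar: nonterminal -> list of productions.
GRAMMAR = {
    "<rule>": [["<cond>"], ["<cond>", "AND", "<rule>"]],
    "<cond>": [["outlook=sunny"], ["outlook=rainy"], ["windy=yes"], ["windy=no"]],
}


def derive(symbol, rng, depth=0, max_depth=3):
    """Expand a grammar symbol into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Keep only the shortest productions so the derivation terminates.
        shortest = min(len(p) for p in productions)
        productions = [p for p in productions if len(p) == shortest]
    tokens = []
    for child in rng.choice(productions):
        tokens.extend(derive(child, rng, depth + 1, max_depth))
    return tokens


def evaluate(tokens, data=DATA, target=TARGET):
    """Return (support, confidence) of the rule 'IF tokens THEN target'."""
    conds = [tuple(t.split("=")) for t in tokens if t != "AND"]
    covered = [r for r in data if all(r[a] == v for a, v in conds)]
    hits = [r for r in covered if r[target[0]] == target[1]]
    support = len(hits) / len(data)
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence


if __name__ == "__main__":
    rng = random.Random(0)
    rule = derive("<rule>", rng)
    print(" ".join(rule), "->", evaluate(rule))
```

Because every candidate is produced by expanding the grammar, each individual is a syntactically valid and readable conjunction of conditions. In a full evolutionary algorithm, measures such as the ones returned by `evaluate` would feed a fitness function, and crossover and mutation would operate on the derivations rather than on raw strings.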
Keywords: Data mining, subgroup discovery, genetic programming, grammar guided genetic programming