Discovering Subgroups by Means of Genetic Programming

  • José M. Luna
  • José Raúl Romero
  • Cristóbal Romero
  • Sebastián Ventura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7831)

Abstract

This paper deals with the problem of discovering subgroups in data by means of a grammar-guided genetic programming algorithm, where each subgroup comprises a set of related patterns. The proposed algorithm combines the requirement of discovering comprehensible rules with the ability to mine expressive and flexible solutions, thanks to the use of a context-free grammar. A major characteristic of this algorithm is the small number of parameters it requires, which makes the mining process easy for end users.
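As a rough illustration of the idea, the following minimal sketch shows how a context-free grammar can constrain the rules that a grammar-guided approach generates, and how a candidate rule can be scored against a target class. It is not the algorithm from the paper: the grammar, the toy records, the function names (`derive`, `covers`, `support_confidence`) and the choice of support and confidence as quality measures are all assumptions made for this example, and the evolutionary loop (selection, crossover, mutation) is omitted.

```python
import random

# Hypothetical grammar (an assumption, not the grammar used in the paper):
# an antecedent is a conjunction of one or more attribute=value conditions.
GRAMMAR = {
    "<antecedent>": [["<condition>"], ["<condition>", "AND", "<antecedent>"]],
    "<condition>": [["outlook=sunny"], ["outlook=rainy"], ["windy=yes"], ["windy=no"]],
}

# Hypothetical toy data set; "play" acts as the target attribute.
RECORDS = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]


def derive(symbol, rng, depth=0, max_depth=3):
    """Expand a grammar symbol into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # At the depth limit, keep only the shortest productions so that
        # the recursive <antecedent> rule cannot be chosen again.
        shortest = min(len(p) for p in productions)
        productions = [p for p in productions if len(p) == shortest]
    tokens = []
    for child in rng.choice(productions):
        tokens.extend(derive(child, rng, depth + 1, max_depth))
    return tokens


def covers(tokens, record):
    """Return True if the record satisfies every condition in the antecedent."""
    conditions = (t.split("=") for t in tokens if t != "AND")
    return all(record[attr] == value for attr, value in conditions)


def support_confidence(tokens, records, target_attr, target_value):
    """Score a rule 'antecedent -> target_attr=target_value' on the records."""
    covered = [r for r in records if covers(tokens, r)]
    hits = [r for r in covered if r[target_attr] == target_value]
    support = len(hits) / len(records)
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence


if __name__ == "__main__":
    rng = random.Random(0)
    rule = derive("<antecedent>", rng)
    print(" ".join(rule), "->", support_confidence(rule, RECORDS, "play", "yes"))
```

Because every individual is derived from the grammar, each generated rule is syntactically valid by construction. A grammar this simple can still produce contradictory antecedents (for example, two different values of the same attribute), which cover no records and therefore score zero.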

The proposed algorithm is compared with existing evolutionary algorithms for subgroup discovery. The experimental results show that it discovers comprehensible subgroups and behaves better than the other algorithms. These conclusions are reinforced by a series of non-parametric tests.

Keywords

Data mining, subgroup discovery, genetic programming, grammar guided genetic programming

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • José M. Luna (1)
  • José Raúl Romero (1)
  • Cristóbal Romero (1)
  • Sebastián Ventura (1)
  1. Dept. of Computer Science and Numerical Analysis, University of Cordoba, Cordoba, Spain
