Soft Computing

Volume 13, Issue 3, pp 307–318

KEEL: a software tool to assess evolutionary algorithms for data mining problems

  • J. Alcalá-Fdez
  • L. Sánchez
  • S. García
  • M. J. del Jesus
  • S. Ventura
  • J. M. Garrell
  • J. Otero
  • C. Romero
  • J. Bacardit
  • V. M. Rivas
  • J. C. Fernández
  • F. Herrera
Focus

Abstract

This paper introduces KEEL, a software tool to assess evolutionary algorithms for Data Mining problems of various kinds, including regression, classification, and unsupervised learning. It includes evolutionary learning algorithms based on different approaches (Pittsburgh, Michigan, and iterative rule learning, IRL) and integrates evolutionary learning techniques with different pre-processing techniques, which allows a complete analysis of any learning model in comparison with existing software tools. Moreover, KEEL has been designed with a double goal: research and education.

Keywords

Computer-based education; Data mining; Evolutionary computation; Experimental design; Graphical programming; Java; Knowledge extraction; Machine learning

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • J. Alcalá-Fdez (1)
  • L. Sánchez (2)
  • S. García (1)
  • M. J. del Jesus (3)
  • S. Ventura (4)
  • J. M. Garrell (5)
  • J. Otero (2)
  • C. Romero (4)
  • J. Bacardit (6)
  • V. M. Rivas (3)
  • J. C. Fernández (4)
  • F. Herrera (1)
  1. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
  2. Department of Computer Science, University of Oviedo, Gijón, Spain
  3. Department of Computer Science, University of Jaén, Jaén, Spain
  4. Department of Computer Sciences and Numerical Analysis, University of Córdoba, Córdoba, Spain
  5. Department of Computer Science, University Ramon Llull, Barcelona, Spain
  6. Department of Computer Science and Information Technology, University of Nottingham, Nottingham, UK
