REGAL-TC: a distributed genetic algorithm for concept learning based on REGAL and the treatment of counterexamples

  • Original Paper
Soft Computing

Abstract

This paper proposes improvements to REGAL, a concept learning system based on a distributed genetic algorithm that learns first-order logic, multi-modal concept descriptions for classification tasks. REGAL was a pioneering system and has been a source of inspiration for others. After studying the philosophy and experimental behaviour of REGAL, we propose several improvements, based principally on a new treatment of counterexamples, that build on its underlying strengths to achieve better accuracy, interpretability and scalability, so that the new system meets the main requirements for extracting classification rules in data mining. The experimental study shows valuable improvements over both the REGAL and G-Net distributed genetic algorithms, and interesting results compared with some representative state-of-the-art algorithms in this field.

Acknowledgments

This paper was supported in part by the Spanish Ministry of Education and Science under grant no. TIN2008-06681-C06-06 and the Andalusian government under grant no. P07-TIC-03179.

Author information

Corresponding author

Correspondence to Antonio Peregrin.

About this article

Cite this article

Lopez, L.I., Bardallo, J.M., De Vega, M.A. et al. REGAL-TC: a distributed genetic algorithm for concept learning based on REGAL and the treatment of counterexamples. Soft Comput 15, 1389–1403 (2011). https://doi.org/10.1007/s00500-010-0678-8

  • DOI: https://doi.org/10.1007/s00500-010-0678-8
