Advanced Genetic Programming Based Machine Learning

Abstract

This paper presents a Genetic Programming based approach to solving classification problems. Classification is understood as the act of assigning an object to one of a set of categories, based on the object’s properties; classification algorithms are designed to learn a function that maps a vector of object features to one of several classes. This is done by analyzing a set of input-output examples (“training samples”) of the function. Here we present a method based on the theory of Genetic Algorithms and Genetic Programming that interprets classification problems as optimization problems: each presented instance of the classification problem is interpreted as an instance of an optimization problem, and a solution is found by a heuristic optimization algorithm. The major new aspects presented in this paper are advanced algorithmic concepts as well as suitable genetic operators for this problem class (mainly the creation of new hypotheses by merging already existing ones, and their detailed evaluation). The experimental part of the paper documents the results produced using new hybrid variants of Genetic Algorithms, as well as the parameter settings that were investigated. Graphical analysis is done using a novel multiclass classifier analysis concept based on the theory of Receiver Operating Characteristic curves.
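The idea of treating classification as optimization can be illustrated with a minimal sketch. It is not the method of the paper: instead of Genetic Programming with tree-shaped hypotheses and merging operators, it uses a deliberately simple mutation-only hill climber over a thresholded weighted sum. The training data, the function names (`classify`, `misclassifications`, `mutate`, `search`) and all parameter values are assumptions made for illustration only.

```python
import random

# Illustrative training samples: (feature vector, class label).
# These values are made up for the sketch, not taken from the paper.
SAMPLES = [((0.1, 0.2), 0), ((0.3, 0.1), 0), ((0.4, 0.4), 0),
           ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.8, 0.6), 1)]


def classify(weights, threshold, features):
    """Map a feature vector to class 0 or 1 by thresholding a weighted sum."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score >= threshold else 0


def misclassifications(candidate, samples):
    """Objective to minimise: number of training samples labelled wrongly."""
    weights, threshold = candidate
    return sum(classify(weights, threshold, x) != y for x, y in samples)


def mutate(candidate, rng, step=0.2):
    """Create a new hypothesis by perturbing every parameter slightly."""
    weights, threshold = candidate
    new_weights = tuple(w + rng.uniform(-step, step) for w in weights)
    return new_weights, threshold + rng.uniform(-step, step)


def search(samples, generations=200, seed=0):
    """Heuristic optimisation: keep a mutated hypothesis if it is no worse."""
    rng = random.Random(seed)
    best = ((0.0, 0.0), 0.5)  # hypothetical starting hypothesis
    best_err = misclassifications(best, samples)
    for _ in range(generations):
        child = mutate(best, rng)
        err = misclassifications(child, samples)
        if err <= best_err:
            best, best_err = child, err
    return best, best_err


if __name__ == "__main__":
    hypothesis, errors = search(SAMPLES)
    print("best hypothesis:", hypothesis, "training errors:", errors)
```

The point of the sketch is the separation of roles: `misclassifications` turns a classification instance into an objective function, and `search` is an interchangeable heuristic optimizer that only sees that objective. A Genetic Programming variant would replace the fixed weighted sum with evolved expressions and add crossover between hypotheses.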

Author information

Correspondence to Stephan Winkler.

Additional information

The work described in this paper was done within the Translational Research Project L282 “GP-Based Techniques for the Design of Virtual Sensors” sponsored by the Austrian Science Fund (FWF).

About this article

Cite this article

Winkler, S., Affenzeller, M. & Wagner, S. Advanced Genetic Programming Based Machine Learning. J Math Model Algor 6, 455–480 (2007). https://doi.org/10.1007/s10852-007-9065-6

Keywords

  • Evolutionary algorithms
  • Genetic programming
  • Data mining

Mathematics Subject Classifications (2000)

  • 68T05
  • 68T20