Using Ontologies to Express Prior Knowledge for Genetic Programming

  • Stefan Prieschl
  • Dominic Girardi
  • Gabriel Kronberger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11713)


Ontologies are useful for modeling domains and can capture expert knowledge about a system. Genetic programming can identify statistical relationships or models from data. Combining expert knowledge with statistical rules identified solely from data is necessary in application domains where data is scarce but a large body of expert knowledge exists.

We therefore study whether the performance of genetic programming can be improved by incorporating prior knowledge from an ontology. In particular, we supply this prior knowledge to genetic programming as additional input features.
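
As a minimal sketch of what "prior knowledge as additional features" could look like in practice, the Python snippet below appends derived columns to a data set before it would be handed to a symbolic regression run. The abstract does not specify how features are extracted from the ontology, so the column names, the `DERIVED_FEATURES` mapping, and the `add_prior_features` helper are hypothetical illustrations and not the authors' implementation.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical prior knowledge: suppose an ontology states that x1 and x2
# interact and that x1 has a quadratic influence. Each entry maps a new
# feature name to a function that computes it from the raw variables.
DERIVED_FEATURES: Dict[str, Callable[[Row], float]] = {
    "x1_times_x2": lambda r: r["x1"] * r["x2"],
    "x1_squared": lambda r: r["x1"] ** 2,
}


def add_prior_features(
    rows: List[Row], derived: Dict[str, Callable[[Row], float]]
) -> List[Row]:
    """Return copies of the rows extended with the derived feature columns."""
    extended = []
    for row in rows:
        new_row = dict(row)  # copy, so the original data stays untouched
        for name, compute in derived.items():
            new_row[name] = compute(row)
        extended.append(new_row)
    return extended


# Placeholder data for illustration only (not taken from the paper).
data = [
    {"x1": 1.0, "x2": 2.0, "y": 3.0},
    {"x1": 2.0, "x2": 0.5, "y": 5.0},
]
extended = add_prior_features(data, DERIVED_FEATURES)
```

The search algorithm itself is unchanged in this sketch; it only sees a wider table in which the derived columns are available as ready-made terminals.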

The approach is tested on six benchmark data sets, where we compare the computational effort required to find an acceptable model with and without the additional features. The results show that additional features gathered from an ontology improve the performance of tree-based GP: the probability of finding an acceptable solution within a fixed computational budget increases. For noisy data sets, we observed the same effect as for data sets without noise.
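
The comparison described above can be illustrated with a short Python sketch that estimates, from repeated runs, the fraction that reached an acceptable model within a fixed budget. The function name `success_probability`, the unit of the budget (model evaluations), and all numbers are assumptions made for illustration; they are not results or definitions from the paper.

```python
from typing import List, Optional


def success_probability(evals_to_success: List[Optional[int]], budget: int) -> float:
    """Fraction of runs that found an acceptable model within `budget` evaluations.

    Each entry is the number of evaluations a run needed, or None if the
    run never found an acceptable model.
    """
    if not evals_to_success:
        raise ValueError("at least one run is required")
    hits = sum(1 for e in evals_to_success if e is not None and e <= budget)
    return hits / len(evals_to_success)


# Made-up placeholder values, NOT measurements from the paper.
baseline_runs = [None, 80_000, None, 120_000, 45_000]
with_feature_runs = [30_000, 60_000, None, 20_000, 45_000]

budget = 100_000
p_baseline = success_probability(baseline_runs, budget)
p_with_features = success_probability(with_feature_runs, budget)
```

Comparing the two estimates at the same budget corresponds to the "with and without additional features" comparison; a real study would use many more runs per configuration.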


Keywords: Supervised learning, Ontologies, Domain knowledge, Genetic programming, Symbolic regression



Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  • Stefan Prieschl (1, corresponding author)
  • Dominic Girardi (1)
  • Gabriel Kronberger (2)

  1. RISC Software GmbH, Johannes Kepler University, Hagenberg, Austria
  2. Josef Ressel Centre for Symbolic Regression, University of Applied Sciences Upper Austria, Hagenberg, Austria
