Genetic programming in the twenty-first century: a bibliometric and content-based analysis from both sides of the fence

Genetic Programming and Evolvable Machines

Abstract

In this work we present an extensive bibliometric and content-based analysis of the scientific literature about genetic programming in the twenty-first century. Our work has two key peculiarities. First, we revealed the topics emerging from the literature based on an unsupervised analysis of the textual content of titles and abstracts. Second, we executed all of our analyses twice, once on the papers published in the venues that are typical of the evolutionary computation research community and once on those published in all the other venues. This view from “both sides of the fence” allows us to gain broader and deeper insights into the actual contributions of our community.

Notes

  1. Formally, the twenty-first century started on January 1, 2001, rather than on January 1, 2000, which is the starting date of the so-called “2000s century.” We chose, however, to use “twenty-first century” because we think it is a more accessible locution.

  2. https://www.scopus.com.

  3. The publication venue is shown in the “source title” field of Scopus results.

  4. http://species-society.org, accessed in September 2018. SPECIES is a non-profit association that “aims to promote evolutionary algorithmic thinking within Europe and wider, and more generally to promote inspiration of parallel algorithms derived from natural processes”.

  5. The ranking is provided by the Computing Research and Education Association of Australasia (CORE) and takes one of the values A* (best), A, B, and C (worst); see http://www.core.edu.au/conference-portal.

  6. Both lemmatization and stemming were performed using the NLTK toolkit, https://www.nltk.org/.

  7. The countries of affiliation of a paper originally form a multiset, because more than one author can be affiliated with an institution in the same country. We considered the corresponding set, in which each country appears once.
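
The reduction described in note 7 can be illustrated with a minimal sketch. The affiliation data and the function names below are hypothetical and serve only to show the difference between the multiset of countries (one entry per author) and the corresponding set (one entry per country); this is not the authors' code.

```python
from collections import Counter

# Hypothetical affiliation countries of the authors of one paper
# (illustrative data only, not taken from the article).
author_countries = ["Italy", "Italy", "Portugal", "Italy"]


def countries_multiset(countries):
    """Return the multiset of countries: each country with its author count."""
    return Counter(countries)


def countries_set(countries):
    """Return the corresponding set: each country counted once per paper."""
    return set(countries)


multiset = countries_multiset(author_countries)
unique = countries_set(author_countries)
```

With the set, a paper written by three authors from one country contributes to that country once, so per-country paper counts are not inflated by the number of co-authors.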

Acknowledgements

This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia) under Project DSAIPA/DS/0022/2018 (GADgET). Mauro Castelli acknowledges the financial support from the Slovenian Research Agency (research core Funding No. P5-0410).

Corresponding author

Correspondence to Eric Medvet.

About this article

Cite this article

De Lorenzo, A., Bartoli, A., Castelli, M. et al. Genetic programming in the twenty-first century: a bibliometric and content-based analysis from both sides of the fence. Genet Program Evolvable Mach 21, 181–204 (2020). https://doi.org/10.1007/s10710-019-09363-3

  • DOI: https://doi.org/10.1007/s10710-019-09363-3