Can Genetic Programming Do Manifold Learning Too?

  • Andrew Lensen
  • Bing Xue
  • Mengjie Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11451)


Exploratory data analysis is a fundamental aspect of knowledge discovery that aims to find the main characteristics of a dataset. Dimensionality reduction, such as manifold learning, is often used to reduce the number of features in a dataset to a level manageable for human interpretation. However, most manifold learning techniques explain nothing about the original features or the true characteristics of a dataset. In this paper, we propose a genetic programming approach to manifold learning called GP-MaL, which evolves functional mappings from a high-dimensional space to a lower-dimensional space in the form of interpretable trees. We show that GP-MaL is competitive with existing manifold learning algorithms, while producing models that can be interpreted and re-used on unseen data. A number of promising directions for future research are identified in the process.
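As a rough illustration of the idea, each dimension of the low-dimensional embedding can be produced by one tree evaluated over the original features. The sketch below is illustrative only: the function set, the trees, and the `evaluate` helper are hypothetical stand-ins, not the evolved trees or fitness machinery GP-MaL actually uses.

```python
import math

def evaluate(tree, x):
    """Recursively evaluate a GP tree on a feature vector x.

    A tree is either ('f', i) -- a terminal reading input feature i --
    or (op, child, ...) with op drawn from a small hypothetical
    function set of arithmetic and squashing operators.
    """
    ops = {
        'add': lambda a, b: a + b,
        'sub': lambda a, b: a - b,
        'mul': lambda a, b: a * b,
        'sig': lambda a: 1.0 / (1.0 + math.exp(-a)),  # sigmoid nonlinearity
    }
    if tree[0] == 'f':
        return x[tree[1]]
    return ops[tree[0]](*(evaluate(child, x) for child in tree[1:]))

# A 2-D embedding of a 4-D point: one interpretable tree per output
# dimension. These hand-written trees stand in for evolved ones.
trees = [
    ('add', ('f', 0), ('mul', ('f', 1), ('f', 2))),  # y0 = x0 + x1 * x2
    ('sig', ('sub', ('f', 3), ('f', 0))),            # y1 = sigmoid(x3 - x0)
]
x = [1.0, 2.0, 0.5, 3.0]
embedding = [evaluate(t, x) for t in trees]
```

Because each output dimension is an explicit symbolic expression over the original features, the mapping can be read off directly and re-applied to unseen data, unlike embeddings such as t-SNE that assign coordinates only to the training points.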


Keywords: Manifold learning · Genetic programming · Dimensionality reduction · Feature construction



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
