Accelerating GPU-based Evolutionary Induction of Decision Trees - Fitness Evaluation Reuse

  • Krzysztof Jurczuk
  • Marcin Czajkowski
  • Marek Kretowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12043)

Abstract

The rapid development of new technologies and parallel frameworks offers a chance to overcome the slowness of evolutionary induction of decision trees (DTs). This global approach, which searches for the tree structure and the tests simultaneously, is an emerging alternative to greedy top-down solutions. However, to be applied efficiently to big data mining, both technological and algorithmic possibilities need to be fully exploited. This paper shows how reusing information from previously evaluated individuals can further accelerate GPU-based evolutionary induction of DTs on large-scale datasets. Noting that some trees, or parts of them, may reappear during the evolutionary search, we have created a so-called repository of trees (split between the GPU and the CPU). Experimental evaluation is carried out on the existing Global Decision Tree system, where the fitness calculations are delegated to the GPU while the core evolution runs sequentially on the CPU. The results demonstrate that reusing information about trees from the repository (classification errors, objects’ locations, etc.) can accelerate the original GPU-based solution. The gain is especially visible on large-scale data, where the cost of evaluating trees exceeds the cost of storing and searching the repository.
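
The reuse idea can be illustrated with a minimal sketch. It is not the authors' implementation: the repository described above is split between the GPU and the CPU and also stores objects' locations, whereas this CPU-only Python example caches just the classification error count of whole trees. The names `Node`, `count_errors`, and `TreeRepository` are hypothetical, and the univariate test `feature <= threshold` is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable tree node.

    Internal nodes test `x[feature] <= threshold`; leaves carry only `label`.
    A frozen dataclass is hashable, so structurally equal trees share one key.
    """
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def predict(node: Node, x: Sequence[float]) -> int:
    # Walk from the root to a leaf, following the test outcome at each node.
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def count_errors(tree: Node, X, y) -> int:
    # Stand-in for the expensive fitness evaluation (done on the GPU in the paper).
    return sum(predict(tree, x) != target for x, target in zip(X, y))


class TreeRepository:
    """Caches error counts so that a reappearing tree is not evaluated again."""

    def __init__(self, evaluate: Callable[[Node], int]) -> None:
        self._evaluate = evaluate
        self._errors: Dict[Node, int] = {}
        self.hits = 0
        self.misses = 0

    def errors(self, tree: Node) -> int:
        if tree in self._errors:
            self.hits += 1
        else:
            self.misses += 1
            self._errors[tree] = self._evaluate(tree)
        return self._errors[tree]


# Toy data: one feature, two classes separated at 2.5.
X = [(1.0,), (2.0,), (3.0,), (4.0,)]
y = [0, 0, 1, 1]

repo = TreeRepository(lambda t: count_errors(t, X, y))
tree = Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))
first = repo.errors(tree)  # evaluated and stored
# A structurally identical tree built later is served from the repository.
again = repo.errors(
    Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))
)
```

In this sketch every lookup hashes the whole tree, so reuse pays off only when evaluation costs more than storing and searching the repository, which matches the trade-off noted in the abstract for large-scale data.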

Keywords

  • Evolutionary algorithms
  • Decision trees
  • Big data mining
  • Graphics processing unit (GPU)
  • CUDA

Notes

Acknowledgments

This work was supported by grant S/WI/2/18 from Bialystok University of Technology, funded by the Polish Ministry of Science and Higher Education.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
