Abstract
The rapid development of new technologies and parallel frameworks offers a chance to overcome the slowness of evolutionary induction of decision trees (DTs). This global approach, which searches for the tree structure and the tests simultaneously, is an emerging alternative to greedy top-down solutions. However, to be applied efficiently to big data mining, both technological and algorithmic possibilities need to be fully exploited. This paper shows how reusing information from previously evaluated individuals can further accelerate GPU-based evolutionary induction of DTs on large-scale datasets. Noting that some trees, or parts of them, may reappear during the evolutionary search, we have created a so-called repository of trees (split between the GPU and the CPU). Experimental evaluation is carried out on the existing Global Decision Tree system, where the fitness calculations are delegated to the GPU while the core evolution runs sequentially on the CPU. Results demonstrate that reusing information about trees from the repository (classification errors, objects’ locations, etc.) can accelerate the original GPU-based solution. The gain is especially visible on large-scale data, where the cost of evaluating a tree exceeds the cost of storing and searching the repository.
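The reuse idea described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' Global Decision Tree implementation: the `Node`, `count_errors`, and `TreeRepository` names are hypothetical, the GPU side is omitted entirely, and only whole-tree matches are cached (the paper also mentions reusing parts of trees). The sketch assumes every tree is evaluated against the same training set, so a tree's structure alone can serve as the lookup key.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical tree node. Internal nodes test x[feature] < threshold;
    leaves carry a class label. frozen=True makes nodes hashable by value,
    so structurally identical trees compare and hash as equal."""
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def predict(node: Node, x: List[float]) -> int:
    # Walk from the root to a leaf, following the test at each internal node.
    while node.label is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.label


def count_errors(tree: Node, X: List[List[float]], y: List[int]) -> int:
    # Stand-in for the expensive fitness evaluation (done on the GPU in the paper).
    return sum(predict(tree, x) != target for x, target in zip(X, y))


class TreeRepository:
    """Hypothetical repository: caches classification errors per tree.
    Assumption: the dataset never changes, so the tree alone is a valid key."""

    def __init__(self) -> None:
        self._errors: Dict[Node, int] = {}
        self.hits = 0
        self.misses = 0

    def evaluate(self, tree: Node, X: List[List[float]], y: List[int]) -> int:
        if tree in self._errors:
            self.hits += 1          # tree seen before: reuse the stored result
            return self._errors[tree]
        self.misses += 1            # new tree: evaluate and remember it
        errors = count_errors(tree, X, y)
        self._errors[tree] = errors
        return errors


# Usage: the second, structurally identical tree is served from the repository.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
first = Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))
again = Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))

repo = TreeRepository()
repo.evaluate(first, X, y)
repo.evaluate(again, X, y)
print(repo.hits, repo.misses)
```

The trade-off named in the abstract shows up directly here: each lookup costs a hash and an equality check over the tree, which only pays off when evaluating a tree on the data is more expensive than storing and searching the repository.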
Acknowledgments
This work was supported by grant S/WI/2/18 from Bialystok University of Technology, funded by the Polish Ministry of Science and Higher Education.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jurczuk, K., Czajkowski, M., Kretowski, M. (2020). Accelerating GPU-based Evolutionary Induction of Decision Trees - Fitness Evaluation Reuse. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12043. Springer, Cham. https://doi.org/10.1007/978-3-030-43229-4_36
DOI: https://doi.org/10.1007/978-3-030-43229-4_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43228-7
Online ISBN: 978-3-030-43229-4
eBook Packages: Computer Science, Computer Science (R0)