, Volume 96, Issue 5, pp 403–413 | Cite as

Decision tree construction on GPU: ubiquitous parallel computing approach



General Purpose Graphic Processing Unit (GPGPU) computing with CUDA has been effectively used in scientific applications, where huge accelerations have been achieved. However, while today’s traditional GPGPU can reduce the execution time of parallel code by many times, it comes at the expense of significant power and energy consumption. In this paper, we propose ubiquitous parallel computing approach for construction of decision tree on GPU. In our approach, we exploit parallelism of well-known ID3 algorithm for decision tree learning by two levels: at the outer level of building the tree node-by-node, and at the inner level of sorting data records within a single node. Thus, our approach not only accelerates the construction of decision tree via GPU computing, but also does so by taking care of the power and energy consumption of the GPU. Experiment results show that our approach outperforms purely GPU-based implementation and CPU-based sequential implementation by several times.


Ubiquitous computing GPU computing CUDA Decision tree 

Mathematics Subject Classification

68T05 68Q32 68T30 


  1. 1.
    Bakum P, Skadron K (2010) Accelerating SQL database operations on a GPU with CUDA. In: GPGPU’10: proceedings of the third workshop on general-purpose computation on graphics processing units. pp 94–103Google Scholar
  2. 2.
    Kim J, Kim SG, Nam B (2013) Parallel multi-dimensional range query processing with R-trees on GPU. J Parallel Distrib Comput 73(8):2164–2179Google Scholar
  3. 3.
    Cayrel PL, Hoffmann G, Schneider M (2011) GPU implementation of the Keccak Hash function family. Int J Secur Appl 5(4):123–132Google Scholar
  4. 4.
    Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Skadron K (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370–1380Google Scholar
  5. 5.
    Nguyen H (2007) GPU Gems 3. Addison-Wesley Professional, ReadingGoogle Scholar
  6. 6.
    Huang S, Xiao S, Feng W (2009) On the energy efficiency of graphics processing units for scientific computing. In: Proceeding of the 2009 IEEE international symposium on parallel and distributed processing. pp 1–8Google Scholar
  7. 7.
    Yu CD, Wan W, Pierce D (2011) A CPU–GPU hybrid approach for the unsymmetric multifrontal method. Parallel Comput 37(12):759–770CrossRefGoogle Scholar
  8. 8.
    Garcia V, Debreuve E, Barlaud M (2008) Fast k nearest neighbour search using GPU. In: Computer vision and pattern recognition workshops. pp 1–6Google Scholar
  9. 9.
    Kufrin R (1997) Decision trees on parallel processors. Mach Intell Pattern Recognit 20:279–306CrossRefGoogle Scholar
  10. 10.
    Chui FCF, Bindoff I, Williams R (2009) Applying feature extraction for classification problems. J Signal Process Image Process Pattern Recognit 2(1):1–16Google Scholar
  11. 11.
    Cheng X, Xu J, Pei J, Liu J (2010) Hierarchical distributed data classification in wireless sensor networks. Comput Commun 33(12):1404–1413CrossRefGoogle Scholar
  12. 12.
    Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106Google Scholar
  13. 13.
    Teng Z, Du W (2007) A hybrid multi-group privacy-preserving approach for building decision trees. In: Proceedings of the 11th Pacific-Asia conference on advances in knowledge discovery and data mining. pp 296–307Google Scholar
  14. 14.
    Van Der Laan WJ, Jalba AC, Roerdink JBTM (2011) Accelerating wavelet lifting on graphics hardware using CUDA. IEEE Trans Parallel Distrib Syst 22(1):132–146CrossRefGoogle Scholar
  15. 15.
    Shafer CJ, Agrawal R, Mehta M (1996) SPRINT: a scalable parallel classifier for data mining. In: Proceedings of the 22th international conference on very large data bases. pp 544–555Google Scholar
  16. 16.
    Mehta M, Agrawal R, Rissanen J (1996) SLIQ: a fast scalable classifier for data mining. In: Proceedings of the 5th international conference on extending database technology: advances in database technology. pp 18–32Google Scholar
  17. 17.
    Panda B, Herbach JS, Basu S, Bayardo RJ (2009) PLANET: massively parallel learning of tree ensembles with MapReduce. J VLDB Endow 2(2):1426–1437Google Scholar
  18. 18.
    Ben-Haim Y, Tom-Tov E (2010) A streaming parallel decision tree algorithm. J Mach Learn Res 11(3):849–872MATHMathSciNetGoogle Scholar
  19. 19.
    Sharp T (2008) Implementing decision trees and forests on a GPU. In: Proceeding 10th European conference on computer vision. pp 595–608Google Scholar
  20. 20.
    Grahn H, Lavesson N, Lapajne HM, Slat D (2011) CudaRF: a CUDA-based implementation of random forests. In: Proceedings of the 2011 9th IEEE/ACS international conference on computer systems and applications. pp 95–101Google Scholar
  21. 21.
    Chiu CC, Luo GH, Yuan SM (2011) A decision tree using CUDA GPUs. In: Proceedings of the 13th international conference on information integration and web-based applications and services. pp 399–402Google Scholar
  22. 22.
    Park YH, Whang KY, Lee BS, Han WS (2006) Efficient evaluation of linear path expressions on large-scale heterogeneous XML documents using information retrieval techniques. J Syst Softw 79(2):180–190CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Wien 2013

Authors and Affiliations

  1. 1.Department of Multimedia ScienceSookmyung Women’s UniversitySeoulKorea
  2. 2.Department of Computer EngineeringSeokyeong UniversitySeoulKorea

Personalised recommendations