Artificial Intelligence Review, Volume 39, Issue 4, pp 261–283

Decision trees: a recent overview

  • S. B. Kotsiantis


Decision tree techniques have been widely used to build classification models, as such models closely resemble human reasoning and are easy to understand. This paper describes basic decision tree issues and current research points. Of course, a single article cannot be a complete review of all algorithms (also known as induction classification trees), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored.


Keywords: Machine learning, Decision trees, Classification algorithms





Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Educational Software Development Laboratory, Department of Mathematics, University of Patras, Rio, Greece
