
Journal of Computer Science and Technology, Volume 29, Issue 3, pp. 392–407

ConfDTree: A Statistical Method for Improving Decision Trees

  • Gilad Katz
  • Asaf Shabtai
  • Lior Rokach
  • Nir Ofek
Regular Paper

Abstract

Decision trees have three main disadvantages: reduced performance when the training set is small, rigid decision criteria, and the fact that a single "uncharacteristic" attribute can "derail" the classification process. In this paper we present ConfDTree (Confidence-Based Decision Tree), a post-processing method that enables decision trees to better classify outlier instances. The method, which can be applied to any decision tree algorithm, uses easy-to-implement statistical tools (confidence intervals and two-proportion tests) to identify hard-to-classify instances and to propose alternative routes through the tree. The experimental study indicates that the proposed post-processing method consistently and significantly improves the predictive performance of decision trees, particularly for small, imbalanced, or multi-class datasets, for which an average improvement of 5% to 9% in AUC is reported.
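The abstract names the two statistical tools behind ConfDTree, confidence intervals for spotting "uncharacteristic" attribute values and two-proportion tests for comparing candidate branches, but does not spell out the computation. The Python sketch below is a minimal illustration of those two tools under our own assumptions; `NodeStats`, `is_outlier`, `two_proportion_p`, and the default significance level are hypothetical names and choices, not the authors' reference implementation.

```python
# Minimal sketch of the two statistical tools named in the abstract.
# NOT the authors' implementation; NodeStats, is_outlier, two_proportion_p,
# and the default alpha are illustrative assumptions.
import math
from dataclasses import dataclass

import numpy as np
from scipy import stats


@dataclass
class NodeStats:
    """Per-node cache of one attribute's values over the training
    instances that reached this internal node."""
    values: np.ndarray

    def confidence_interval(self, alpha: float = 0.05):
        """Two-sided (1 - alpha) t-interval for the attribute mean."""
        n = len(self.values)
        mean = float(self.values.mean())
        sem = float(self.values.std(ddof=1)) / math.sqrt(n)
        t = stats.t.ppf(1 - alpha / 2, df=n - 1)
        return mean - t * sem, mean + t * sem


def is_outlier(x: float, node: NodeStats, alpha: float = 0.05) -> bool:
    """Flag a test value falling outside the node's confidence interval;
    such 'uncharacteristic' instances are the hard-to-classify cases
    that, per the abstract, trigger the search for alternative routes."""
    lo, hi = node.confidence_interval(alpha)
    return not (lo <= x <= hi)


def two_proportion_p(hits_a: int, n_a: int, hits_b: int, n_b: int) -> float:
    """p-value of a two-proportion z-test comparing the class purity of
    two candidate branches; a small p-value means the branches differ
    significantly and the purer one may be worth routing into."""
    p_pool = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (hits_a / n_a - hits_b / n_b) / se
    return 2.0 * (1.0 - stats.norm.cdf(abs(z)))
```

In a full post-processor along the lines the abstract describes, an instance flagged by `is_outlier` would presumably be considered for sibling branches as well, with the two-proportion test deciding whether an alternative branch is significantly purer than the default one.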

Keywords

decision tree, confidence interval, imbalanced dataset


Supplementary material

ESM 1: 11390_2014_1438_MOESM1_ESM.pdf (PDF, 75 kB)


Copyright information

© Springer Science+Business Media New York & Science Press, China 2014

Authors and Affiliations

  • Gilad Katz 1, 2
  • Asaf Shabtai 1, 2
  • Lior Rokach 1, 2
  • Nir Ofek 1, 2

  1. Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
  2. Telekom Innovation Laboratories, Ben-Gurion University of the Negev, Beer Sheva, Israel
