Abstract
Decision tree learning algorithms are known to be unstable: small changes in the training data can produce highly different output models. Instability is an important issue in machine learning that is often overlooked. In this paper, we illustrate and discuss the instability of decision tree induction algorithms and propose a framework for inducing more stable decision trees. In the proposed framework, the split test has two advantageous properties: first, it can incorporate multiple attributes; second, it has a polylithic structure. The first property alleviates the competition among attributes to be installed at an internal node, which is the major cause of instability. The second property has the potential to improve stability by localizing the effect of individual instances on the split test. We demonstrate the effectiveness of the proposed framework by providing a conforming decision tree learning algorithm and conducting several experiments. We evaluate the structural stability of the algorithms using three measures. The experimental results reveal that the decision trees induced by the proposed framework exhibit substantially greater stability and competitive accuracy compared with several well-known decision tree learning algorithms.
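The instability the abstract describes can be observed directly: removing even a single training instance may change which attributes a learner installs at internal nodes. The sketch below is purely illustrative (it uses scikit-learn's `DecisionTreeClassifier` and a simple Jaccard similarity over the sets of split attributes, not the paper's three stability measures), but it shows the kind of structural comparison involved.

```python
# Illustrative sketch (not the paper's measures): train decision trees on two
# slightly different samples of the same data and compare the attributes used
# in their split tests.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

def split_features(X, y, drop_index):
    """Train a tree with one instance removed; return the set of split attributes."""
    mask = [i != drop_index for i in range(len(X))]
    tree = DecisionTreeClassifier(random_state=0).fit(X[mask], y[mask])
    used = tree.tree_.feature  # feature index per node; negative values mark leaves
    return {f for f in used if f >= 0}

base = split_features(X, y, drop_index=-1)   # full training set
perturbed = split_features(X, y, drop_index=0)  # one instance removed
# Jaccard similarity of the attribute sets: 1.0 means identical attribute usage.
jaccard = len(base & perturbed) / len(base | perturbed)
print(f"split-attribute similarity: {jaccard:.2f}")
```

A similarity well below 1.0 after a one-instance perturbation is exactly the kind of structural instability the proposed framework aims to reduce; the paper's measures compare tree structure in finer detail than this attribute-set view.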
Mirzamomen, Z., Kangavari, M.R. A framework to induce more stable decision trees for pattern classification. Pattern Anal Applic 20, 991–1004 (2017). https://doi.org/10.1007/s10044-016-0542-2