Abstract
Decision trees represent a fundamentally different approach to machine learning compared to options like neural networks or support vector machines. Those approaches deal with strictly numerical data that may increase or decrease monotonically, and the equations that define them are designed to work only with numerical data. The theory of decision trees, however, does not rely on the assumption of numerical data. In this chapter, we will study the theory of decision trees along with some advanced topics, such as ensemble methods. We will focus on bagging and boosting as the two main types of ensemble methods and learn how they work and what their advantages and disadvantages are.
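As a minimal sketch of the bagging idea mentioned above, the following pure-Python example trains several weak learners (one-split decision stumps) on bootstrap samples and combines them by majority vote. The toy dataset, the stump rule, and all names here are illustrative assumptions, not material from the chapter:

```python
import random
from collections import Counter

random.seed(0)

# Toy one-dimensional dataset: class 0 clusters near x = 1, class 1 near x = 9.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (8.0, 1), (9.0, 1), (9.5, 1), (10.0, 1)]

def train_stump(sample):
    """Weak learner: a decision stump that splits at the mean of the
    sample's feature values and predicts class 1 above the split."""
    threshold = sum(x for x, _ in sample) / len(sample)
    return lambda x: 1 if x > threshold else 0

def bagged_ensemble(data, n_trees=25):
    """Bagging: each stump sees only a bootstrap sample (drawn with
    replacement), so no single learner is exposed to the full data set."""
    stumps = []
    for _ in range(n_trees):
        sample = [random.choice(data) for _ in data]
        stumps.append(train_stump(sample))
    def predict(x):
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]  # majority vote across stumps
    return predict

predict = bagged_ensemble(data)
print(predict(1.2), predict(9.2))  # → 0 1
```

Each stump alone is a poor classifier, but the majority vote over many bootstrap-trained stumps recovers the correct decision boundary; boosting differs in that learners are trained sequentially, with later learners reweighting the samples earlier ones got wrong.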
Notes
- 1.
The words weak and strong have a different meaning in this context. A weak learner is a decision tree that is trained using only a fraction of the total data and is neither capable nor expected of producing metrics close to the desired ones. The theoretical definition of a weak learner is one whose performance is only slightly better than pure random chance. A strong learner is a single decision tree that uses all the data and is capable of producing reasonably good metrics. In ensemble methods, each individual tree is always a weak learner, as it is not exposed to the full data set.
- 2.
Outliers represent an important concept in the theory of machine learning. Although their meaning is obvious, their impact on learning is not trivial. An outlier is a sample in the training data that does not represent the generic trends in the data. From a mathematical standpoint, the distance of an outlier from the other samples in the data is typically large. Such large distances can pull a machine learning model significantly away from the desired behavior. In other words, a small set of outliers can adversely affect the learning of a model and reduce its metrics significantly. It is thus an important property of a machine learning model to be resilient to a reasonable number of outliers.
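A minimal numerical illustration of this note (the data values are invented for the example, not taken from the chapter): a single outlier drags a distance-sensitive estimate like the mean far from the bulk of the data, while a rank-based estimate like the median barely moves. Resilient models behave more like the latter:

```python
# Five typical samples near 10, plus one outlier far from the rest.
values = [10.0, 11.0, 9.0, 10.5, 9.5, 500.0]

def mean(xs):
    """Average of the values; every sample pulls on it with full weight."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle value (or midpoint of the middle pair); only the ranks matter,
    so a far-away outlier has almost no influence."""
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

print(mean(values))    # → 91.666…, dragged far from the typical samples
print(median(values))  # → 10.25, essentially unaffected by the outlier
```

The same contrast carries over to models: an estimator whose loss grows with distance to every sample is easily thrown off by outliers, whereas rank- or vote-based procedures, such as the majority vote in a bagged ensemble, are naturally more resilient.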
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Joshi, A.V. (2020). Decision Trees. In: Machine Learning and Artificial Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-26622-6_6
Print ISBN: 978-3-030-26621-9
Online ISBN: 978-3-030-26622-6