Decision Trees

Machine Learning and Artificial Intelligence

Abstract

Decision trees are a fundamentally different approach to machine learning compared with alternatives such as neural networks or support vector machines. Those approaches deal with strictly numerical data that may increase or decrease monotonically, and the equations that define them are designed to work only on such data. The theory of decision trees, however, does not rely on the assumption of numerical data. In this chapter, we study the theory of decision trees along with some advanced topics, such as ensemble methods. We focus on bagging and boosting as the two main types of ensemble methods and learn how they work and what their advantages and disadvantages are.
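The claim that decision trees do not require numerical data can be made concrete with a small sketch. The following is a minimal, hypothetical example (not from the chapter itself) of the core tree-building step, an entropy-based split, applied to purely categorical data; the feature names and values are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction from splitting on a categorical feature.
    No numeric encoding or ordering of the values is needed."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy categorical data set: the tree splits on symbolic values directly.
rows = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
]
labels = ["stay", "play", "stay", "play"]

best = max(rows[0], key=lambda f: information_gain(rows, labels, f))
print(best)  # "windy" separates the two classes perfectly (gain = 1 bit)
```

A numerical model would first have to encode "sunny" and "rain" as numbers; the tree compares symbolic values directly, which is the structural difference the abstract refers to.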


Notes

  1. The words weak and strong have a different meaning in this context. A weak learner is a decision tree that is trained on only a fraction of the total data and is not expected to produce metrics close to the desired ones. The theoretical definition of a weak learner is one whose performance is only slightly better than pure random chance. A strong learner is a single decision tree that uses all the data and is capable of producing reasonably good metrics. In ensemble methods, each individual tree is always a weak learner, as it is not exposed to the full data set.
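The idea of combining weak learners trained on fractions of the data can be sketched in a few lines. The following is a hypothetical, stdlib-only illustration of bagging: each "weak learner" is a one-split decision stump fitted to a tiny bootstrap sample, and the ensemble predicts by majority vote. The data set, sample sizes, and ensemble size are all invented for illustration.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Weak learner: a one-split stump predicting y = (x > t).
    Trained on a tiny bootstrap sample, so on its own it is only
    slightly better than random chance."""
    return min(xs, key=lambda t: sum((x > t) != y for x, y in zip(xs, ys)))

def bagged_predict(thresholds, x):
    """Majority vote over the ensemble of stumps."""
    return Counter(x > t for t in thresholds).most_common(1)[0][0]

random.seed(1)
data = [(x, x > 5) for x in range(10)]   # true rule: label is (x > 5)

thresholds = []
for _ in range(25):
    # Each weak learner sees only a small bootstrap fraction of the data.
    boot = [random.choice(data) for _ in range(4)]
    xs, ys = zip(*boot)
    thresholds.append(fit_stump(xs, ys))

accuracy = sum(bagged_predict(thresholds, x) == y for x, y in data) / len(data)
```

No single stump here sees enough data to be reliable, yet the vote over 25 of them recovers the decision boundary reasonably well, which is the point of the weak-versus-strong distinction.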

  2. Outliers represent an important concept in the theory of machine learning. Although their meaning is obvious, their impact on learning is not trivial. An outlier is a sample in the training data that does not represent the general trends in the data. From a mathematical standpoint, the distance of an outlier from the other samples is typically large, and such large distances can pull a machine learning model significantly away from the desired behavior. In other words, a small set of outliers can adversely affect the learning of a model and significantly reduce its metrics. It is thus an important property of a machine learning model to be resilient to a reasonable number of outliers.
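The effect described above can be seen with a deliberately simple proxy. The sketch below (hypothetical numbers, not from the chapter) contrasts a squared-error predictor, which outputs the mean and is dragged far away by one outlier, with a median-style predictor, which is closer in spirit to how a tree's local splits isolate extreme samples.

```python
from statistics import mean, median

inliers = [10, 11, 9, 10, 12, 11, 10]
with_outlier = inliers + [500]          # one sample far from the rest

# A squared-error model predicts the mean: the single outlier
# drags it from about 10.4 up to 71.625.
print(mean(inliers), mean(with_outlier))

# A median-style predictor barely moves: 10 becomes 10.5.
print(median(inliers), median(with_outlier))
```

This is only an analogy, but it captures why split-based models such as decision trees tend to be resilient to a reasonable number of outliers: an extreme sample can be isolated in its own branch instead of shifting every prediction.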



Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Joshi, A.V. (2020). Decision Trees. In: Machine Learning and Artificial Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-26622-6_6


  • DOI: https://doi.org/10.1007/978-3-030-26622-6_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26621-9

  • Online ISBN: 978-3-030-26622-6

  • eBook Packages: Engineering (R0)
