Decision Tree Induction: Using Frequency Tables for Attribute Selection

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

This chapter describes an alternative method of calculating the average entropy of the training (sub)sets resulting from splitting on an attribute, which uses frequency tables. It is shown to be equivalent to the method used in Chapter 5 but requires less computation. Two alternative attribute selection criteria, the Gini Index of Diversity and the \(\chi^{2}\) statistic, are illustrated and it is shown how they can also be calculated using a frequency table.
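The equivalence claimed above can be checked numerically. The following is a minimal sketch, not the chapter's own code: it assumes a frequency table stored as a list of rows (one row per class, one column per attribute value) and uses the standard textbook definitions of weighted entropy, the Gini Index and the \(\chi^{2}\) statistic, which may differ in presentation from the chapter. All function names and the example counts are hypothetical.

```python
from math import log2

def column_sums(table):
    """Number of instances taking each attribute value (one sum per column)."""
    return [sum(col) for col in zip(*table)]

def average_entropy_direct(table):
    """Weighted average of the entropies of the subsets, one subset per column."""
    n = sum(column_sums(table))
    total = 0.0
    for col in zip(*table):
        s = sum(col)
        if s == 0:
            continue  # an empty subset contributes nothing
        h = -sum((v / s) * log2(v / s) for v in col if v > 0)
        total += (s / n) * h
    return total

def average_entropy_freq(table):
    """Same quantity computed from the table alone:
    (sum of S*log2(S) over column sums - sum of V*log2(V) over cells) / N."""
    sums = column_sums(table)
    n = sum(sums)
    cell_term = sum(v * log2(v) for row in table for v in row if v > 0)
    col_term = sum(s * log2(s) for s in sums if s > 0)
    return (col_term - cell_term) / n

def gini_freq(table):
    """Weighted Gini Index: 1 - (1/N) * sum over columns of (sum of V^2) / S."""
    sums = column_sums(table)
    n = sum(sums)
    per_col = sum(
        sum(v * v for v in col) / s
        for col, s in zip(zip(*table), sums)
        if s > 0
    )
    return 1.0 - per_col / n

def chi_squared(table):
    """Sum of (O - E)^2 / E, with E = row sum * column sum / N."""
    row_sums = [sum(row) for row in table]
    sums = column_sums(table)
    n = sum(sums)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * sums[j] / n
            if expected > 0:
                total += (observed - expected) ** 2 / expected
    return total

if __name__ == "__main__":
    # Hypothetical counts: 3 classes (rows) by 3 attribute values (columns).
    example = [[2, 1, 2], [2, 3, 1], [4, 1, 3]]
    print(average_entropy_direct(example), average_entropy_freq(example))
    print(gini_freq(example), chi_squared(example))
```

The frequency-table version of the entropy needs one logarithm per non-zero cell and one per column sum, and no per-subset probabilities, which is one way to see where the saving in computation comes from.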

The important issue of inductive bias is introduced. This leads to a description of a further attribute selection criterion, Gain Ratio, which was introduced to overcome a bias of the entropy minimisation method that is undesirable for some datasets.
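As an illustration of the kind of bias at issue, the sketch below assumes the commonly cited definition of Gain Ratio (information gain divided by split information) and the well-known tendency of entropy minimisation to favour attributes with many values. It is a hedged example rather than the chapter's own formulation; the function names and the two small tables are hypothetical.

```python
from math import log2

def entropy(counts):
    """Entropy in bits of a distribution given as a sequence of counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def information_gain(table):
    """Entropy before the split minus the weighted average entropy after it.
    table[i][j] is the count of class i among instances with attribute value j."""
    cols = [col for col in zip(*table) if sum(col) > 0]
    n = sum(sum(col) for col in cols)
    e_start = entropy([sum(row) for row in table])
    e_new = sum((sum(col) / n) * entropy(col) for col in cols)
    return e_start - e_new

def gain_ratio(table):
    """Information gain divided by split information (entropy of the column sums)."""
    split_info = entropy([sum(col) for col in zip(*table)])
    if split_info == 0:
        return 0.0  # attribute takes a single value, so it cannot split the data
    return information_gain(table) / split_info

if __name__ == "__main__":
    # Four instances, two classes. Both attributes separate the classes perfectly.
    binary_attr = [[2, 0], [0, 2]]            # two values
    id_like_attr = [[1, 1, 0, 0], [0, 0, 1, 1]]  # a distinct value per instance
    print(information_gain(binary_attr), information_gain(id_like_attr))
    print(gain_ratio(binary_attr), gain_ratio(id_like_attr))
```

Both hypothetical attributes achieve the same information gain, so entropy minimisation alone cannot tell them apart, whereas the split-information denominator penalises the attribute that fragments the data into one subset per instance.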

Copyright information

© 2016 Springer-Verlag London Ltd.

About this chapter

Cite this chapter

Bramer, M. (2016). Decision Tree Induction: Using Frequency Tables for Attribute Selection. In: Principles of Data Mining. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-7307-6_6

  • DOI: https://doi.org/10.1007/978-1-4471-7307-6_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-7306-9

  • Online ISBN: 978-1-4471-7307-6

  • eBook Packages: Computer Science, Computer Science (R0)
