Abstract
We present a result applicable to classification learning algorithms that generate decision trees or rules using the information entropy minimization heuristic for discretizing continuous-valued attributes. The result serves to give a better understanding of the entropy measure, to point out that the information entropy heuristic possesses desirable behavioral properties that formally justify its use, and to improve the efficiency of evaluating continuous-valued attributes for cut value selection. Along with the formal proof, we present empirical results demonstrating the theoretically expected reduction in evaluation effort on training data sets from real-world domains.
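The heuristic the abstract refers to can be sketched as follows: sort the examples by the attribute's value, evaluate candidate cut values, and pick the one minimizing the class-information entropy of the induced binary partition. The paper's efficiency result is that the entropy-minimizing cut always falls at a boundary point, i.e. between two adjacent examples of different classes, so only those candidates need evaluating. A minimal illustration in Python (function names and the example data are ours, not from the paper):

```python
from math import log2

def entropy(labels):
    """Class entropy of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    ent = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        ent -= p * log2(p)
    return ent

def best_cut(values, labels):
    """Return (cut value, weighted entropy) minimizing the class-information
    entropy of the induced binary partition. Following the paper's result,
    only boundary points between examples of different classes are tried."""
    pairs = sorted(zip(values, labels))
    best, best_e = None, float("inf")
    for i in range(1, len(pairs)):
        # Skip non-boundary candidates: adjacent examples of the same class.
        if pairs[i - 1][1] == pairs[i][1]:
            continue
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if e < best_e:
            best, best_e = cut, e
    return best, best_e
```

On a perfectly separable attribute such as `best_cut([1, 2, 3, 10, 11, 12], list("aaabbb"))`, the single boundary point between 3 and 10 is the only candidate evaluated, yielding the cut 6.5 with zero resulting entropy; a naive scheme would have examined every adjacent pair.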
Cite this article
Fayyad, U.M., Irani, K.B. On the handling of continuous-valued attributes in decision tree generation. Mach Learn 8, 87–102 (1992). https://doi.org/10.1007/BF00994007