Classification trees with bivariate splits

Lubinsky, David

doi:10.1007/BF00872094

Published: July 1994

Volume 4, pages 283–296, (1994)

Applied Intelligence

David Lubinsky¹

Abstract

We extend the recursive partitioning approach to classifier learning to use more complex types of split at each decision node. The new split types we permit are bivariate and can thus be interpreted visually in plots and tables. In order to find optimal splits of these new types, a new split criterion is introduced that allows the development of divide-and-conquer type algorithms. Two experiments are presented in which the bivariate trees—both with the Gini split criterion and with the new split criterion—are compared to a traditional tree-growing procedure. With the Gini criterion, the bivariate trees show a slight improvement in predictive accuracy and a considerable improvement in tree size over univariate trees. Under the new split criterion, accuracy is also improved, but there is no consistent improvement in tree size.
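
As a rough illustration of the comparison described in the abstract, the following minimal sketch scores a univariate split and a bivariate split with the Gini criterion on toy data. The abstract does not specify the bivariate split types, so the "corner" predicate below, the toy data, and the helper names `gini` and `split_gini` are assumptions made for illustration only. The new split criterion introduced in the paper is not shown.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gini(rows, labels, predicate):
    """Size-weighted Gini impurity of the two children induced by a boolean predicate."""
    left = [y for x, y in zip(rows, labels) if predicate(x)]
    right = [y for x, y in zip(rows, labels) if not predicate(x)]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)

# Toy data (assumed): class "a" occupies only the lower-left corner of the plane.
rows = [(1, 1), (1, 2), (2, 1), (1, 4), (4, 1), (4, 4), (2, 4), (4, 2)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]

# Univariate split: a threshold on a single variable.
univariate = lambda x: x[0] <= 2
# Hypothetical bivariate "corner" split: thresholds on two variables at once.
bivariate = lambda x: x[0] <= 2 and x[1] <= 2

uni_score = split_gini(rows, labels, univariate)
bi_score = split_gini(rows, labels, bivariate)
print(uni_score, bi_score)
```

On this data the single bivariate split separates the classes completely, whereas a univariate tree would need a second split below the first. That is one way a bivariate tree can end up smaller, though the toy example says nothing about the sizes or accuracies reported in the paper.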

Author information

Authors and Affiliations

Department of Computer Science, University of The Witwatersrand, Johannesburg, South Africa
David Lubinsky

Additional information

Much of this work was completed while the author was an employee of AT&T Bell Laboratories.

About this article

Cite this article

Lubinsky, D. Classification trees with bivariate splits. Appl Intell 4, 283–296 (1994). https://doi.org/10.1007/BF00872094

Issue Date: July 1994
DOI: https://doi.org/10.1007/BF00872094
