Abstract
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on combinations of attributes. In the regression setting, model tree algorithms explore multiple representation languages by using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, at leaf nodes, or at both, in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at both inner and leaf nodes, and two simplified versions that restrict linear combinations to inner nodes only or to leaves only. The experimental evaluation shows that all functional tree variants exhibit similar performance, each with advantages on different datasets, and that the full model has a marginal overall advantage. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that, for the datasets under study and for both classification and regression, the use of multivariate decision nodes has a greater impact on the bias component of the error, while the use of multivariate leaves has a greater impact on the variance component.
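To make the framework described in the abstract concrete, the sketch below shows one possible organization of prediction in a functional tree: inner nodes test either a single attribute (univariate) or a linear combination of attributes (multivariate), and leaves predict either a constant or the output of a linear model. All names (FunctionalNode, predict) and the field layout are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of prediction in a functional tree (hypothetical names,
# not the paper's implementation).
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FunctionalNode:
    # Leaf payload: a constant prediction and, for a functional leaf,
    # weights of a linear model over the attributes.
    constant: float = 0.0
    leaf_weights: Optional[Sequence[float]] = None
    # Inner-node payload: either a univariate test (attribute index)
    # or a multivariate test (weights on all attributes), plus a threshold.
    attribute: Optional[int] = None
    test_weights: Optional[Sequence[float]] = None
    threshold: float = 0.0
    left: Optional["FunctionalNode"] = None
    right: Optional["FunctionalNode"] = None


def predict(node: FunctionalNode, x: Sequence[float]) -> float:
    # A node with no children is a leaf.
    if node.left is None and node.right is None:
        if node.leaf_weights is not None:  # functional leaf: linear model
            return node.constant + sum(w * v for w, v in zip(node.leaf_weights, x))
        return node.constant               # plain leaf: constant prediction
    # Inner node: compute the test value, then descend.
    if node.test_weights is not None:      # multivariate decision node
        value = sum(w * v for w, v in zip(node.test_weights, x))
    else:                                  # univariate decision node
        value = x[node.attribute]
    child = node.left if value <= node.threshold else node.right
    return predict(child, x)
```

In this sketch, setting test_weights to None at every inner node recovers a univariate tree, while setting leaf_weights to None at every leaf removes functional leaves; combining these choices yields the four variants compared in the paper.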
Cite this article
Gama, J. Functional Trees. Mach Learn 55, 219–250 (2004). https://doi.org/10.1023/B:MACH.0000027782.67192.13