Influence Measures for CART Classification Trees
This paper deals with measuring the influence of observations on the results obtained with CART classification trees. Using influence measures, we propose criteria that quantify the sensitivity of a CART classification tree analysis to individual observations. The proposals are based on predictions and use jackknife trees. The analysis is extended to the pruned sequences of CART trees to produce CART-specific notions of influence. Using the framework of influence functions, distributional results are derived.
A numerical example, the well-known spam dataset, is presented to illustrate the notions developed throughout the paper. A real dataset, relating the administrative classification of cities surrounding Paris, France, to the characteristics of their tax revenue distributions, is finally analyzed using the new influence-based tools.
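The jackknife-based criterion described above can be illustrated with a toy implementation: for each observation, refit the classifier on the sample with that observation removed, and record how often the refitted classifier's predictions disagree with those of the full-sample fit. The sketch below is only illustrative (the function names are not from the paper), and a one-split decision stump stands in for a full CART tree to keep the code self-contained.

```python
# Hedged sketch of a jackknife, prediction-based influence measure.
# A one-split "stump" is a minimal stand-in for a CART tree; all names
# here (fit_stump, jackknife_influence) are illustrative, not from the paper.

def fit_stump(xs, ys):
    """Fit the best single-threshold split on 1-D data by 0-1 loss."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            pred = [left if x <= t else right for x in xs]
            err = sum(p != y for p, y in zip(pred, ys))
            if best is None or err < best[0]:
                best = (err, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def jackknife_influence(xs, ys):
    """Influence of observation i = share of sample points whose predicted
    class changes when the stump is refit with observation i left out."""
    full = fit_stump(xs, ys)
    full_pred = [full(x) for x in xs]
    influences = []
    for i in range(len(xs)):
        loo = fit_stump(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        diff = sum(loo(x) != p for x, p in zip(xs, full_pred))
        influences.append(diff / len(xs))
    return influences

# Two well-separated classes; only the boundary point x = 3 can shift
# the fitted split when removed, so it is the influential observation.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
print(jackknife_influence(xs, ys))
```

In this toy sample, removing the observation closest to the class boundary moves the fitted threshold and changes its own predicted class, so it receives positive influence while all interior points receive zero — the qualitative behavior the jackknife-tree criteria are designed to capture.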
Keywords: Influential individuals, Influence functions, Decision trees, CART.
- BAR-HEN, A., MARIADASSOU, M., POURSAT, M.-A., and VANDENKOORNHUYSE, P.H. (2008), “Influence Function for Robust Phylogenetic Reconstructions”, Molecular Biology and Evolution, 25(5), 869–873.
- BEL, L., ALLARD, D., LAURENT, J.M., CHEDDADI, R., and BAR-HEN, A. (2009), “CART Algorithm for Spatial Data: Application to Environmental and Ecological Data”, Computational Statistics and Data Analysis, 53(8), 3082–3093.
- BOUSQUET, O., and ELISSEEFF, A. (2002), “Stability and Generalization”, Journal of Machine Learning Research, 2, 499–526.
- BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R.A., and STONE, C.J. (1993), Classification and Regression Trees, Boca Raton, FL: Chapman and Hall.
- BRIAND, B., DUCHARME, G.R., PARACHE, V., and MERCAT-ROMMENS, C. (2009), “A Similarity Measure to Assess the Stability of Classification Trees”, Computational Statistics and Data Analysis, 53(4), 1208–1217.
- CAMPBELL, N.A. (1978), “The Influence Function as an Aid in Outlier Detection in Discriminant Analysis”, Applied Statistics, 27, 251–258.
- CHÈZE, N., and POGGI, J.M. (2006), “Outlier Detection by Boosting Regression Trees”, Journal of Statistical Research of Iran (JSRI), 3, 1–21.
- CRITCHLEY, F., and VITIELLO, C. (1991), “The Influence of Observations on Misclassification Probability Estimates in Linear Discriminant Analysis”, Biometrika, 78, 677–690.
- CROUX, C., and JOOSSENS, K. (2005), “Influence of Observations on the Misclassification Probability in Quadratic Discriminant Analysis”, Journal of Multivariate Analysis, 96(2), 384–403.
- CROUX, C., FILZMOSER, P., and JOOSSENS, K. (2008), “Classification Efficiencies for Robust Linear Discriminant Analysis”, Statistica Sinica, 18(2), 581–599.
- CROUX, C., HAESBROECK, G., and JOOSSENS, K. (2008), “Logistic Discrimination using Robust Estimators: An Influence Function Approach”, The Canadian Journal of Statistics, 36(1), 157–174.
- CUEVAS, A., and ROMO, J. (1995), “On the Estimation of the Influence Curve”, The Canadian Journal of Statistics, 23, 1–9.
- GEY, S., and POGGI, J.M. (2006), “Boosting and Instability for Regression Trees”, Computational Statistics and Data Analysis, 50(2), 533–550.
- GILL, R.D. (1989), “Non- and Semi-Parametric Maximum Likelihood Estimators and the Von Mises Method (Part 1)”, Scandinavian Journal of Statistics, 16, 97–128.
- HAMPEL, F.R. (1974), “The Influence Curve and Its Role in Robust Estimation”, Journal of the American Statistical Association, 69, 383–393.
- HASTIE, T.J., TIBSHIRANI, R.J., and FRIEDMAN, J.H. (2009), The Elements of Statistical Learning: Data Mining, Inference and Prediction (2nd ed.), New York: Springer.
- HUBER, P.J. (1981), Robust Statistics, New York: Wiley and Sons.
- MIGLIO, R., and SOFFRITTI, G. (2004), “The Comparison Between Classification Trees Through Proximity Measures”, Computational Statistics and Data Analysis, 45(3), 577–593.
- MILLER, R.G. (1974), “The Jackknife - A Review”, Biometrika, 61, 1–15.
- R DEVELOPMENT CORE TEAM (2009), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, ISBN 3-900051-07-0, http://www.R-project.org/.
- ROUSSEEUW, P. (1984), “Least Median of Squares Regression”, Journal of the American Statistical Association, 79, 871–880.
- VENABLES, W.N., and RIPLEY, B.D. (2002), Modern Applied Statistics with S (4th ed.), New York: Springer.
- YOUNESS, G., and SAPORTA, G. (2009), “Comparing Partitions of Two Sets of Units Based on the Same Variables”, Advances in Data Analysis and Classification, 4(1), 53–64.
- ZHANG, H., and SINGER, B.H. (2010), Recursive Partitioning and Applications (2nd ed.), New York: Springer.