Data Mining within a Regression Framework

  • Richard A. Berk


Regression analysis can imply a far wider range of statistical procedures than is often appreciated. In this chapter, a number of common Data Mining procedures are discussed within a regression framework. These include non-parametric smoothers, classification and regression trees, bagging, and random forests. In each case, the goal is to characterize one or more of the distributional features of a response conditional on a set of predictors.


Keywords: regression, smoothers, splines, CART, bagging, random forests





Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • Richard A. Berk, Department of Statistics, UCLA, USA
