A More Formal Treatment of Classification and Forecasting

  • Richard Berk


This chapter covers much of the foundational material from the previous chapter, but more formally and in greater detail. The chapter opens with a discussion of the model used to characterize how the data were generated. That model is very different from the one used in conventional regression. Classification is then considered, followed by the estimation issues it raises. The bias-variance tradeoff is front and center, as is post-selection statistical inference. Some material in this chapter may require careful reading.
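The bias-variance tradeoff mentioned above can be made concrete with a small simulation. The sketch below is not from the chapter; it is an illustrative Monte Carlo estimate, assuming a hypothetical true function `sin(2πx)` with Gaussian noise, comparing polynomial fits of low and high degree at a single test point.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical data-generating function for illustration only.
    return np.sin(2 * np.pi * x)

def bias_variance_at(x0, degree, n_reps=200, n=30, sigma=0.3):
    """Monte Carlo estimate of squared bias and variance of a
    least-squares polynomial fit of the given degree, evaluated at x0."""
    preds = np.empty(n_reps)
    for r in range(n_reps):
        x = rng.uniform(0.0, 1.0, n)
        y = true_f(x) + rng.normal(0.0, sigma, n)
        coefs = np.polyfit(x, y, degree)      # fit on a fresh training sample
        preds[r] = np.polyval(coefs, x0)      # predict at the test point
    bias_sq = (preds.mean() - true_f(x0)) ** 2
    variance = preds.var()
    return bias_sq, variance

for degree in (1, 3, 9):
    b2, v = bias_variance_at(x0=0.25, degree=degree)
    print(f"degree {degree}: bias^2 = {b2:.4f}, variance = {v:.4f}")
```

A rigid degree-1 fit shows large squared bias and small variance; a flexible degree-9 fit shows the reverse, which is the tradeoff the chapter formalizes.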



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Richard Berk
  1. Department of Criminology, University of Pennsylvania, Philadelphia, USA
