Tree-Based Forecasting Methods

  • Richard Berk


The past two chapters have provided the necessary technical background for a consideration of statistical procedures that can be especially effective in criminal justice forecasting. The joint probability distribution model, data partitioning, and asymmetric costs should now be familiar. These features combine to make tree-based methods of recursive partitioning the fundamental building blocks for the machine learning procedures discussed in this chapter. The main focus is random forests. Stochastic gradient boosting and Bayesian trees are discussed briefly as worthy competitors to random forests. Although neural nets and deep learning are not tree-based, they are also considered. Current claims about their remarkable performance need to be addressed dispassionately, especially in comparison to tree-based methods.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Richard Berk (1)
  1. Department of Criminology, University of Pennsylvania, Philadelphia, USA
