The Role of Race in Forecasts of Violent Crime

Race and Social Problems

Abstract

This paper addresses the role of race in forecasts of failure on probation or parole. Failure is defined as committing a homicide or attempted homicide, or being the victim of a homicide or an attempted homicide. These are very rare events in the population of individuals studied, which can make the outcomes extremely difficult to forecast accurately. Building in the relative costs of false positives and false negatives, machine learning procedures are applied to construct useful forecasts. The central question addressed is what role race should play as a predictor when, as an empirical matter, the majority of perpetrators and victims are young African American males.

Notes

  1. Statistical learning is also called machine learning. The two terms will be used interchangeably.

  2. Using forecasts derived from the experiences of individuals under supervision in the community to inform release decisions is tricky. The populations involved are somewhat different. The population for parole decisions is prison inmates. The population for probation release is convicted offenders at sentencing. The forecasts sought in this study were for a population of individuals already under supervision.

  3. For example, logistic regression assumes that in the log-odds metric of the response, all predictors are linearly related to the response.

  4. Indeed, it was tried. No true positives were correctly identified.

  5. One false negative had approximately the cost of 20 false positives.

  6. Consider an example. There are 198 individuals who failed. If forecasting error for that group increased from 20% to 25%, an increase of 5 percentage points, there would be approximately ten more false negatives. But with 10,959 individuals who did not fail, an increase of ten in the number of false positives would imply a tiny percentage change of about 0.09%.

  7. As before, there is no impact on false positives.

  8. Cases with rare values for certain predictors could have been dropped from the analysis. But that would have threatened external validity, especially because having a rare value on any given predictor does not necessarily mean having a rare value on any other predictor. Moreover, one can see in the response function plots that a few rare data points at the tails of a distribution are very unlikely to affect the functional form for other values because the fitting procedures are very flexible. In effect, the rare observations are ignored. The option of recoding the rare values for any variable to some common value (e.g., some reasonable upper bound) would risk affecting the functional form elsewhere because for that value the data would no longer be sparse.

  9. The nominal or suspended sentence can be very short when judges take time served awaiting trial into account.

  10. Consider a single classification tree. Race would need to enter at a branch in the tree where the two conditional distributions (i.e., for race and for failure) had far more similar balance than their marginal distributions. Although this certainly could happen, there is nothing in the data partitioning process to help bring that about. And insofar as greater correspondence between the two distributions is unusual for any single tree, race cannot contribute much to forecasting accuracy over many trees.

  11. This method was suggested by Penn colleague Larry Brown, who also noted that declines in forecasting accuracy were likely.

  12. One would probably want to maintain the marginal distribution of race.
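
The arithmetic in notes 5 and 6 can be checked with a short sketch. The counts (198 failures, 10,959 non-failures) and the 20-to-1 cost ratio come from the notes; the function and variable names are hypothetical, and the weighted-cost formula is only one simple reading of "relative costs", not necessarily how the costs were built into the forecasting procedure.

```python
# Figures from notes 5 and 6; all names here are illustrative assumptions.
N_FAILED = 198          # individuals who failed (note 6)
N_NOT_FAILED = 10_959   # individuals who did not fail (note 6)
COST_RATIO = 20         # one false negative costs about 20 false positives (note 5)


def extra_false_negatives(old_rate, new_rate, n_failed=N_FAILED):
    """Additional false negatives when the error rate among failures rises."""
    return (new_rate - old_rate) * n_failed


def pct_of_non_failures(count, n_not_failed=N_NOT_FAILED):
    """Express a count of cases as a percentage of the non-failure group."""
    return 100.0 * count / n_not_failed


def weighted_cost(false_negatives, false_positives, cost_ratio=COST_RATIO):
    """Total cost in false-positive units (an assumed, simple linear weighting)."""
    return cost_ratio * false_negatives + false_positives


if __name__ == "__main__":
    extra_fn = extra_false_negatives(0.20, 0.25)
    print(f"Extra false negatives: {extra_fn:.1f}")
    print(f"Ten cases as a share of non-failures: {pct_of_non_failures(10):.2f}%")
    print(f"Cost of ten extra false negatives: {weighted_cost(10, 0)} false-positive units")
```

Under these assumptions, ten additional false negatives weigh as much as 200 additional false positives, which is why the same count of errors matters so differently in the two groups.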

References

  • Alpert, G. P., Dunham, R. G., & Smith, M. R. (2006). Investigating racial profiling by the Miami-Dade police department: A multimethod approach. Criminology and Public Policy, 6(1), 25–56.

  • Baldus, D. C., Woodworth, G. C., & Pulaski, C. A., Jr. (1990). Equal justice and the death penalty. Lebanon, New Hampshire: University Press of New England.

  • Berk, R. A. (2008a). Statistical learning from a regression perspective. New York: Springer.

  • Berk, R. A. (2008b). Forecasting methods in crime and justice. Annual Review of Law and Social Science, 4, 173–192.

  • Berk, R. A., Hickman, L., & Li, A. (2005). Statistical difficulties in determining the role of race in capital cases: A re-analysis of data from the state of Maryland. Journal of Quantitative Criminology, 21(4), 365–390.

  • Berk, R. A., Sherman, L., Barnes, G., Kurtz, E., & Ahlman, L. (2009). Forecasting murder within a population of probationers and parolees: A high stakes application of statistical learning. Journal of the Royal Statistical Society (Series A), 172(part 1), 191–211.

  • Blumstein, A. (1993). Racial disproportionality in US prison populations revisited. University of Colorado Law Review, 64, 743–760.

  • Blumstein, A., Cohen, J., Martin, S. E., & Tonry, M. H. (Eds.). (1983). Research on sentencing: The search for reform (Vol. I and II). Washington, D.C.: National Academy Press.

  • Breiman, L. (2001a). Random forests. Machine Learning, 45, 5–32.

  • Breiman, L. (2001b). Statistical modeling: Two cultures (with discussion). Statistical Science, 16, 199–231.

  • Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey: Wadsworth.

  • Farrington, D. P. (1987). Predicting individual crime rates. In D. M. Gottfredson & M. Tonry (Eds.), Prediction and classification. Chicago: University of Chicago Press.

  • Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics and Data Analysis, 38, 367–378.

  • Goodman, L. A. (1953a). The use and validity of a prediction instrument. I. A reformulation of the use of a prediction instrument. American Journal of Sociology, 58, 503–510.

  • Goodman, L. A. (1953b). II. The validation of prediction. American Journal of Sociology, 58, 510–512.

  • Grogger, J., & Ridgeway, G. (2006). Testing for racial profiling in traffic stops from behind a veil of darkness. Journal of the American Statistical Association, 101(475), 878–887.

  • Hastie, T. J., & Tibshirani, R. (1990). Generalized additive models. New York: Chapman and Hall.

  • Klepper, S., Nagin, D., & Tierney, L.-J. (1983). Discrimination in the criminal justice system: A critical appraisal of the literature. In A. Blumstein, J. Cohen, S. E. Martin, & M. H. Tonry (Eds.), Research on sentencing: The search for reform (Vol. I and II, pp. 55–128). Washington, D.C.: National Academy Press.

  • Klein, S. P., Berk, R. A., & Hickman, L. J. (2006). Race and the decision to seek the death penalty in federal cases. Santa Monica, CA: The Rand Corporation.

  • Ohlin, L. E., & Duncan, O. D. (1949). The efficiency of prediction in criminology. American Journal of Sociology, 54, 441–452.

  • Western, B. (2007). Punishment and inequality in America. New York: Russell Sage Foundation.

Acknowledgements

A special thanks goes to Al Blumstein for a number of very helpful suggestions for an earlier draft of this paper and to Jim Austin and Tom Stough for assistance in obtaining and making sense of the data. Conversations with Larry Brown about the broader statistical issues were also extremely helpful.

Author information

Correspondence to Richard Berk.

About this article

Cite this article

Berk, R. The Role of Race in Forecasts of Violent Crime. Race Soc Probl 1, 231–242 (2009). https://doi.org/10.1007/s12552-009-9017-z

  • DOI: https://doi.org/10.1007/s12552-009-9017-z
