The Role of Race in Forecasts of Violent Crime

Berk, Richard

doi:10.1007/s12552-009-9017-z

Published: 03 November 2009

Race and Social Problems, Volume 1, pages 231–242 (2009)

Abstract

This paper addresses the role of race in forecasts of failure on probation or parole. Failure is defined as committing a homicide or attempted homicide or being the victim of a homicide or an attempted homicide. These are very rare events in the population of individuals studied, which can make these outcomes extremely difficult to forecast accurately. Building in the relative costs of false positives and false negatives, machine learning procedures are applied to construct useful forecasts. The central question addressed is what role race should play as a predictor when, as an empirical matter, the majority of perpetrators and victims are young African American males.

Notes

Statistical learning is also called machine learning. The two terms will be used interchangeably.
Using forecasts derived from the experiences of individuals under supervision in the community to inform release decisions is tricky. The populations involved are somewhat different. The population for parole decisions is prison inmates. The population for probation release is convicted offenders at sentencing. The forecasts sought in this study were for a population of individuals already under supervision.
For example, logistic regression assumes that in the log-odds metric of the response, all predictors are linearly related to the response.
Indeed, it was tried. No true positives were correctly identified.
One false negative had approximately the cost of 20 false positives.
Consider an example. There are 198 individuals who failed. If forecasting error for this group increases from 20% to 25%, an increase of 5 percentage points, there would be approximately ten more false negatives. But with 10,959 individuals who did not fail, an increase of ten in the number of false positives would imply a tiny change of about 0.09 percentage points.
As before, there is no impact on false positives.
Cases with rare values for certain predictors could have been dropped from the analysis. But that would have threatened external validity, especially because having a rare value on any given predictor does not necessarily mean having a rare value on any other predictor. Moreover, one can see in the response function plots that a few rare data points at the tails of a distribution are very unlikely to affect the functional form for other values because the fitting procedures are very flexible. In effect, the rare observations are ignored. The option of recoding the rare values for any variable to some common value (e.g., some reasonable upper bound) would risk affecting the functional form elsewhere because for that value the data would no longer be sparse.
The nominal or suspended sentence can be very short when judges take time served awaiting trial into account.
Consider a single classification tree. Race would need to enter at a branch in the tree where the two conditional distributions (i.e., for race and for failure) had far more similar balance than their marginal distributions. Although this certainly could happen, there is nothing in the data partitioning process to help bring that about. And insofar as greater correspondence between the two distributions is unusual for any single tree, race cannot contribute much to forecasting accuracy over many trees.
This method was suggested by Penn colleague Larry Brown, who also noted that declines in forecasting accuracy were likely.
One would probably want to maintain the marginal distribution of race.
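
The arithmetic behind the notes on relative costs can be checked with a short sketch. The counts (198 failures, 10,959 non-failures) and the 20-to-1 cost ratio are taken from the notes above; the function and variable names are hypothetical, and the final weighted comparison is only an illustrative reading of the cost ratio, not the paper's forecasting procedure.

```python
# Figures from the notes: 198 individuals failed, 10,959 did not,
# and one false negative costs roughly as much as 20 false positives.
N_FAIL = 198
N_NO_FAIL = 10_959
COST_RATIO = 20  # false negative cost / false positive cost


def extra_errors(n_cases: int, old_rate: float, new_rate: float) -> float:
    """Additional misclassified cases when the error rate rises."""
    return n_cases * (new_rate - old_rate)


# Error among true failures rises from 20% to 25%: about ten more false negatives.
extra_false_negatives = extra_errors(N_FAIL, 0.20, 0.25)

# Ten additional false positives, expressed as a percentage of the non-failures.
fp_change_pct = 100 * 10 / N_NO_FAIL

# Assumption for illustration: weight ten false negatives by the cost ratio
# to express them in false-positive equivalents.
fn_cost_in_fp_units = 10 * COST_RATIO

print(f"extra false negatives: {extra_false_negatives:.1f}")
print(f"change in false positive rate: {fp_change_pct:.2f}%")
print(f"ten false negatives cost about {fn_cost_in_fp_units} false positives")
```

Under these figures, the same ten cases are a large shift for the rare outcome class and a negligible one for the common class, which is why the cost ratio, rather than overall accuracy, drives the trade-off.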

References

Alpert, G. P., Dunham, R. G., & Smith, M. R. (2006). Investigating racial profiling by the Miami-Dade police department: A multimethod approach. Criminology and Public Policy, 6(1), 25–56.
Baldus, D. C., Woodworth, G. C., & Pulaski, C. A., Jr. (1990). Equal justice and the death penalty. Lebanon, New Hampshire: University Press of New England.
Berk, R. A. (2008a). Statistical learning from a regression perspective. New York: Springer.
Berk, R. A. (2008b). Forecasting methods in crime and justice. Annual Review of Law and Social Science, 4, 173–192.
Berk, R. A., Hickman, L., & Li, A. (2005). Statistical difficulties in determining the role of race in capital cases: A re-analysis of data from the state of Maryland. Journal of Quantitative Criminology, 21(4), 365–390.
Berk, R. A., Sherman, L., Barnes, G., Kurtz, E., & Ahlman, L. (2009). Forecasting murder within a population of probationers and parolees: A high stakes application of statistical learning. Journal of the Royal Statistical Society (Series A), 172(part 1), 191–211.
Blumstein, A. (1993). Racial disproportionality in US prison populations revisited. University of Colorado Law Review, 64, 743–760.
Blumstein, A., Cohen, J., Martin, S. E., & Tonry, M. H. (Eds.). (1983). Research on sentencing: The search for reform (Vol. I and II). Washington, D.C.: National Academy Press.
Breiman, L. (2001a). Random forests. Machine Learning, 45, 5–32.
Breiman, L. (2001b). Statistical modeling: Two cultures (with discussion). Statistical Science, 16, 199–231.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey: Wadsworth.
Farrington, D. P. (1987). Predicting individual crime rates. In D. M. Gottfredson & M. Tonry (Eds.), Prediction and classification. Chicago: University of Chicago Press.
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics and Data Analysis, 38, 367–378.
Goodman, L. A. (1953a). The use and validity of a prediction instrument. I. A reformulation of the use of a prediction instrument. American Journal of Sociology, 58, 503–510.
Goodman, L. A. (1953b). II. The validation of prediction. American Journal of Sociology, 58, 510–512.
Grogger, J., & Ridgeway, G. (2006). Testing for racial profiling in traffic stops from behind a veil of darkness. Journal of the American Statistical Association, 101(475), 878–887.
Hastie, T. J., & Tibshirani, R. (1990). Generalized additive models. New York: Chapman and Hall.
Klepper, S., Nagin, D., & Tierney, L.-J. (1983). Discrimination in the criminal justice system: A critical appraisal of the literature. In A. Blumstein, J. Cohen, S. E. Martin, & M. H. Tonry (Eds.), Research on sentencing: The search for reform (Vol. I and II, pp. 55–128). Washington, D.C.: National Academy Press.
Klein, S. P., Berk, R. A., & Hickman, L. J. (2006). Race and the decision to seek the death penalty in federal cases. Santa Monica, CA: The Rand Corporation.
Ohlin, L. E., & Duncan, O. D. (1949). The efficiency of prediction in criminology. American Journal of Sociology, 54, 441–452.
Western, B. (2007). Punishment and inequality in America. New York: Russell Sage Foundation.

Acknowledgements

A special thanks goes to Al Blumstein for a number of very helpful suggestions for an earlier draft of this paper and to Jim Austin and Tom Stough for assistance in obtaining and making sense of the data. Conversations with Larry Brown about the broader statistical issues were also extremely helpful.

Author information

Authors and Affiliations

Department of Statistics, University of Pennsylvania, Philadelphia, PA, USA
Richard Berk

Corresponding author

Correspondence to Richard Berk.

About this article

Cite this article

Berk, R. The Role of Race in Forecasts of Violent Crime. Race Soc Probl 1, 231–242 (2009). https://doi.org/10.1007/s12552-009-9017-z

Published: 03 November 2009
Issue Date: December 2009
DOI: https://doi.org/10.1007/s12552-009-9017-z
