The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study
Cross-validation is an important evaluation strategy in behavioral predictive modeling; without it, a predictive model is likely to be overly optimistic. Statistical methods have been developed that allow researchers to straightforwardly cross-validate predictive models by using the same data employed to construct the model. In the present study, cross-validation techniques were used to construct several decision-tree models with data from the MacArthur Violence Risk Assessment Study (Monahan et al., 2001). The models were then compared with the original (non-cross-validated) Classification of Violence Risk assessment tool. The results show that the measures of predictive model accuracy (AUC, misclassification error, sensitivity, specificity, positive and negative predictive values) degrade considerably when applied to a testing sample, compared with the training sample used to fit the model initially. In addition, unless false negatives (that is, incorrectly predicting individuals to be nonviolent) are considered more costly than false positives (that is, incorrectly predicting individuals to be violent), the models generally make few predictions of violence. The results suggest that employing cross-validation when constructing models can make an important contribution to increasing the reliability and replicability of psychological research.
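The optimism described above can be demonstrated with a minimal sketch (not the authors' analysis or data): a 1-nearest-neighbour "memorizer" is fit to purely random labels, so any apparent accuracy on the training sample is an artifact of overfitting. Evaluating on the training data yields perfect accuracy, while a held-out testing sample reveals near-chance performance. All names and the synthetic data here are illustrative assumptions.

```python
import random

# Hedged sketch: why accuracy measured on the training sample is optimistic.
# Features are noise and labels are coin flips, so there is no real signal;
# a memorizing classifier still scores perfectly on its own training data.

random.seed(42)

def nn_predict(train_x, train_y, x):
    """Return the label of the nearest training point (pure memorization)."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

def accuracy(train, test):
    """Fraction of test points whose nearest-neighbour label matches."""
    x_tr, y_tr = train
    x_te, y_te = test
    hits = sum(nn_predict(x_tr, y_tr, x) == y for x, y in zip(x_te, y_te))
    return hits / len(x_te)

# Synthetic data: 200 noise features with random binary labels.
n = 200
x = [random.random() for _ in range(n)]
y = [random.randint(0, 1) for _ in range(n)]

half = n // 2
train = (x[:half], y[:half])
test = (x[half:], y[half:])

train_acc = accuracy(train, train)  # evaluated on the data used to "fit"
test_acc = accuracy(train, test)    # evaluated on a held-out sample

print(f"apparent (training) accuracy: {train_acc:.2f}")  # 1.00 by construction
print(f"held-out (testing) accuracy:  {test_acc:.2f}")   # near chance (~0.50)
```

The training accuracy is exactly 1.0 because each training point is its own nearest neighbour; the held-out estimate hovers around 0.5, the base rate of random labels. This mirrors the paper's finding that AUC, sensitivity, specificity, and related measures degrade considerably when a model is scored on data it was not fit to.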
Keywords: Classification trees · Cross-validation · Replicability · Misclassification costs · Random forests · Violence prediction
- BREIMAN, L., and SPECTOR, P. (1992), "Submodel Selection and Evaluation in Regression. The X-Random Case", International Statistical Review, 291–319.
- GINI, C. (1912), Variability and Mutability: Contribution to the Study of Distributions and Report Statistics, Bologna, Italy: C. Cuppini.
- MONAHAN, J., STEADMAN, H.J., SILVER, E., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2001), Rethinking Risk Assessment: The MacArthur Study of Mental Disorder and Violence, New York, NY: Oxford University Press.
- MOSSMAN, D. (2006), "Critique of Pure Risk Assessment or, Kant Meets Tarasoff", University of Cincinnati Law Review, 75, 523–609.
- R CORE TEAM (2014), R: A Language and Environment for Statistical Computing (Version 3.1.1), Vienna, Austria, http://www.R-project.org/.
- SPSS, INC. (1993), SPSS for Windows (Release 6.0), Chicago, IL: SPSS, Inc.
- VRIEZE, S.I., and GROVE, W.M. (2008), "Predicting Sex Offender Recidivism. I. Correcting for Item Overselection and Accuracy Overestimation in Scale Development. II. Sampling Error-Induced Attenuation of Predictive Validity over Base Rate Information", Law and Human Behavior, 32, 266–278.