Abstract
Two complementary studies investigated the inter-rater reliability and performance of juvenile justice personnel conducting the Structured Assessment of Violence Risk in Youth (SAVRY). Study 1 reports the performance of 408 juvenile probation officers (JPOs) and social workers who rated the SAVRY on four standardized vignettes as part of their training. JPOs had high agreement with the expert consensus on the SAVRY rating of overall risk and total scores, but those trained by a peer master trainer outperformed those trained by an expert. Study 2 examined the field reliability of the SAVRY on 80 young offender cases rated by a JPO and a trained research assistant. In the field, intra-class correlation coefficients were ‘excellent’ for SAVRY total and most domain scores, and were ‘good’ for overall risk ratings. Results suggest that juvenile justice personnel can use the SAVRY and structured professional judgment reliably in the field, with reliability comparable to the indices reported in more lab-like research studies; however, replication is essential.
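As an illustration of the field reliability index mentioned above, the following is a minimal sketch of an intra-class correlation for cases that are each scored by the same set of raters. The abstract does not say which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is an assumption here. The function names `icc_2_1` and `label_icc` are hypothetical, and the cutoffs in `label_icc` (below .40 poor, .40 to .59 fair, .60 to .74 good, .75 and above excellent) are a commonly used convention that is assumed rather than taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per case, one score per rater in each row.
    Assumes every case was scored by every rater (no missing values).
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for cases (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def label_icc(icc):
    """Map an ICC to a descriptive label using the assumed cutoffs."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Toy data only (not from the study): three cases, two raters in full agreement.
toy = [[10, 10], [14, 14], [22, 22]]
print(label_icc(icc_2_1(toy)))
```

In a design like Study 2, each row would hold the two SAVRY scores for one case (one from the JPO, one from the research assistant), so `k` would be 2. Because this form penalizes systematic differences between raters as well as random disagreement, it is a stricter index than a consistency-only ICC.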
Notes
The race figures reported here were based on data from the Louisiana Office of Juvenile Justice. Unfortunately, the researchers did not have race data for each individual JPO or social worker (SW) in Study 1, so it was not possible to conduct any later comparisons by race.
The ideal approach would have been to examine absolute agreement, meaning an item rating of Low+ or Moderate− would both be acceptable answers if the consensus rating was Low+ or Moderate−. Unfortunately, the researchers did not receive all of the training rating data in a format that provided the sliders so it was not possible to examine absolute agreement.
We examined whether practice and gender of rater were potential covariates by conducting five two-way (Vignette × Timing of Vignette [a measure of practice]) ANOVAs and five two-way (Vignette × Gender) ANOVAs, one for performance on the SAVRY total score and then for each of the four domain scores. This approach was preferred to one omnibus MANOVA because it allowed a liberal examination of potential covariates. We conducted these analyses across all cases rather than across raters, again to take a more liberal approach. There was not a significant main effect for Timing of Vignette on performance on any domain except the Protective Factor domain (F[2, 611] = 3.39, p = .03). For three domains, there was a significant interaction between Vignette and Timing of Vignette on performance (Historical—F[6, 611] = 7.47, p = .008; Individual—F[6, 611] = 7.47, p = .006; Protective—F[6, 611] = 7.69, p < .001). Most importantly, there was a significant main effect of Vignette on performance for every domain and the SAVRY total score. For the gender analyses, there was a main effect of Gender on performance on the Historical domain, such that females (M = 6.75; SE = .11) performed better than males (M = 6.39; SE = .13; F[1, 564] = 4.47, p = .04), and on the Protective Factor domain, such that males (M = 3.86; SE = .09) performed better than females (M = 3.60; SE = .08; F[1, 564] = 4.79, p = .03). Results of all analyses are available from the senior author.
Acknowledgments
This research was funded by the John D. & Catherine T. MacArthur Foundation as part of the Models for Change Research Network. The authors would like to acknowledge Dr. Debra DePrato, MD, Associate Clinical Professor of Public Health, for making this study possible; the Louisiana Office of Juvenile Justice, particularly Mary Livers, PhD, Deputy Secretary and Kelly Clement, Regional Manager, for organizing all of the probation officer trainings and assisting us with gathering data; Patrick Bartel, PhD, for his SAVRY training; and both Dr. Bartel and Randy Borum, PsyD, for their guidance around implementing the SAVRY. Finally, the authors wish to thank our site research associates, Joshua Everett, MA, Brady Holtzclaw, MA, and Brittany Foreman.
Cite this article
Vincent, G.M., Guy, L.S., Fusco, S.L. et al. Field Reliability of the SAVRY with Juvenile Probation Officers: Implications for Training. Law Hum Behav (2011). https://doi.org/10.1007/s10979-011-9284-2
DOI: https://doi.org/10.1007/s10979-011-9284-2