Field Reliability of the SAVRY with Juvenile Probation Officers: Implications for Training

  • Original Article
Law and Human Behavior

Abstract

Two complementary studies were conducted to investigate the inter-rater reliability and performance of juvenile justice personnel when conducting the Structured Assessment of Violence Risk in Youth (SAVRY). Study 1 reports the performance of 408 juvenile probation officers (JPOs) and social workers who rated the SAVRY on four standardized vignettes as part of their training. JPOs had high agreement with the expert consensus on the SAVRY rating of overall risk and total scores, but those trained by a peer master trainer outperformed those trained by an expert. Study 2 examined the field reliability of the SAVRY on 80 young offender cases rated by a JPO and a trained research assistant. In the field, intra-class correlation coefficients were ‘excellent’ for SAVRY total and most domain scores, and were ‘good’ for overall risk ratings. Results suggest that the SAVRY and structured professional judgment can be used reliably in the field by juvenile justice personnel, with reliability comparable to the indices reported in more lab-like research studies; however, replication is essential.
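
For readers unfamiliar with the intra-class correlation, the following is a minimal sketch of how such a coefficient could be computed and labeled. It assumes the ICC(2,1) form of Shrout and Fleiss (1979) (two-way random effects, absolute agreement, single rater) and the descriptive bands of Cicchetti and Sparrow (1981); the abstract does not state which ICC form or cut-offs the authors used, and the function names and ratings below are hypothetical.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) per Shrout & Fleiss (1979); ratings[i][j] is rater j's score for case i.

    Assumption: this ICC form is illustrative, not necessarily the one used in the study.
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases (rows), raters (columns), residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)              # between-cases mean square
    jms = ss_cols / (k - 1)              # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def label_icc(icc: float) -> str:
    """Descriptive label, assuming the Cicchetti & Sparrow (1981) bands."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Hypothetical total scores for five cases, each rated by a JPO and a research assistant.
example = [[12, 13], [20, 19], [7, 9], [25, 24], [15, 15]]
print(label_icc(icc_2_1(example)))
```

Because ICC(2,1) penalizes systematic differences between raters as well as random disagreement, it is a conservative choice when one rater type (for example, JPOs) might score consistently higher or lower than the other.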

Notes

  1. The race figures reported here were based on data from the Louisiana Office of Juvenile Justice. Unfortunately, the researchers did not have race data for each specific JPO or social worker (SW) in Study 1, so it was not possible to conduct any later comparisons by race.

  2. The ideal approach would have been to examine absolute agreement, meaning an item rating of Low+ or Moderate− would both be acceptable answers if the consensus rating was Low+ or Moderate−. Unfortunately, the researchers did not receive all of the training rating data in a format that provided the sliders so it was not possible to examine absolute agreement.

  3. We examined whether practice and gender of rater were potential covariates by conducting five two-way (Vignette × Timing of Vignette [a measure of practice]) ANOVAs and five two-way (Vignette × Gender) ANOVAs, one for performance on the SAVRY total score and then for each of the four domain scores. This approach was preferred to one omnibus MANOVA because it allowed a liberal examination of potential covariates. We conducted these analyses across all cases rather than across raters, again to take a more liberal approach. There was not a significant main effect for Timing of Vignette on performance on any domain except the Protective Factor domain (F[2, 611] = 3.39, p = .03). For three domains, there was a significant interaction between Vignette and Timing of Vignette on performance (Historical—F[6, 611] = 7.47, p = .008; Individual—F[6, 611] = 7.47, p = .006; Protective—F[6, 611] = 7.69, p < .001). Most importantly, there was a significant main effect of Vignette on performance for every domain and the SAVRY total score. For the gender analyses, there was a main effect of Gender on performance on the Historical domain, such that females (M = 6.75; SE = .11) performed better than males (M = 6.39; SE = .13; F[1, 564] = 4.47, p = .04), and on the Protective Factor domain, such that males (M = 3.86; SE = .09) performed better than females (M = 3.60; SE = .08; F[1, 564] = 4.79, p = .03). Results of all analyses are available from the senior author.

  4. According to Cohen (1988, 1992), a small-sized correlation is r = ±.10, a moderate-sized correlation is r = ±.30, and a large correlation is r = ±.50. The corresponding thresholds for standardized mean differences (i.e., Cohen’s d) are 0.2, 0.5, and 0.8.
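
The thresholds in Note 4 can be expressed as a small lookup. This is only an illustrative sketch: the function name and the "negligible" label for values below the smallest threshold are assumptions introduced here, not terms from Cohen or from the article.

```python
# Cohen (1988, 1992) benchmarks as quoted in Note 4: (small, moderate, large).
R_THRESHOLDS = (0.10, 0.30, 0.50)   # correlations
D_THRESHOLDS = (0.2, 0.5, 0.8)      # standardized mean differences (Cohen's d)


def label_effect(value: float, thresholds=R_THRESHOLDS) -> str:
    """Return a descriptive size label for an effect, ignoring its sign."""
    small, moderate, large = thresholds
    size = abs(value)
    if size >= large:
        return "large"
    if size >= moderate:
        return "moderate"
    if size >= small:
        return "small"
    return "negligible"  # hypothetical label for effects below the 'small' benchmark


print(label_effect(-0.35))               # a correlation of -.35
print(label_effect(0.6, D_THRESHOLDS))   # a Cohen's d of 0.6
```

Treating each benchmark as a lower bound is a convention chosen for this sketch; Cohen offered the values as rough guides rather than strict cut-offs.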

References

  • Andrews, D. A. (1989). Recidivism is predictable and can be influenced: Using risk assessments to reduce recidivism. Forum on Corrections Research, 1(2), 11–18.

  • Andrews, D. A., & Bonta, J. (2002). The psychology of criminal conduct (3rd ed.). Cincinnati, OH: Anderson.

  • Andrews, D. A., Bonta, J., & Hoge, R. D. (1990). Classification for effective rehabilitation: Rediscovering psychology. Criminal Justice and Behavior, 17, 19–52. doi:10.1177/0093854890017001004.

  • Andrews, D. A., & Dowden, C. (2006). Risk principle of case classification in correctional treatment: A meta-analytic investigation. International Journal of Offender Therapy and Comparative Criminology, 50, 88–100. doi:10.1177/0306624X05282556.

  • Austin, J. (2006). How much risk can we take? The misuse of risk assessment in corrections. Federal Probation, 70(2), 58–63.

  • Barnoski, R. (2004). Assessing risk for re-offense: Validating the Washington State Juvenile Court Assessment (Report No. 04-03-1201). Olympia: Washington State Institute for Public Policy.

  • Barnoski, R., & Markussen, S. (2005). Washington state juvenile court assessment. In T. Grisso, G. Vincent, & D. Seagrave (Eds.), Mental health screening and assessment in juvenile justice (pp. 271–282). New York: Guilford Press.

  • Boccaccini, M. T., Turner, D., & Murrie, D. C. (2008). Do some evaluators report consistently higher or lower psychopathy scores than others? Findings from a statewide sample of sexually violent predator evaluations. Psychology, Public Policy, & Law, 14, 262–283. doi:10.1037/a0014523.

  • Borum, R., Bartel, P., & Forth, A. (2003/2006). Structured Assessment of Violence Risk in Youth (SAVRY). Odessa, FL: Psychological Assessment Resources, Inc.

  • Borum, R., Lodewijks, H., Bartel, P., & Forth, A. (2009). Structured Assessment of Violence Risk in Youth (SAVRY). In K. Douglas & R. Otto (Eds.), Handbook of Violence Risk Assessment (pp. 63–80). New York: Routledge.

  • Chen, B., Zaebst, D., & Seel, L. (2005). A macro to calculate kappa statistics for categorizations by multiple raters [cited 2005 Nov 29]. In SUGI 30 Proceedings, Philadelphia, PA, April 10–13, 2005, from http://www2.sas.com/proceedings/sugi30/155-30.pdf.

  • Cicchetti, D. V., & Sparrow, S. A. (1981). Developing criteria for establishing interrater reliability of specific items: applications to assessment of adaptive behavior. American Journal of Mental Deficiency, 86, 127–137.

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. doi:10.1177/001316446002000104.

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

  • Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159. doi:10.1037/0033-2909.112.1.155.

  • Douglas, K. S., & Kropp, P. R. (2002). A prevention-based paradigm for violence risk assessment: Clinical and research applications. Criminal Justice and Behavior, 29, 617–658.

  • Douglas, K. S., & Reeves, K. (2010). The HCR-20 violence risk assessment scheme: Overview and review of the research. In R. Otto & K. S. Douglas (Eds.), Handbook of violence risk assessment (pp. 147–185). Oxford: Routledge/Taylor & Francis.

  • Fagan, J., & Zimring, F. E. (Eds.). (2000). The changing borders of juvenile justice: Transfer of adolescents to the criminal court. Chicago: The University of Chicago Press.

  • Fleiss, J. L. (1981). Balanced incomplete block designs for inter-rater reliability studies. Applied Psychological Measurement, 5, 105–112. doi:10.1177/014662168100500115.

  • Fleiss, J. L. (1986). The design and analysis of clinical experiments. New York: Wiley.

  • Fremouw, W. J., & Feindler, E. L. (1978). Peer versus professional models for study skills training. Journal of Counseling Psychology, 25(6), 576–580. doi:10.1037/0022-0167.25.6.576.

  • Gottfredson, D., & Tonry, M. (1988). Prediction and classification: Criminal justice decision-making. Chicago: Chicago University Press.

  • Green, A. M. (1997). Kappa statistics for multiple raters using categorical classifications. In Proceedings of the 22nd annual SAS User Group International conference, pp. 1110–1115.

  • Griffin, P., & Bozynski, M. (2003). National overviews: State juvenile justice profiles. Retrieved November 5, 2003, from http://www.ncjj.org/stateprofiles/.

  • Grisso, T. (2005). Why we need mental health screening and assessment in juvenile justice programs. In T. Grisso, G. Vincent, & D. Seagrave (Eds.), Mental health screening and assessment in juvenile justice (pp. 3–21). New York: Guilford Press.

  • Grisso, T., Vincent, G. M., & Seagrave, D. (2005). Mental health screening and assessment in juvenile justice. New York: Guilford Press.

  • Hare, R. D. (2003). Manual for the Hare Psychopathy Checklist—revised (2nd ed.). Toronto: Multi-Health Systems.

  • Hoge, R. D. (2002). Standardized instruments for assessing risk and need in youthful offenders. Criminal Justice and Behavior, 29, 380–396. doi:10.1177/0093854802029004003.

  • Hoge, R. D., & Andrews, D. A. (2006). Youth Level of Service/Case Management Inventory: User’s manual. North Tonawanda, NY: Multi-Health Systems.

  • Kurtz, J. R., Robins, T. G., & Schork, M. A. (1997). An evaluation of peer and professional trainers in a union-based occupational health and safety training program. Journal of Occupational and Environmental Medicine, 39(7), 661–671.

  • Landis, J., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.

  • Lodewijks, H. P. B., Doreleijers, T. A. H., & de Ruiter, C. (2008). SAVRY risk assessment in violent Dutch adolescents: Relation to sentencing and recidivism. Criminal Justice and Behavior, 35, 696–709. doi:10.1177/0093854808316146.

  • Mulvey, E. P. (2005). Risk Assessment in Juvenile Justice Policy and Practice. In K. Heilbrun, N. E. Sevin Goldstein, & R. E. Redding (Eds.), Juvenile delinquency: Prevention, assessment, and intervention (pp. 209–231). New York: Oxford University Press.

  • Murrie, D. C., Boccaccini, M., Johnson, J., & Janke, C. (2008). Does interrater (dis)agreement on Psychopathy Checklist scores in Sexually Violent Predator trials suggest partisan allegiance in forensic evaluation? Law and Human Behavior, 32, 352–362. doi:10.1007/s10979-007-9097-5.

  • Olver, M. E., Stockdale, K. C., & Wormith, J. S. (2009). Risk assessment with young offenders: A meta-analysis of three assessment measures. Criminal Justice and Behavior, 36, 329–353. doi:10.1177/0093854809331457.

  • Otto, R. K., & Douglas, K. S. (Eds.). (2009). Handbook of violence risk assessment. New York: Routledge/Taylor & Francis Group.

  • Quinsey, V., Harris, G., Rice, M., & Cormier, C. (2006). Violent offenders: Appraising and managing risk (2nd ed.). Washington, DC: American Psychological Association.

  • Schmidt, F., Hoge, R., & Robertson, L. (2005). Reliability and validity analyses of the Youth Level of Services/Case Management Inventory. Criminal Justice and Behavior, 32(3), 329–344. doi:10.1177/0093854804274373.

  • Schwalbe, C. S. (2007). Risk assessment for juvenile justice: A meta-analysis. Law and Human Behavior, 31, 449–462. doi:10.1007/s10979-006-9071-7.

  • Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing reliability. Psychological Bulletin, 86, 420–428. doi:10.1037/0033-2909.86.2.420.

  • Vincent, G. M., Chapman, J., & Cook, N. E. (2011). Risk/Needs assessment in juvenile justice: Predictive validity of the SAVRY, racial differences, and contribution of needs factors. Criminal Justice and Behavior, 38(1), 42–62. doi:10.1177/0093854810386000.

  • Vincent, G. M., Terry, A., & Maney, S. (2009). Risk/Needs tools for antisocial behavior and violence among youthful populations. In J. Andrade (Ed.), Handbook of Violence Risk Assessment and Treatment for Forensic Mental Health Practitioners (pp. 337–424). New York: Springer.

  • Welsh, J., Schmidt, F., McKinnon, L., Chattha, H., & Meyers, J. (2008). A comparative study of adolescent risk assessment instruments: predictive and incremental validity. Assessment, 15, 104–115.

Acknowledgments

This research was funded by the John D. & Catherine T. MacArthur Foundation as part of the Models for Change Research Network. The authors would like to acknowledge Dr. Debra DePrato, MD, Associate Clinical Professor of Public Health, for making this study possible; the Louisiana Office of Juvenile Justice, particularly Mary Livers, PhD, Deputy Secretary and Kelly Clement, Regional Manager, for organizing all of the probation officer trainings and assisting us with gathering data; Patrick Bartel, PhD, for his SAVRY training; and both Dr. Bartel and Randy Borum, PsyD, for their guidance around implementing the SAVRY. Finally, the authors wish to thank our site research associates, Joshua Everett, MA, Brady Holtzclaw, MA, and Brittany Foreman.

Author information

Corresponding author

Correspondence to Gina M. Vincent.

About this article

Cite this article

Vincent, G.M., Guy, L.S., Fusco, S.L. et al. Field Reliability of the SAVRY with Juvenile Probation Officers: Implications for Training. Law Hum Behav (2011). https://doi.org/10.1007/s10979-011-9284-2

  • DOI: https://doi.org/10.1007/s10979-011-9284-2
