Examining the Correspondence Between Teacher- and Observer-Report Treatment Integrity Measures

  • Original Paper
School Mental Health

Abstract

Teacher-reported measures of treatment integrity (the extent to which prescribed practices are delivered as intended by teachers) have the potential to support efforts to evaluate and implement evidence-based interventions in early childhood settings. However, self-report treatment integrity measures have shown poor correspondence with observer-report treatment integrity measures, raising questions about score validity. This paper reports on the development and initial evaluation of the score reliability and validity of the Treatment Integrity Measure for Early Childhood Settings Teacher Report (TIMECS-TR), which is designed to address limitations of previous self-report treatment integrity measures that may have contributed to low correspondence with observer-rated measures. The TIMECS-TR includes 24 items designed to represent practices found in evidence-based interventions delivered in early childhood settings that target child social, emotional, and behavioral skills, rather than adherence to practices found in a specific evidence-based intervention. Fifty-four teachers (92.6% female, 7.4% male; 61.1% White) completed the TIMECS-TR weekly for a total of 618 times (M = 6.79 per child; SD = 2.16; range 2 to 11) about the practices they delivered with 91 children (45.1% female, 54.9% male; M = 4.53 years old; SD = 45.1% Black) who were at risk for emotional and behavioral disorders. Analyses indicated that the TIMECS-TR items evidenced mild to moderate test–retest score reliability over one week. However, analyses did not support the convergent score validity of the TIMECS-TR items or scale with observational ratings of the same practices. Teachers reported higher levels of practice delivery on the TIMECS-TR items relative to observer report. Overall, our findings raise concerns about the accuracy of teacher-report adherence measures. 
Lessons from this research can be used to identify possible reasons for the low correspondence between teacher- and observer-report treatment integrity measures so that future research can strive to dependably capture teacher delivery of the practices found in evidence-based interventions.
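
As a rough illustration of what "correspondence" between two informants can mean, the sketch below compares teacher and observer scores for the same practice in two ways: a Pearson correlation (do the two sources rank sessions similarly?) and a mean paired difference (does one source report systematically more delivery?). The abstract does not say which statistics the authors used, so this is only one plausible approach. The function names and the ratings are invented for illustration and are not data from the study; only the 618 and 91 figures come from the abstract.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


def mean_difference(x, y):
    """Average of paired differences (x minus y); positive means x is higher."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty, equal-length lists")
    return sum(a - b for a, b in zip(x, y)) / len(x)


# Hypothetical ratings of one practice across six sessions (NOT study data).
teacher_scores = [6, 5, 6, 7, 5, 6]
observer_scores = [2, 4, 3, 2, 3, 4]

r = pearson_r(teacher_scores, observer_scores)
diff = mean_difference(teacher_scores, observer_scores)
print(f"correlation: {r:.2f}, mean teacher-minus-observer difference: {diff:.2f}")

# Sanity check on figures stated in the abstract:
# 618 teacher reports spread over 91 children.
reports_per_child = round(618 / 91, 2)
print(f"mean reports per child: {reports_per_child}")
```

In this made-up data the correlation is low (here negative) while the mean difference is positive, which is the general pattern the abstract describes: weak agreement between sources, with teachers reporting more practice delivery than observers. The last line confirms that the reported mean of 6.79 reports per child matches 618 reports across 91 children.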


Author information

Corresponding author

Correspondence to Bryce D. McLeod.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Preparation of this article was supported in part by a grant from the Institute of Education Sciences (R305A140487; McLeod & Sutherland).

About this article

Cite this article

McLeod, B.D., Sutherland, K.S., Broda, M. et al. Examining the Correspondence Between Teacher- and Observer-Report Treatment Integrity Measures. School Mental Health 14, 20–34 (2022). https://doi.org/10.1007/s12310-021-09437-7

  • DOI: https://doi.org/10.1007/s12310-021-09437-7
