Teacher-reported measures of treatment integrity (the extent to which prescribed practices are delivered as intended by teachers) have the potential to support efforts to evaluate and implement evidence-based interventions in early childhood settings. However, self-report treatment integrity measures have shown poor correspondence with observer-report treatment integrity measures, raising questions about score validity. This paper reports on the development and initial evaluation of the score reliability and validity of the Treatment Integrity Measure for Early Childhood Settings Teacher Report (TIMECS-TR), which is designed to address limitations of previous self-report treatment integrity measures that may have contributed to low correspondence with observer-rated measures. The TIMECS-TR includes 24 items designed to represent practices found in evidence-based interventions delivered in early childhood settings that target child social, emotional, and behavioral skills, rather than adherence to practices found in a specific evidence-based intervention. Fifty-four teachers (92.6% female, 7.4% male; 61.1% White) completed the TIMECS-TR weekly for a total of 618 times (M = 6.79 per child; SD = 2.16; range 2 to 11) about the practices they delivered with 91 children (45.1% female, 54.9% male; M = 4.53 years old; SD = 45.1% Black) who were at risk for emotional and behavioral disorders. Analyses indicated that the TIMECS-TR items evidenced mild to moderate test–retest score reliability over one week. However, analyses did not support the convergent score validity of the TIMECS-TR items or scale with observational ratings of the same practices. Teachers reported higher levels of practice delivery on the TIMECS-TR items relative to observer report. Overall, our findings raise concerns about the accuracy of teacher-report adherence measures. 
Lessons from this research can be used to identify possible reasons for the low correspondence between teacher- and observer-report treatment integrity measures so that future research can strive to dependably capture teacher delivery of the practices found in evidence-based interventions.
Preparation of this article was supported in part by a grant from the Institute of Education Sciences (R305A140487; McLeod & Sutherland).
McLeod, B.D., Sutherland, K.S., Broda, M. et al. Examining the Correspondence Between Teacher- and Observer-Report Treatment Integrity Measures. School Mental Health 14, 20–34 (2022). https://doi.org/10.1007/s12310-021-09437-7