Examining the Correspondence Between Teacher- and Observer-Report Treatment Integrity Measures

Abstract

Teacher-reported measures of treatment integrity (the extent to which teachers deliver prescribed practices as intended) have the potential to support efforts to evaluate and implement evidence-based interventions in early childhood settings. However, self-report treatment integrity measures have shown poor correspondence with observer-report treatment integrity measures, raising questions about score validity. This paper reports on the development and initial evaluation of the score reliability and validity of the Treatment Integrity Measure for Early Childhood Settings Teacher Report (TIMECS-TR), which is designed to address limitations of previous self-report treatment integrity measures that may have contributed to low correspondence with observer-rated measures. The TIMECS-TR includes 24 items designed to represent practices found across evidence-based interventions delivered in early childhood settings that target child social, emotional, and behavioral skills, rather than adherence to the practices of one specific evidence-based intervention. Fifty-four teachers (92.6% female, 7.4% male; 61.1% White) completed the TIMECS-TR weekly, a total of 618 times (M = 6.79 per child; SD = 2.16; range 2 to 11), about the practices they delivered with 91 children (45.1% female, 54.9% male; M = 4.53 years old; 45.1% Black) who were at risk for emotional and behavioral disorders. Analyses indicated that the TIMECS-TR items evidenced mild to moderate test–retest score reliability over one week. However, analyses did not support the convergent score validity of the TIMECS-TR items or scale with observational ratings of the same practices. Teachers reported higher levels of practice delivery on the TIMECS-TR items than observers did. Overall, our findings raise concerns about the accuracy of teacher-report adherence measures. Lessons from this research can be used to identify possible reasons for the low correspondence between teacher- and observer-report treatment integrity measures so that future research can strive to dependably capture teacher delivery of the practices found in evidence-based interventions.

Author information

Corresponding author

Correspondence to Bryce D. McLeod.

Additional information

Preparation of this article was supported in part by a grant from the Institute of Education Sciences (R305A140487; McLeod & Sutherland).

About this article

Cite this article

McLeod, B.D., Sutherland, K.S., Broda, M. et al. Examining the Correspondence Between Teacher- and Observer-Report Treatment Integrity Measures. School Mental Health (2021). https://doi.org/10.1007/s12310-021-09437-7

Keywords

  • Treatment integrity
  • Teacher implementation
  • Practice elements
  • Early childhood