Journal of General Internal Medicine, Volume 20, Issue 12, pp 1159–1164

What is the validity evidence for assessments of clinical teaching?

  • Thomas J. Beckman
  • David A. Cook
  • Jayawant N. Mandrekar
Clinical Review


BACKGROUND: Although a variety of validity evidence should be utilized when evaluating assessment tools, a review of teaching assessments suggested that authors pursue a limited range of validity evidence.

OBJECTIVES: To develop a method for rating validity evidence and to quantify the evidence supporting scores from existing clinical teaching assessment instruments.

DESIGN: A comprehensive search yielded 22 articles on clinical teaching assessments. Using standards outlined by the American Psychological Association and the American Educational Research Association, we developed a method for rating the 5 categories of validity evidence reported in each article. We then quantified the validity evidence by summing the ratings for each category. We also calculated weighted κ coefficients to determine interrater reliabilities for each category of validity evidence.

MAIN RESULTS: Content and Internal Structure evidence received the highest ratings (27 and 32, respectively, of 44 possible). Relation to Other Variables, Consequences, and Response Process received the lowest ratings (9, 2, and 2, respectively). Interrater reliability was good for Content, Internal Structure, and Relation to Other Variables (κ range 0.52 to 0.96, all P values <.01), but poor for Consequences and Response Process.

CONCLUSIONS: Content and Internal Structure evidence is well represented among published assessments of clinical teaching. Evidence for Relation to Other Variables, Consequences, and Response Process receives little attention, and future research should emphasize these categories. The low interrater reliability for Response Process and Consequences likely reflects the scarcity of reported evidence. With further development, our method for rating the validity evidence should prove useful in various settings.
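The two calculations named in the DESIGN paragraph (summing ratings per category and computing a weighted κ between raters) can be illustrated with a minimal Python sketch. Several details here are assumptions, not taken from the abstract: the 0 to 2 rating scale (chosen only because 22 articles and a maximum of 44 are consistent with it), the linear disagreement weights, the function names, and the example ratings, which are invented for illustration and are not the study's data.

```python
def category_total(ratings):
    """Sum the per-article ratings for one category of validity evidence."""
    return sum(ratings)


def weighted_kappa(rater_a, rater_b, categories=(0, 1, 2)):
    """Cohen's weighted kappa with linear disagreement weights.

    Assumption: ordinal ratings on the scale given by `categories`.
    The abstract does not state which weighting scheme was used.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions of (rater A, rater B) ratings.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weight: 0 on the diagonal, 1 at maximum distance.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j]
                for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j]
                for i in range(k) for j in range(k))

    if d_exp == 0:
        # Both raters used one identical category throughout:
        # kappa is undefined because no disagreement is expected by chance.
        return float("nan")
    return 1 - d_obs / d_exp


# Hypothetical ratings for one category across six articles.
rater_1 = [2, 1, 0, 2, 1, 0]
rater_2 = [2, 1, 1, 2, 0, 0]
print(category_total(rater_1))
print(weighted_kappa(rater_1, rater_2))
```

When nearly every article receives the same rating from both raters, the expected disagreement approaches zero and κ becomes unstable or undefined, which the sketch signals by returning NaN.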

Key Words

validity; clinical teaching; evaluation studies



Copyright information

© Society of General Internal Medicine 2005

Authors and Affiliations

  • Thomas J. Beckman (1)
  • David A. Cook (1)
  • Jayawant N. Mandrekar (2)
  1. Division of General Internal Medicine, Department of Internal Medicine, Mayo Clinic College of Medicine, Mayo Clinic and Mayo Foundation, Rochester, USA
  2. Division of Biostatistics, Department of Health Sciences Research, Mayo Clinic College of Medicine, Mayo Clinic and Mayo Foundation, Rochester, USA
