Journal of General Internal Medicine, Volume 19, Issue 9, pp 971–977

How reliable are assessments of clinical teaching?

A review of the published instruments
  • Thomas J. Beckman
  • Amit K. Ghosh
  • David A. Cook
  • Patricia J. Erwin
  • Jayawant N. Mandrekar


BACKGROUND: Learner feedback is the primary method for evaluating clinical faculty, despite few existing standards for measuring learner assessments.

OBJECTIVE: To review the published literature on instruments for evaluating clinical teachers and to summarize themes that will aid in developing universally appealing tools.

DESIGN: Searching 5 electronic databases revealed over 330 articles. Excluded were reviews, editorials, and qualitative studies. Twenty-one articles describing instruments designed for evaluating clinical faculty by learners were found. Three investigators studied these papers and tabulated characteristics of the learning environments and validation methods. Salient themes among the evaluation studies were determined.

MAIN RESULTS: Many studies combined evaluations from both outpatient and inpatient settings and some authors combined evaluations from different learner levels. Wide ranges in numbers of teachers, evaluators, evaluations, and scale items were observed. The most frequently encountered statistical methods were factor analysis and determining internal consistency reliability with Cronbach’s α. Less common methods were the use of test-retest reliability, interrater reliability, and convergent validity between validated instruments. Fourteen domains of teaching were identified and the most frequently studied domains were interpersonal and clinical-teaching skills.

CONCLUSIONS: Characteristics of teacher evaluations vary between educational settings and between different learner levels, indicating that future studies should use more narrowly defined study populations. A variety of validation methods, including temporal stability, interrater reliability, and convergent validity, should be considered. Finally, existing data support the validation of instruments composed solely of interpersonal and clinical-teaching domains.

Key words

validity; evaluation studies; medical faculty





Copyright information

© Society of General Internal Medicine 2004

Authors and Affiliations

  • Thomas J. Beckman (1)
  • Amit K. Ghosh (1)
  • David A. Cook (1)
  • Patricia J. Erwin (2)
  • Jayawant N. Mandrekar (3)

  1. Department of Internal Medicine, Department of Medicine, Mayo Clinic College of Medicine, Mayo Clinic and Mayo Foundation, Rochester
  2. Plummer Medical Library, Mayo Clinic College of Medicine, Rochester
  3. Department of Health Sciences Research, Division of Biostatistics, Mayo Clinic and Mayo Foundation, Rochester
