Journal of General Internal Medicine, Volume 34, Issue 5, pp 669–676

Clerkship Grading Committees: the Impact of Group Decision-Making for Clerkship Grading

  • Annabel K. Frank
  • Patricia O’Sullivan
  • Lynnea M. Mills
  • Virginie Muller-Juge
  • Karen E. Hauer
Original Research



Background

Faculty and students debate the fairness and accuracy of medical student clerkship grades. Group decision-making is a potential strategy to improve grading.


Objective

To explore how one school’s grading committee members integrate assessment data to inform grade decisions and to identify the committees’ benefits and challenges.


Design

This qualitative study used semi-structured interviews with grading committee chairs and members conducted between November 2017 and March 2018.


Participants

Participants included the eight core clerkship directors, who chaired their grading committees. We randomly selected other committee members to invite, for a maximum of three interviews per clerkship.


Approach

Interviews were recorded, transcribed, and analyzed using inductive content analysis.

Key Results

We interviewed 17 committee members. Within and across specialties, committee members had distinct approaches to prioritizing and synthesizing assessment data. Participants expressed concerns about the quality of assessments, necessitating careful scrutiny of language, assessor identity, and other contextual factors. Committee members were concerned about how unconscious bias might impact assessors, but they felt minimally impacted at the committee level. When committee members knew students personally, they felt tension about how to use the information appropriately. Participants described high agreement within their committees; debate was more common when site directors reviewed students’ files from other sites prior to meeting. Participants reported multiple committee benefits including faculty development and fulfillment, as well as improved grading consistency, fairness, and transparency. Groupthink and a passive approach to bias emerged as the two main threats to optimal group decision-making.


Conclusions

Grading committee members view their practices as advantageous over individual grading, but they feel limited in their ability to address grading fairness and accuracy. Recommendations and support may help committees broaden their scope to address these aspirations.


Keywords

medical education-qualitative methods; medical education-undergraduate; evaluation; clerkship grading; group decision-making; grading committees; clinical competence


Funding Information

All funding for this project was provided by the University of California, San Francisco, School of Medicine.

Compliance with Ethical Standards

The University of California, San Francisco (UCSF), Institutional Review Board approved this study as exempt. We emailed a consent document in advance, discussed it before interviews, and obtained verbal consent.

Conflict of Interest

The authors declare that they do not have a conflict of interest.

Supplementary material

ESM 1: 11606_2019_4879_MOESM1_ESM.docx (DOCX, 1.21 MB)


Copyright information

© Society of General Internal Medicine 2019

Authors and Affiliations

  • Annabel K. Frank (1, 2)
  • Patricia O’Sullivan (1)
  • Lynnea M. Mills (1)
  • Virginie Muller-Juge (1)
  • Karen E. Hauer (1), corresponding author

  1. Department of Medicine, University of California, San Francisco, San Francisco, USA
  2. Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA