Gender Differences in Work-Based Assessment Scores and Narrative Comments After Direct Observation

  • Original Research
Journal of General Internal Medicine

Abstract

Background

While some prior studies of work-based assessment (WBA) numeric ratings have not shown gender differences, they have been unable to account for the true performance of the resident or explore narrative differences by gender.

Objective

To explore gender differences in WBA ratings as well as narrative comments (when scripted performance was known).

Design

Secondary analysis of WBAs obtained from a randomized controlled trial of a longitudinal rater training intervention in 2018–2019. Participating faculty (n = 77) observed standardized resident–patient encounters and subsequently completed rater assessment forms (RAFs).

Subjects

Participating faculty in longitudinal rater training.

Main Measures

Gender differences in mean entrustment ratings (4-point scale) were assessed with multivariable regression (adjusted for scripted performance, rater and resident demographics, and the interaction between study arm and time period [pre- versus post-intervention]). Using pre-specified natural language processing categories (masculine, feminine, agentic, and communal words), multivariable linear regression was used to determine associations of word use in the narrative comments with resident gender, race, and skill level, faculty demographics, and interaction between the study arm and the time period (pre- versus post-intervention).
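The word-category measure described above can be illustrated with a minimal sketch. It assumes each narrative comment is scored as the percentage of its words that fall in a category dictionary, which is one common way such categories are operationalized; the abstract does not specify the exact scoring. The lexicons, function names, and example comment below are hypothetical and are not the study's dictionaries or data.

```python
import re

# Hypothetical mini-lexicons for illustration only; the study's actual
# dictionaries are not reproduced here.
LEXICONS = {
    "agentic": {"confident", "decisive", "independent", "assertive"},
    "communal": {"caring", "warm", "supportive", "empathetic"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(comment, lexicons=LEXICONS):
    """Return, per category, the percentage of tokens found in that lexicon."""
    tokens = tokenize(comment)
    if not tokens:
        # An empty comment contributes a rate of zero to every category.
        return {name: 0.0 for name in lexicons}
    return {
        name: 100.0 * sum(tok in words for tok in tokens) / len(tokens)
        for name, words in lexicons.items()
    }


# Made-up comment, not taken from the study data.
rates = category_rates("Warm and caring with the patient but not decisive")
```

Per-comment rates of this kind could then serve as the outcome variable in a multivariable linear regression on resident gender and the other covariates listed above.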

Key Results

Across 1527 RAFs, there were significant differences in entrustment ratings between women and men standardized residents (2.29 versus 2.54, respectively, p < 0.001) after correction for resident skill level. In comments about what the resident did poorly, feminine terms were more common for women residents than for men residents (β 0.45, CI 0.12–0.78, p = 0.01). This difference persisted after adjusting for the faculty’s entrustment ratings. There were no other significant linguistic differences by gender.

Conclusions

In contrast to prior studies, we found entrustment rating differences in a simulated WBA that persisted after adjusting for the resident’s scripted performance. There were also linguistic differences by gender after adjusting for entrustment ratings, with feminine terms used more frequently in comments about women in some, but not all, narrative comment categories.

Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements:

Contributors: Not applicable

Funding

Not applicable

Author information

Corresponding author

Correspondence to Janae K. Heath MD, MS.

Ethics declarations

Conflict of Interest:

The authors declare that they do not have a conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 40 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Heath, J.K., Kogan, J.R., Holmboe, E.S. et al. Gender Differences in Work-Based Assessment Scores and Narrative Comments After Direct Observation. J GEN INTERN MED (2024). https://doi.org/10.1007/s11606-024-08645-6

  • DOI: https://doi.org/10.1007/s11606-024-08645-6
