Beyond the ratings: gender effects in written comments from clinical teaching assessments

Ginsburg, Shiphra; Stroud, Lynfa; Lynch, Meghan; Melvin, Lindsay; Kulasegaram, Kulamakan

doi:10.1007/s10459-021-10088-1

Published: 28 January 2022

Volume 27, pages 355–374 (2022)

Advances in Health Sciences Education

Shiphra Ginsburg^1,2,3,9 (ORCID: orcid.org/0000-0002-4595-6650),
Lynfa Stroud^2,4,
Meghan Lynch^5,
Lindsay Melvin^6 &
Kulamakan Kulasegaram^2,7,8

Abstract

Assessment of clinical teachers by learners is problematic. Construct-irrelevant factors influence ratings, and women teachers often receive lower ratings than men. However, most studies focus only on numeric scores. Therefore, the authors analyzed written comments on 4032 teacher assessments, representing 282 women and 448 men teachers in one Department of Medicine, to explore gender differences. NVivo was used to search for 61 evidence- and theory-based terms purported to reflect teaching excellence, which were analyzed using 2 × 2 chi-squared tests. The Linguistic Inquiry and Word Count (LIWC) program was used to categorize comment data, which were analyzed using linear regressions. The only significant difference in the NVivo analysis was that men were more likely than women to have the word “available” in a comment (OR 1.4, p < 0.05). A subset of LIWC variables showed significant gender differences, but all effects were modest. Men teachers had more positive emotion words written about them, while negative emotion words appeared equally often for men and women. Significant differences were more often associated with the gender of the residents who wrote the comments than with the gender of the teachers. For example, women residents used more social and gender-related words (β 1.87, p < 0.001) and fewer words related to power or achievement (β −3.78, p < 0.001) than men residents. Profound gender differences were not found in teacher assessment comments in this large, diverse academic department of medicine, a finding that differs from other studies. The authors explore possible reasons, including differences in departmental culture and issues related to the methods used.
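
The term search described in the abstract can be pictured with a small sketch. This is not NVivo and not the authors' procedure: it is a plain Python stand-in that counts how many teachers have a given term in at least one comment. The comments, the teacher IDs, the three-term list, and the function name `teachers_with_term` are all made up for illustration, and whole-word matching is an assumption (the source does not say how word variants were handled).

```python
import re
from collections import defaultdict

# Made-up comments keyed by a hypothetical teacher ID; not data from the study.
comments = [
    ("T1", "Always available and approachable on the ward."),
    ("T1", "Excellent teacher."),
    ("T2", "Unavailable most afternoons."),
    ("T3", "Very kind; made herself available for questions."),
]

# A tiny illustrative subset; the study searched for 61 terms.
terms = ["available", "approachable", "kind"]


def teachers_with_term(comments, terms):
    """Count teachers who have each term in at least one of their comments."""
    found = defaultdict(set)
    for teacher_id, text in comments:
        for term in terms:
            # \b keeps "available" from matching inside "unavailable".
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                found[term].add(teacher_id)
    return {term: len(found[term]) for term in terms}


counts = teachers_with_term(comments, terms)
print(counts)
```

Counting each teacher once per term, rather than counting every occurrence, matches the unit used in the appendix ("number of individuals with this code attached").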

Acknowledgements

The authors wish to thank Mr. Ed Lorens, Research Officer in the Department of Medicine, for compiling and anonymizing the data.

Funding

Dr. Ginsburg is supported as the Canada Research Chair for Health Professions Education.

Author information

Authors and Affiliations

Department of Medicine, Sinai Health System, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
Shiphra Ginsburg
Wilson Centre for Research in Education, University Health Network and University of Toronto, Toronto, Ontario, Canada
Shiphra Ginsburg, Lynfa Stroud & Kulamakan Kulasegaram
Canada Research Chair in Health Professions Education, Ottawa, Canada
Shiphra Ginsburg
Department of Medicine, Sunnybrook HSC and Temerty Faculty of Medicine, Toronto, Ontario, Canada
Lynfa Stroud
Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Meghan Lynch
Department of Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
Lindsay Melvin
Department of Family and Community Medicine, Temerty Faculty of Medicine, Toronto, Ontario, Canada
Kulamakan Kulasegaram
Temerty Chair in Learner Assessment and Program Evaluation, University of Toronto, Toronto, Ontario, Canada
Kulamakan Kulasegaram
Mount Sinai Hospital, 433-600, University Ave., Toronto, Ontario, M5G 1X5, Canada
Shiphra Ginsburg

Corresponding author

Correspondence to Shiphra Ginsburg.

Ethics declarations

Ethical approval

The Research Ethics Board at the University of Toronto gave approval for this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Number of individuals with each code attached, and percentage of coded individuals by teacher gender. Of the 730 teachers, 448 were men (61.36%) and 282 were women (38.60%). M = men teachers; W = women teachers; NS = not significant.

Code	M with code	M without code	W with code	W without code	Total with code (of 730)	% men	% women	% total	Test result
1: Available	148	300	72	210	220	67.27%	32.73%	100%	Chi-square = 4.6283, p = 0.031449; significant at p < 0.05
2: Unavailable	3	445	0	282	3	100%	0%	100%	NS
3: Approachable	141	307	94	188	235	60%	40%	100%
4: Not approachable, uncomfortable	3	445	3	279	6	50%	50%	100%
5: Comfortable, welcoming, safe	0	448	0	282	0	0.00%	0.00%	0%
6: Comfortable	32	416	30	252	62	51.61%	48.39%	100%
7: Welcoming	44	404	22	260	66	66.67%	33.33%	100%
8: Safe environment	26	422	21	261	47	55%	45%	100%
9: Support	224	224	159	123	383	58.49%	41.51%	100%
10: Explore limits	14	434	9	273	23	60.87%	39.13%	100%
11: Autonomy	57	391	38	244	95	60.00%	40.00%	100%
12: Micromanage, hands-on	7	441	8	274	15	47%	53%	100%
13: Independence	103	345	69	213	172	59.88%	40.12%	100%
14: Feedback	138	310	88	194	226	61%	39%	100%
15: Feedback-neg	15	433	10	272	25	60.00%	40.00%	100%
16: Personality	10	438	4	278	14	71.43%	28.57%	100%
17: Personality characteristics	0	448	0	282	0	0.00%	0.00%	0%
18: Friendly	56	392	29	253	85	65.88%	34.12%	100%
19: Intimidating	5	443	4	278	9	55.56%	44.44%	100%
20: Not intimidating	11	437	7	275	18	61%	39%	100%
21: Kind	86	362	60	222	146	59%	41%	100%
22: Caring	32	416	27	255	59	54.24%	45.76%	100%
23: Empathic	15	433	10	272	25	60.00%	40.00%	100%
24: Warm	9	439	11	271	20	45.00%	55.00%	100%
25: Cold	2	446	1	281	3	66.67%	33.33%	100%
26: Belittling, condescending etc	5	443	4	278	9	55.56%	44.44%	100%
27: Enthusiastic	48	400	37	245	85	56.47%	43.53%	100%
28: Eager	13	435	6	276	19	68.42%	31.58%	100%
29: Fun, exciting	69	379	32	250	101	68.32%	31.68%	100%
30: Sense of humour, funny	23	425	7	275	30	77%	23%	100%	Chi-square = 3.099, p = 0.079; not significant at p < 0.05
31: Person	58	390	28	254	86	67.44%	32.56%	100%
32: Human	11	437	3	279	14	79%	21%	100%
33: Respect	105	343	68	214	173	60.69%	39.31%	100%
34: Disrespect	0	448	1	281	1	0.00%	100.00%	100%
35: Learner	50	398	38	244	88	56.82%	43.18%	100%
36: Learning-top	0	448	0	282	0	0.00%	0.00%	0%
37: Learning	213	235	138	144	351	60.68%	39.32%	100%
38: Learned	83	365	47	235	130	63.85%	36.15%	100%
39: Teacher	324	124	198	84	522	62.07%	37.93%	100%
40: Teaching	20	428	11	271	31	64.52%	35.48%	100%
41: Educator	35	413	32	250	67	52.24%	47.76%	100%
42: Attending	49	399	20	262	69	71.01%	28.99%	100%	Chi-square = 2.99, df = 1, p = 0.084
43: Supervisor	75	373	41	241	116	64.66%	35.34%	100%
44: Doctor	25	423	8	274	33	76%	24%	100%	Chi-square = 3.018, df = 1, p = 0.082
45: Dr	53	395	26	256	79	67.09%	32.91%	100%
46: Physician	108	340	66	216	174	62.07%	37.93%	100%
47: Clinician	40	408	29	253	69	57.97%	42.03%	100%
48: Positive Adjectives	0	448	0	282	0	0.00%	0.00%	0%
49: Good	189	259	110	172	299	63.21%	36.79%	100%
50: Excellent	290	158	173	109	463	62.63%	37.37%	100%
51: Exemplary	57	391	27	255	84	67.86%	32.14%	100%
52: Outstanding	71	377	44	238	115	61.74%	38.26%	100%
53: Exceptional	109	339	60	222	169	64.50%	35.50%	100%
54: Role Model	159	289	113	169	272	58.46%	41.54%	100%
55: Evidence	47	401	24	258	71	66.20%	33.80%	100%
56: Bedside manner	40	408	23	259	63	63.49%	36.51%	100%
57: Time-pos	163	285	109	173	272	60%	40%	100%
58: Time-neg	28	420	20	262	48	58.33%	41.67%	100%
59: Efficient	82	366	39	243	121	67.77%	32.23%	100%
60: Inefficient	6	442	3	279	9	66.67%	33.33%	100%
61: Disorganized	3	445	3	279	6	50%	50%	100%
Total (unique)	437		276		713	61.29%	38.71%	100%
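
As a worked check on the one comparison the appendix reports as significant, the sketch below recomputes the 2 × 2 chi-squared test for the "Available" row from its four counts. It assumes a Pearson chi-square without continuity correction, which is what reproduces the reported statistic; the source does not state which software or options the authors used. The function name `chi_square_2x2` and the variable names are illustrative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts from row "1: Available": men with/without the code, women with/without.
m_with, m_without, w_with, w_without = 148, 300, 72, 210

chi2, p = chi_square_2x2(m_with, m_without, w_with, w_without)
odds_ratio = (m_with * w_without) / (m_without * w_with)
share_men = m_with / (m_with + w_with)

print(f"chi-square = {chi2:.4f}, p = {p:.6f}")
print(f"odds ratio = {odds_ratio:.2f}, men among coded teachers = {share_men:.2%}")
```

The statistic comes out at 4.6283 with a p-value of about 0.031, matching the appendix. The odds ratio of roughly 1.44 is consistent with the OR of 1.4 given in the abstract, and the 67.27% share matches the percentage column for this row.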

About this article

Cite this article

Ginsburg, S., Stroud, L., Lynch, M. et al. Beyond the ratings: gender effects in written comments from clinical teaching assessments. Adv in Health Sci Educ 27, 355–374 (2022). https://doi.org/10.1007/s10459-021-10088-1

Received: 25 June 2021
Accepted: 12 December 2021
Published: 28 January 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s10459-021-10088-1
