Evaluating Student Evaluations of Teaching: a Review of Measurement and Equity Bias in SETs and Recommendations for Ethical Reform

Abstract

Student evaluations of teaching (SETs) are ubiquitous in academe as a metric for assessing teaching and are frequently used in critical personnel decisions. Yet there is ample evidence documenting both measurement and equity bias in these assessments. SETs have low or no correlation with learning. Furthermore, scholars using different data and different methodologies routinely find that women faculty, faculty of color, and other marginalized groups are disadvantaged in SETs. Extant research on bias in teaching evaluations tends to review only the aspect of the literature most pertinent to that study. In this paper, we review a novel dataset of over 100 articles on bias in student evaluations of teaching and provide a nuanced review of this broad but established literature. We find that women and other marginalized groups do face significant biases in standard evaluations of teaching; however, the effect of gender is conditional upon other factors. We conclude with recommendations for the judicious use of SETs and avenues for future research.

Notes

  1. For now, the entirety of this discussion and related research is binary in its orientation. We recognize that gender is more complex than women and men and acknowledge that gender identity that does not overtly conform to the binary likely complicates evaluations of teaching further than the existing body of knowledge has even identified

  2. A full list of articles and article summaries is available at < redacted >

  3. Though see Basow and Montgomery (2005), which finds no significant interactions between student and faculty gender

  4. Research also finds that the role of attractiveness is more relevant to women, who are more likely to get comments about their appearance (Mitchell & Martin, 2018; Key & Ardoin, 2019). This is problematic given that attractiveness has been shown to be correlated with evaluations of instructional quality (Rosen, 2018)

References

  • Abel, M. H., & Meltzer, A. L. (2007). Student ratings of a male and female professors’ lecture on sex discrimination in the workforce. Sex Roles, 57(3–4), 173–180

  • Abrami, P. C. (2001). Improving judgments about teaching effectiveness using teacher rating forms. New Directions for Institutional Research, 2001(109), 59–87

  • Adams, M. J. D., & Umbach, P. D. (2012). Nonresponse and online student evaluations of teaching: understanding the influence of salience, fatigue, and academic environments. Research in Higher Education, 53(5), 576–591

  • Anderson, K. J. (2010). Students’ stereotypes of professors: An exploration of the double violations of ethnicity and gender. Social Psychology of Education, 13(4), 459–472

  • Anderson, K. J., & Kanner, M. (2011). Inventing a Gay Agenda: Students’ Perceptions of Lesbian and Gay Professors. Journal of Applied Social Psychology, 41(6), 1538–1564

  • Anderson, K. J., & Smith, G. (2005). Students’ preconceptions of professors: Benefits and barriers according to ethnicity and gender. Hispanic Journal of Behavioral Sciences, 27(2), 184–201

  • Aguirre Jr, A. (2000). Women and Minority Faculty in the Academic Workplace: Recruitment, Retention, and Academic Culture. ASHE-ERIC Higher Education Report, 27(6). Jossey-Bass

  • APSA. (2011). Political science in the 21st century: Report of the task force on political science in the 21st century

  • Arbuckle, J., & Williams, B. D. (2003). Students’ perceptions of expressiveness: Age and gender effects on teacher evaluations. Sex Roles, 49(9–10), 507–516

  • Arreola, R. A. (2004). Developing a comprehensive faculty evaluation system. Magna Publications

  • Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in college students’ evaluations of faculty. Communication Education, 48(3), 193–210

  • Baker, P., & Copp, M. (1997). Gender matters most: The interaction of gendered expectations, feminist course content, and pregnancy in student course evaluations. Teaching Sociology, 29–43

  • Barbezat, D. A., & Hughes, J. W. (2005). Salary structure effects and the gender pay gap in academia. Research in Higher Education, 46(6), 621–640.

  • Bos, A. L., Sweet-Cushman, J., & Schneider, M. C. (2019). Family-friendly academic conferences: a missing link to fix the “leaky pipeline”? Politics, Groups, and Identities, 7(3), 748–758.

  • Basow, S. A., & Distenfeld, M. S. (1985). Teacher expressiveness: More important for male teachers than female teachers? Journal of Educational Psychology, 77(1), 45

  • Basow, S. A., & Howe, K. G. (1987). Evaluations of college professors: Effects of professors’ sex-type, and sex, and students’ sex. Psychological Reports, 60(2), 671–678

  • Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87(4), 656

  • Basow, S. A. (2000). Best and worst professors: Gender patterns in students’ choices. Sex Roles, 43(5–6), 407–417

  • Basow, S. A., & Montgomery, S. (2005). Student ratings and professor self-ratings of college teaching: Effects of gender and divisional affiliation. Journal of Personnel Evaluation in Education, 18(2), 91–106

  • Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79(3), 308

  • Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74(2), 170

  • Benton, S. L., & Cashin, W. E. (2012). Student ratings of teaching: a summary of research and literature (IDEA Paper no. 50). Manhattan, KS: The IDEA Center

  • Bian, L., Leslie, S.-J., & Cimpian, A. (2017). Gender stereotypes about intellectual ability emerge early and influence children’s interests. Science, 355(6323), 389–391

  • Boring, A. (2017). Gender biases in student evaluations of teaching. Journal of Public Economics, 145, 27–41

  • Boring, A., Ottoboni, K., & Stark, P. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research

  • Bray, J. H., & Howard, G. S. (1980). Interaction of teacher and student sex and sex role orientations and student evaluations of college instruction. Contemporary Educational Psychology, 5(3), 241–248

  • Burns-Glover, A. L., & Veith, D. J. (1995). Revisiting gender and teaching evaluations: Sex still makes a difference. Journal of Social Behavior and Personality, 10(4), 69

  • Centra, J. A. (2000). Evaluating the Teaching Portfolio: A Role for Colleagues. New Directions for Teaching and Learning, 83, 87–93

  • Centra, J. A., & Gaubatz, N. B. (1998). Is there gender bias in student ratings of instruction. Journal of Higher Education, 70, 17–33

  • Chamberlin, M. S., & Hickey, J. S. (2001). Student evaluations of faculty performance: The role of gender expectations in differential evaluations. Educational Research Quarterly, 25(2), 3

  • Chapman, D. D., & Joines, J. A. (2017). Strategies for Increasing Response Rates for Online End-of-Course Evaluations. International Journal of Teaching and Learning in Higher Education, 29(1), 47–60

  • Chávez, K., & Mitchell, K. M. (2020). Exploring bias in student evaluations: Gender, race, and ethnicity. PS: Political Science & Politics, 53(2), 270–274

  • Chism, N. V. N. (2007). Peer Review of Teaching. A Sourcebook. Bolton Massachusetts: Anker

  • Eagly, A. H., & Karau, S. J. (2002). Role congruity theory of prejudice toward female leaders. Psychological Review, 109(3), 573

  • El-Alayli, A., Hansen-Brown, A. A., & Ceynar, M. (2018). Dancing backwards in high heels: Female professors experience more work demands and special favor requests, particularly from academically entitled students. Sex Roles, 79(3–4), 136–150

  • Elmore, P. B., & LaPointe, K. A. (1974). Effects of teacher sex and student sex on the evaluation of college instructors. Journal of Educational Psychology, 66(3), 386.

  • Elmore, P. B., & LaPointe, K. A. (1975). Effect of teacher sex, student sex, and teacher warmth on the evaluation of college instructors. Journal of Educational Psychology, 67(3), 368

  • Esarey, J., & Valdes, N. (2020). Unbiased, reliable, and valid student evaluations can still be unfair. Assessment & Evaluation in Higher Education

  • Ewing, V. L., Stukas Jr, A. A., & Sheehan, E. P. (2003). Student prejudice against gay male and lesbian lecturers. The Journal of Social Psychology, 143(5), 569–579

  • Fan, Y., Shepherd, L. J., Slavich, E., Waters, D., Stone, M., Abel, R., & Johnston, E. L. (2019). Gender and cultural bias in student evaluations: Why representation matters. PLoS One, 14(2), e0209749

  • Feldman, K. A. (1992). College students’ views of male and female college teachers: Part I—Evidence from the social laboratory and experiments. Research in Higher Education, 33(3), 317–375

  • Fischer, E., & Hänze, M. (2019). Bias hypotheses under scrutiny: investigating the validity of student assessment of university teaching by means of external observer ratings. Assessment & Evaluation in Higher Education, 44(5), 772–786

  • Franklin, J. (2001). Interpreting the numbers: Using a narrative to help others read student evaluations of your teaching accurately. New Directions for Teaching and Learning, 87, 85–100

  • Franklin, J., & Theall, M. (1995). The relationship of disciplinary differences and the value of class preparation time to student ratings of teaching. New Directions for Teaching and Learning, 1995(64), 41–48

  • Freeman, H. R. (1994). Student evaluations of college instructors: Effects of type of course taught, instructor gender and gender role, and student gender. Journal of Educational Psychology, 86(4), 627

  • Greenwald, A. G., & Gillmore, G. M. (1997). No pain, no gain? The importance of measuring course workload in student ratings of instruction. Journal of Educational Psychology, 89(4), 743

  • Hamermesh, D. S., & Parker, A. (2005). Beauty in the classroom: Instructors’ pulchritude and putative pedagogical productivity. Economics of Education Review, 24(4), 369–376

  • Harris, M. B. (1975). Sex role stereotypes and teacher evaluations. Journal of Educational Psychology, 67(6), 751

  • Ḥaṭiva, N. (2013a). Student ratings of instruction: a practical approach to designing, operating, and reporting. Oron Publications

  • Ḥaṭiva, N. (2013b). Student ratings of instruction: Recognizing effective teaching. Oron Publications

  • Hessler, M., Pöpping, D. M., Hollstein, H., Ohlenburg, H., Arnemann, P. H., Massoth, C., et al. (2018). Availability of cookies during an academic course session affects evaluation of teaching. Medical Education, 52(10), 1064–1072

  • Himelein, M. J. (2018). Pitfalls of using student comments in the evaluation of faculty. Academic Briefing: Expert Advice for Higher Ed Leaders. https://www.academicbriefing.com/human-resources/faculty-evaluation/pitfalls-of-using-student-comments-evaluation-of-faculty/

  • Kaschak, E. (1978). Sex bias in student evaluations of college professors. Psychology of Women Quarterly, 2(3), 235–243

  • Kaschak, E. (1981). Another look at sex bias in students’ evaluations of professors: Do winners get the recognition that they have been given? Psychology of Women Quarterly, 5(5_suppl), 767–772

  • Key, E., & Ardoin, P. (2019). Students rate male instructors more highly than female instructors. We tried to counter that hidden bias. Washington Post. Accessed 3 Sep 2019. https://www.washingtonpost.com/politics/2019/08/20/students-rate-male-instructors-more-highly-than-female-instructors-we-tried-counter-that-hidden-bias/

  • Kierstead, D., D’agostino, P., & Dill, H. (1988). Sex role stereotyping of college professors: Bias in students’ ratings of instructors. Journal of Educational Psychology, 80(3), 342

  • Leslie, S.-J., Cimpian, A., Meyer, M., & Freeland, E. (2015). Expectations of brilliance underlie gender distributions across academic disciplines. Science, 347(6219), 262–265

  • Lindahl, M. W., & Unger, M. L. (2010). Cruelty in student teaching evaluations. College Teaching, 58(3), 71–76

  • Linse, A. R. (2017). Interpreting and using student ratings data: Guidance for faculty serving as administrators and on evaluation committees. Studies in Educational Evaluation, 54, 94–106

  • MacNell, L., Driscoll, A., & Hunt, A. N. (2015). What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4), 291–303

  • Marsh, H. W. (1980). Research on students’ evaluations of teaching effectiveness. Instructional Evaluation, 4(5), 5–13

  • Marsh, H. W. (1982a). Factors affecting students’ evaluations of the same course taught by the same instructor on different occasions. American Educational Research Journal, 19(4), 485–497

  • Marsh, H. W. (1982b). Validity of students’ evaluations of college teaching: A multitrait–multimethod analysis. Journal of Educational Psychology, 74(2), 264

  • Marsh, H. W. (1984). Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential baises, and utility. Journal of Educational Psychology, 76(5), 707

  • Martin, E. (1984). Power and authority in the classroom: Sexist stereotypes in teaching evaluations. Signs: Journal of Women in Culture and Society, 9(3), 482–492

  • McPherson, M. A., Todd Jewell, R., & Kim, M. (2009). What determines student evaluation scores? A random effects analysis of undergraduate economics classes. Eastern Economic Journal, 35(1), 37–51

  • Mengel, F., Sauermann, J., & Zölitz, U. (2018). Gender bias in teaching evaluations. Journal of the European Economic Association, 17(2), 535–566

  • Miles, P., & House, D. (2015). The Tail Wagging the Dog; An Overdue Examination of Student Teaching Evaluations. International Journal of Higher Education, 4(2), 116–126

  • Miller, J., & Seldin, P. (2014). Changing Practices in Faculty Evaluations: Can Better Evaluation Make a Difference? Academe, 100(3), 35–38

  • Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28(4), 283

  • Mitchell, K. M. W., & Martin, J. (2018). Gender bias in student evaluations. PS: Political Science & Politics, 51(3), 648–652

  • Murray, H. G. (1984). The impact of formative and summative evaluation of teaching in North American universities. Assessment and Evaluation in Higher Education, 9(2), 117–132

  • Murray, H. G. (1997). Does evaluation of teaching lead to improvement of teaching? The International Journal for Academic Development, 2(1), 8–23.

  • Perna, L. W. (2005). The benefits of higher education: Sex, racial/ethnic, and socioeconomic group differences. The Review of Higher Education, 29(1), 23–52.

  • Peterson, D. A. M., Biederman, L. A., Andersen, D., Ditonto, T. M., & Roe, K. (2019). Mitigating gender bias in student evaluations of teaching. PLoS One, 14(5), e0216241

  • Piatak, J., & Mohr, Z. (2019). More gender bias in academia? Examining the influence of gender and formalization on student worker rule following. Journal of Behavioral Public Administration, 2(2)

  • Reid, L. D. (2010). The role of perceived race and gender in the evaluation of college teaching on RateMyProfessors.com. Journal of Diversity in Higher Education, 3(3), 137

  • Ridgeway, C. L. (2011). Framed by gender: How gender inequality persists in the modern world. Oxford University Press

  • Rivera, L. A., & Tilcsik, A. (2019). Scaling Down Inequality: Rating Scales, Gender Bias, and the Architecture of Evaluation. American Sociological Review, 84(2), 248–274

  • Rosen, A. S. (2018). Correlations, trends and potential biases among publicly accessible web-based student evaluations of teaching: a large-scale study of RateMyProfessors.com data. Assessment & Evaluation in Higher Education, 43(1), 31–44

  • Rowden, G. V., & Carlson, R. E. (1996). Gender issues and students’ perceptions of instructors’ immediacy and evaluation of teaching and course. Psychological Reports, 78(3), 835–839

  • Seldin, P., Miller, J. E., & Seldin, C. A. (2010). The teaching portfolio: A practical guide to improved performance and promotion/tenure decisions. John Wiley & Sons

  • Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19(2), 174–197

  • Sinclair, L., & Kunda, Z. (2000). Motivated stereotyping of women: She’s fine if she praised me but incompetent if she criticized me. Personality and Social Psychology Bulletin, 26(11), 1329–1342.

  • Smith, B. P., & Hawkins, B. (2011). Examining student evaluations of black college faculty: does race matter? Journal of Negro Education, 80(2)

  • Spooren, P., Brockx, B., & Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83(4), 598–642

  • Sprague, J., & Massoni, K. (2005). Student evaluations and gendered expectations: What we can’t count can hurt us. Sex Roles, 53(11–12), 779–793

  • Stark, P., & Freishtat, R. (2014). An evaluation of course evaluations. ScienceOpen. Center for Teaching and Learning, University of California, Berkeley. Retrieved from https://www.scienceopen.com/document

  • Storage, D., Horne, Z., Cimpian, A., & Leslie, S.-J. (2016). The frequency of “brilliant” and “genius” in teaching evaluations predicts the representation of women and African Americans across fields. PLoS One, 11(3), e0150194

  • Subtirelu, N. C. (2015). “She does have an accent but…”: Race and language ideology in students’ evaluations of mathematics instructors on RateMyProfessors.com. Language in Society, 44(1), 35–62

  • Theall, M., & Franklin, J. (2001). Looking for bias in all the wrong places: A search for truth or a witch hunt in student ratings of instruction? New Directions for Institutional Research, 2001(109), 45–56

  • Uttl, B., White, C. A., & Gonzalez, D. W. (2017). Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation, 54, 22–42

  • Uttl, B., White, C. A., & Morin, A. (2013). The numbers tell it all: students don’t like numbers! PLoS One, 8(12), e83443

  • Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment & Evaluation in Higher Education, 23(2), 191–212

  • Wagner, N., Rieger, M., & Voorvelt, K. (2016). Gender, ethnicity and teaching evaluations: Evidence from mixed teaching teams. Economics of Education Review, 54, 79–94

  • Wallace, S. L., Lewis, A. K., & Allen, M. D. (2019). The State of the Literature on Student Evaluations of Teaching and an Exploratory Analysis of Written Comments: Who Benefits Most? College Teaching, 67(1), 1–14

  • Wallisch, P., & Cachia, J. (2019). Determinants of perceived teaching quality: the role of divergent interpretations of expectations

  • Wigington, H., Tollefson, N., & Rodriguez, E. (1989). Students’ ratings of instructors revisited: Interactions among class and instructor variables. Research in Higher Education, 30(3), 331–344

  • Whitworth, J. E., Price, B. A., & Randall, C. H. (2002). Factors that affect college of business student opinion of teaching and learning. Journal of Education for Business, 77(5), 282–289

  • Wright, S. L., & Jenkins-Guarnieri, M. A. (2012). Student evaluations of teaching: combining the meta-analyses and demonstrating further evidence for effective use. Assessment & Evaluation in Higher Education, 37(6), 683–699

  • Youmans, R. J., & Jee, B. D. (2007). Fudging the numbers: Distributing chocolate influences student evaluations of an undergraduate course. Teaching of Psychology, 34(4), 245–247

  • Young, S., Rush, L., & Shaw, D. (2009). Evaluating Gender Bias in Ratings of University Instructors’ Teaching Effectiveness. International Journal for the Scholarship of Teaching and Learning, 3(2), n2

Author information

Corresponding author

Correspondence to Jennie Sweet-Cushman.

Ethics declarations

Conflict of Interest

The authors hereby acknowledge no financial or non-financial conflict of interest.

About this article

Cite this article

Kreitzer, R.J., Sweet-Cushman, J. Evaluating Student Evaluations of Teaching: a Review of Measurement and Equity Bias in SETs and Recommendations for Ethical Reform. J Acad Ethics 20, 73–84 (2022). https://doi.org/10.1007/s10805-021-09400-w

  • DOI: https://doi.org/10.1007/s10805-021-09400-w

Keywords

  • Teaching evaluations
  • Gender stereotypes
  • Gender bias
  • Gender