What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching

Abstract

Student ratings of teaching play a significant role in career outcomes for higher education instructors. Although instructor gender has been shown to play an important role in influencing student ratings, the extent and nature of that role remain contested. While it is difficult to separate gender from teaching practices in person, it is possible to disguise an instructor’s gender identity online. In our experiment, assistant instructors in an online class each operated under two different gender identities. Students rated the male identity significantly higher than the female identity, regardless of the instructor’s actual gender, demonstrating gender bias. Given the vital role that student ratings play in academic career trajectories, this finding warrants considerable attention.

Notes

  1. To clarify the language we use throughout the paper, we refer to all three persons responsible for grading and directly interacting with students as “instructors.” The course “professor” was the person responsible for course design and content preparation, while the two “assistant instructors” worked under the professor’s direction to manage and teach their respective discussion groups.

  2. A one-way ANOVA test confirmed that there was no significant variation among all six groups’ discussion board grades and overall grades for the course.

  3. We acknowledge that the application of parametric analytical techniques (ANOVA, MANOVA, and t-tests) to ordinal data (the Likert scale responses) remains controversial among social scientists and statisticians. (See Knapp (1990) for a relatively balanced review of the debate.) We side with the arguments of Gaito (1980) and Armstrong (1981) and argue that it is appropriate to do so in our case as the concept being measured is interval, even if the data labels are not. This practice is common within higher education research (e.g., Centra & Gaubatz [2000]; Young, Rush, & Shaw [2009]; Basow [1995]; and Knol et al. [2013]).

  4. While we acknowledge that a significance level of .05 is conventional in social science and higher education research, we side with Skipper, Guenther, and Nass (1967), Labovitz (1968), and Lai (1973) in pointing out the arbitrary nature of conventional significance levels. Considering our study design, we have used a significance level of .10 for some tests where: 1) the results support the hypothesis and we are consequently more willing to reject the null hypothesis of no difference; 2) our hypothesis is strongly supported theoretically and by empirical results in other studies that use lower significance levels; 3) our small n may be obscuring large differences; and 4) the gravity of an increased risk of Type I error is diminished in light of the benefit of decreasing the risk of a Type II error (Labovitz, 1968; Lai, 1973).
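
As a rough illustration of the check described in Note 2, the sketch below computes a one-way ANOVA F statistic for six groups using only the Python standard library. The grades are invented for illustration and are not the study's data, and the function name `one_way_anova_f` is our own. The sketch stops at the F statistic and its degrees of freedom; obtaining a p-value would require an F-distribution table or a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical course grades for six discussion groups (invented values).
hypothetical_grades = [
    [88.0, 92.5, 79.0, 85.5],
    [90.0, 84.5, 87.0, 81.0],
    [86.5, 89.0, 83.0, 91.0],
    [82.0, 88.5, 90.5, 84.0],
    [87.5, 85.0, 89.5, 80.5],
    [91.5, 83.5, 86.0, 88.0],
]

f_stat, df_b, df_w = one_way_anova_f(hypothetical_grades)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

An F statistic close to or below 1 indicates that the group means differ no more than the spread of scores within groups would suggest, which is the pattern Note 2 reports for grades across the six groups.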

References

  1. Abrami, P. C., d’Apollonia, S., & Rosenfield, S. (2007). The dimensionality of student ratings of instruction: What we know and what we do not. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 385–445). Dordrecht, The Netherlands: Springer.

  2. Acker, J. (1990). Hierarchies, jobs, bodies: A theory of gendered organizations. Gender & Society, 4, 81–95.

  3. Andersen, K., & Miller, E. D. (1997). Gender and student evaluations of teaching. PS: Political Science and Politics, 30, 216–219.

  4. Armstrong, G. D. (1981). Parametric statistics and ordinal data: A pervasive misconception. Nursing Research, 30, 60–62.

  5. Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in college students' evaluations of faculty. Communication Education, 48, 193–210.

  6. Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87, 656–665.

  7. Basow, S. A., & Montgomery, S. (2005). Student ratings and professor self-rating of college teaching: Effects of gender and divisional affiliation. Journal of Personnel Evaluation in Education, 18, 91–106.

  8. Basow, S. A., Phelan, J. E., & Capotosto, L. (2006). Gender patterns in college students' choices of their best and worst professors. Psychology of Women Quarterly, 30, 25–35.

  9. Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79, 308–314.

  10. Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74, 170–179.

  11. Benton, S. L., & Cashin, W. E. (2014). Student ratings of instruction in college and university courses. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (pp. 279–326). Dordrecht, The Netherlands: Springer.

  12. Burns-Glover, A. L., & Veith, D. J. (1995). Revisiting gender and teaching evaluations: Sex still makes a difference. Journal of Social Behavior and Personality, 10, 69–80.

  13. Centra, J. A. (2007). Differences in responses to the student instructional report: Is it bias? Princeton, NJ: Educational Testing Service.

  14. Centra, J. A., & Gaubatz, N. B. (2000). Is there gender bias in student evaluations of teaching? Journal of Higher Education, 71, 17–33.

  15. Chamberlin, M. S., & Hickey, J. S. (2001). Student evaluations of faculty performance: The role of gender expectations in differential evaluations. Educational Research Quarterly, 25, 3–14.

  16. Curtis, J. W. (2011). Persistent inequity: Gender and academic employment. Report from the American Association of University Professors. Retrieved from http://www.aaup.org/NR/rdonlyres/08E023AB-E6D8-4DBD-99A0-24E5EB73A760/0/persistent_inequity.pdf

  17. Dalmia, S., Giedeman, D. C., Klein, H. A., & Levenburg, N. M. (2005). Women in academia: An analysis of their expectations, performance and pay. Forum on Public Policy, 1, 160–177.

  18. Davis, B. G. (2009). Tools for teaching (2nd ed.). San Francisco, CA: Jossey-Bass.

  19. Feldman, K. A. (1992). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 1. Research in Higher Education, 33, 317–375.

  20. Feldman, K. A. (1993). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 2. Research in Higher Education, 34, 151–211.

  21. Gaito, J. (1980). Measurement scales and statistics: Resurgence of an old misconception. Psychological Bulletin, 87, 564–567.

  22. Garson, G. D. (2012). General linear models: Multivariate GLM & MANOVA/MANCOVA. Asheboro, NC: Statistical Associates.

  23. Goldberg, P. (1968). Are women prejudiced against women? Trans-action, 5, 28–30.

  24. Greenwald, A. G. (1997). Validity concerns and usefulness of student ratings of instruction. American Psychologist, 52, 1182–1186.

  25. Hair, J. F., Jr., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate data analysis with readings (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.

  26. Hampton, S. E., & Reiser, R. A. (2004). Effects of a theory-based feedback and consultation process on instruction and learning in college classrooms. Research in Higher Education, 45, 497–527.

  27. Johnson, V. E. (2003). Grade inflation: A crisis in college education. New York, NY: Springer.

  28. Johnson, A. (2006). Power, privilege, and difference. Boston, MA: McGraw-Hill.

  29. Knapp, T. R. (1990). Treating ordinal scales as interval scales: An attempt to resolve the controversy. Nursing Research, 39, 121–123.

  30. Knol, M. H., Veld, R., Vorst, H. C. M., van Driel, J. H., & Mellenbergh, G. J. (2013). Experimental effects of student evaluations coupled with collaborative consultation on college professors’ instructional skills. Research in Higher Education, 54, 825–850.

  31. Labovitz, S. (1968). Criteria for selecting a significance level: A note on the sacredness of .05. The American Sociologist, 3, 220–222.

  32. Lai, M. K. (1973). The case against tests of statistical significance. Report from the Teacher Education Division Publication Series. Retrieved from http://files.eric.ed.gov/fulltext/ED093926.pdf

  33. Liu, O. L. (2012). Student evaluation of instruction: In the new paradigm of distance education. Research in Higher Education, 53, 471–486.

  34. Lorber, J. (1994). Paradoxes of gender. New Haven, CT: Yale University Press.

  35. Marsh, H. W. (2001). Distinguishing between good (useful) and bad workloads on students’ evaluations of teaching. American Educational Research Journal, 38, 183–212.

  36. Marsh, H. W. (2007). Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential biases and usefulness. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 319–383). Dordrecht, The Netherlands: Springer.

  37. Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28, 283–298.

  38. Monroe, K., Ozyurt, S., Wrigley, T., & Alexander, A. (2008). Gender equality in academia: Bad news from the trenches, and some possible solutions. Perspectives on Politics, 6, 215–233.

  39. Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. Cambridge, MA: Cambridge University Press.

  40. Morris, L. V. (2011). Women in higher education: Access, success, and the future. Innovative Higher Education, 36, 145–147.

  41. Morrison, K., & Johnson, T. (2013). Editorial. Educational Research and Evaluation, 19, 579–584.

  42. Murray, H. G. (2007). Low-inference teaching behaviors and college teaching effectiveness: Recent developments and controversies. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 145–183). Dordrecht, The Netherlands: Springer.

  43. O’Sullivan, P. D., Hunt, S. K., & Lippert, L. R. (2004). Mediated immediacy: A language of affiliation in a technological age. Journal of Language and Social Psychology, 23, 464–490.

  44. Paludi, M. A., & Strayer, L. A. (1985). What’s in an author’s name? Differential evaluations of performance as a function of author’s name. Sex Roles, 12, 353–361.

  45. Perry, R. P., & Smart, J. C. (Eds.). (2007). The scholarship of teaching and learning in higher education: An evidence-based perspective. Dordrecht, The Netherlands: Springer.

  46. Risman, B. J. (2004). Gender as a social structure: Theory wrestling with activism. Gender & Society, 18, 429–450.

  47. Rowden, G. V., & Carlson, R. E. (1996). Gender issues and students' perceptions of instructors' immediacy and evaluation of teaching and course. Psychological Reports, 78, 835–839.

  48. Sandler, B. R. (1991). Women faculty at work in the classroom, or, why it still hurts to be a woman in labor. Communication Education, 40, 6–15.

  49. Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19, 174–197.

  50. Simeone, A. (1987). Academic women: Working toward equality. South Hadley, MA: Bergin and Garvey.

  51. Skipper, J. K., Guenther, A. C., & Nass, G. (1967). The sacredness of .05: A note concerning the uses of statistical levels of significance in social science. The American Sociologist, 1, 16–18.

  52. Sprague, J., & Massoni, K. (2005). Student evaluations and gendered expectations: What we can't count can hurt us. Sex Roles, 53, 779–793.

  53. Statham, A., Richardson, L., & Cook, J. A. (1991). Gender and university teaching: A negotiated difference. Albany, NY: State University of New York Press.

  54. Subramanya, S. R. (2014). Toward a more effective and useful end-of-course evaluation scheme. Journal of Research in Innovative Teaching, 7, 143–157.

  55. Svanum, S., & Aigner, C. (2011). The influences of course effort, mastery and performance goals, grade expectancies, and earned course grades on student ratings of course satisfaction. British Journal of Educational Psychology, 81, 667–679.

  56. Svinicki, M., & McKeachie, W. J. (2010). McKeachie’s teaching tips: Strategies, research, and theory for college and university teachers (13th ed.). Belmont, CA: Wadsworth.

  57. Theall, M., Abrami, P. C., & Mets, L. A. (Eds.). (2001). The student ratings debate: Are they valid? How can we best use them? San Francisco, CA: Jossey-Bass.

  58. West, C., & Zimmerman, D. H. (1987). Doing gender. Gender & Society, 1, 125–151.

  59. Young, S., Rush, L., & Shaw, D. (2009). Evaluating gender bias in ratings of university instructors' teaching effectiveness. International Journal of Scholarship of Teaching and Learning, 3, 1–14.

Author information

Corresponding author

Correspondence to Lillian MacNell.

About this article

Cite this article

MacNell, L., Driscoll, A. & Hunt, A.N. What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching. Innov High Educ 40, 291–303 (2015). https://doi.org/10.1007/s10755-014-9313-4

Keywords

  • gender inequality
  • gender bias
  • student ratings of teaching
  • student evaluations of instruction