What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching

MacNell, Lillian; Driscoll, Adam; Hunt, Andrea N.

doi:10.1007/s10755-014-9313-4

Published: 05 December 2014

Volume 40, pages 291–303 (2015)

Innovative Higher Education

Abstract

Student ratings of teaching play a significant role in career outcomes for higher education instructors. Although instructor gender has been shown to play an important role in influencing student ratings, the extent and nature of that role remain contested. While it is difficult to separate gender from teaching practices in person, it is possible to disguise an instructor’s gender identity online. In our experiment, assistant instructors in an online class each operated under two different gender identities. Students rated the male identity significantly higher than the female identity, regardless of the instructor’s actual gender, demonstrating gender bias. Given the vital role that student ratings play in academic career trajectories, this finding warrants considerable attention.

Notes

To clarify the language we use throughout the paper, we refer to all three persons responsible for grading and directly interacting with students as “instructors.” The course “professor” was the person responsible for course design and content preparation, while the two “assistant instructors” worked under the professor’s direction to manage and teach their respective discussion groups.
A one-way ANOVA test confirmed that there was no significant variation among all six groups’ discussion board grades and overall grades for the course.
We acknowledge that the application of parametric analytical techniques (ANOVA, MANOVA, and t-tests) to ordinal data (the Likert scale responses) remains controversial among social scientists and statisticians. (See Knapp (1990) for a relatively balanced review of the debate.) We side with the arguments of Gaito (1980) and Armstrong (1981) and argue that it is appropriate to do so in our case as the concept being measured is interval, even if the data labels are not. This practice is common within higher education research (e.g., Centra & Gaubatz [2000]; Young, Rush, & Shaw [2009]; Basow [1995]; and Knol et al. [2013]).
While we acknowledge that a significance level of .05 is conventional in social science and higher education research, we side with Skipper, Guenther, and Nass (1967), Labovitz (1968), and Lai (1973) in pointing out the arbitrary nature of conventional significance levels. Considering our study design, we have used a significance level of .10 for some tests where: 1) the results support the hypothesis and we are consequently more willing to reject the null hypothesis of no difference; 2) our hypothesis is strongly supported theoretically and by empirical results in other studies that use lower significance levels; 3) our small n may be obscuring large differences; and 4) the gravity of an increased risk of Type I error is diminished in light of the benefit of decreasing the risk of a Type II error (Labovitz, 1968; Lai, 1973).
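As an illustration of the check described in the note on the one-way ANOVA, the following is a minimal sketch of how an F statistic can be computed across several discussion groups. The grade values and the function name `one_way_anova_f` are hypothetical and are not taken from the study. Only the F statistic and its degrees of freedom are computed here; a p-value would additionally require the F distribution, which a statistics package would normally supply.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Hypothetical helper for illustration; not the authors' code.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical course grades for six discussion groups (assumed values).
grades_by_group = [
    [82, 90, 77, 85],
    [88, 79, 84, 91],
    [75, 86, 83, 80],
    [90, 81, 78, 87],
    [84, 85, 79, 88],
    [80, 89, 82, 86],
]

f_stat, df_between, df_within = one_way_anova_f(grades_by_group)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

A small F value relative to the critical value for the chosen significance level would be consistent with no significant variation among the groups.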

References

Abrami, P. C., d’Apollonia, S., & Rosenfield, S. (2007). The dimensionality of student ratings of instruction: What we know and what we do not. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 385–445). Dordrecht, The Netherlands: Springer.
Acker, J. (1990). Hierarchies, jobs, bodies: A theory of gendered organizations. Gender & Society, 4, 81–95.
Andersen, K., & Miller, E. D. (1997). Gender and student evaluations of teaching. PS: Political Science and Politics, 30, 216–219.
Armstrong, G. D. (1981). Parametric statistics and ordinal data: A pervasive misconception. Nursing Research, 30, 60–62.
Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in college students' evaluations of faculty. Communication Education, 48, 193–210.
Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87, 656–665.
Basow, S. A., & Montgomery, S. (2005). Student ratings and professor self-rating of college teaching: Effects of gender and divisional affiliation. Journal of Personnel Evaluation in Education, 18, 91–106.
Basow, S. A., Phelan, J. E., & Capotosto, L. (2006). Gender patterns in college students' choices of their best and worst professors. Psychology of Women Quarterly, 30, 25–35.
Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79, 308–314.
Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74, 170–179.
Benton, S. L., & Cashin, W. E. (2014). Student ratings of instruction in college and university courses. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (pp. 279–326). Dordrecht, The Netherlands: Springer.
Burns-Glover, A. L., & Veith, D. J. (1995). Revisiting gender and teaching evaluations: Sex still makes a difference. Journal of Social Behavior and Personality, 10, 69–80.
Centra, J. A. (2007). Differences in responses to the student instructional report: Is it bias? Princeton, NJ: Educational Testing Service.
Centra, J. A., & Gaubatz, N. B. (2000). Is there gender bias in student evaluations of teaching? Journal of Higher Education, 71, 17–33.
Chamberlin, M. S., & Hickey, J. S. (2001). Student evaluations of faculty performance: The role of gender expectations in differential evaluations. Educational Research Quarterly, 25, 3–14.
Curtis, J. W. (2011). Persistent inequity: Gender and academic employment. Report from the American Association of University Professors. Retrieved from http://www.aaup.org/NR/rdonlyres/08E023AB-E6D8-4DBD-99A0-24E5EB73A760/0/persistent_inequity.pdf
Dalmia, S., Giedeman, D. C., Klein, H. A., & Levenburg, N. M. (2005). Women in academia: An analysis of their expectations, performance and pay. Forum on Public Policy, 1, 160–177.
Davis, B. G. (2009). Tools for teaching (2nd ed.). San Francisco, CA: Jossey-Bass.
Feldman, K. A. (1992). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 1. Research in Higher Education, 33, 317–375.
Feldman, K. A. (1993). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 2. Research in Higher Education, 34, 151–211.
Gaito, J. (1980). Measurement scales and statistics: Resurgence of an old misconception. Psychological Bulletin, 87, 564–567.
Garson, G. D. (2012). General linear models: Multivariate GLM & MANOVA/MANCOVA. Asheboro, NC: Statistical Associates.
Goldberg, P. (1968). Are women prejudiced against women? Trans-action, 5, 28–30.
Greenwald, A. G. (1997). Validity concerns and usefulness of student ratings of instruction. American Psychologist, 52, 1182–1186.
Hair, J. F., Jr., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate data analysis with readings (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hampton, S. E., & Reiser, R. A. (2004). Effects of a theory-based feedback and consultation process on instruction and learning in college classrooms. Research in Higher Education, 45, 497–527.
Johnson, V. E. (2003). Grade inflation: A crisis in college education. New York, NY: Springer.
Johnson, A. (2006). Power, privilege, and difference. Boston, MA: McGraw-Hill.
Knapp, T. R. (1990). Treating ordinal scales as interval scales: An attempt to resolve the controversy. Nursing Research, 39, 121–123.
Knol, M. H., Veld, R., Vorst, H. C. M., van Driel, J. H., & Mellenbergh, G. J. (2013). Experimental effects of student evaluations coupled with collaborative consultation on college professors’ instructional skills. Research in Higher Education, 54, 825–850.
Labovitz, S. (1968). Criteria for selecting a significance level: A note on the sacredness of .05. The American Sociologist, 3, 220–222.
Lai, M.K. (1973). The case against tests of statistical significance. Report from the Teacher Education Division Publication Series. Retrieved from http://files.eric.ed.gov/fulltext/ED093926.pdf
Liu, O. L. (2012). Student evaluation of instruction: In the new paradigm of distance education. Research in Higher Education, 53, 471–486.
Lorber, J. (1994). Paradoxes of gender. New Haven, CT: Yale University Press.
Marsh, H. W. (2001). Distinguishing between good (useful) and bad workloads on students’ evaluations of teaching. American Educational Research Journal, 38, 183–212.
Marsh, H. W. (2007). Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential biases and usefulness. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 319–383). Dordrecht, The Netherlands: Springer.
Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28, 283–298.
Monroe, K., Ozyurt, S., Wrigley, T., & Alexander, A. (2008). Gender equality in academia: Bad news from the trenches, and some possible solutions. Perspectives on Politics, 6, 215–233.
Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. Cambridge, MA: Cambridge University Press.
Morris, L. V. (2011). Women in higher education: Access, success, and the future. Innovative Higher Education, 36, 145–147.
Morrison, K., & Johnson, T. (2013). Editorial. Educational Research and Evaluation, 19, 579–584.
Murray, H. G. (2007). Low-inference teaching behaviors and college teaching effectiveness: Recent developments and controversies. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 145–183). Dordrecht, The Netherlands: Springer.
O’Sullivan, P. D., Hunt, S. K., & Lippert, L. R. (2004). Mediated immediacy: A language of affiliation in a technological age. Journal of Language and Social Psychology, 23, 464–490.
Paludi, M. A., & Strayer, L. A. (1985). What’s in an author’s name? Differential evaluations of performance as a function of author’s name. Sex Roles, 12, 353–361.
Perry, R. P., & Smart, J. C. (Eds.). (2007). The scholarship of teaching and learning in higher education: An evidence-based perspective. Dordrecht, The Netherlands: Springer.
Risman, B. J. (2004). Gender as a social structure: Theory wrestling with activism. Gender & Society, 18, 429–450.
Rowden, G. V., & Carlson, R. E. (1996). Gender issues and students' perceptions of instructors' immediacy and evaluation of teaching and course. Psychological Reports, 78, 835–839.
Sandler, B. R. (1991). Women faculty at work in the classroom, or, why it still hurts to be a woman in labor. Communication Education, 40, 6–15.
Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19, 174–197.
Simeone, A. (1987). Academic women: Working toward equality. South Hadley, MA: Bergin and Garvey.
Skipper, J. K., Guenther, A. C., & Nass, G. (1967). The sacredness of .05: A note concerning the uses of statistical levels of significance in social science. The American Sociologist, 1, 16–18.
Sprague, J., & Massoni, K. (2005). Student evaluations and gendered expectations: What we can't count can hurt us. Sex Roles, 53, 779–793.
Statham, A., Richardson, L., & Cook, J. A. (1991). Gender and university teaching: A negotiated difference. Albany, NY: State University of New York Press.
Subramanya, S. R. (2014). Toward a more effective and useful end-of-course evaluation scheme. Journal of Research in Innovative Teaching, 7, 143–157.
Svanum, S., & Aigner, C. (2011). The influences of course effort, mastery and performance goals, grade expectancies, and earned course grades on student ratings of course satisfaction. British Journal of Educational Psychology, 81, 667–679.
Svinicki, M., & McKeachie, W. J. (2010). McKeachie’s teaching tips: Strategies, research, and theory for college and university teachers (13th ed.). Belmont, CA: Wadsworth.
Theall, M., Abrami, P. C., & Mets, L. A. (Eds.). (2001). The student ratings debate: Are they valid? How can we best use them? San Francisco, CA: Jossey-Bass.
West, C., & Zimmerman, D. H. (1987). Doing gender. Gender & Society, 1, 125–151.
Young, S., Rush, L., & Shaw, D. (2009). Evaluating gender bias in ratings of university instructors' teaching effectiveness. International Journal of Scholarship of Teaching and Learning, 3, 1–14.

Author information

Authors and Affiliations

Department of Sociology and Anthropology, 334 1911 Building, Campus Box 8107, Raleigh, North Carolina 27695, USA
Lillian MacNell
University of Wisconsin-La Crosse, La Crosse, WI, USA
Adam Driscoll
University of North Alabama, Florence, AL, USA
Andrea N. Hunt

Corresponding author

Correspondence to Lillian MacNell.

About this article

Cite this article

MacNell, L., Driscoll, A. & Hunt, A.N. What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching. Innov High Educ 40, 291–303 (2015). https://doi.org/10.1007/s10755-014-9313-4

Published: 05 December 2014
Issue Date: August 2015
DOI: https://doi.org/10.1007/s10755-014-9313-4
