Introduction

Despite national reform efforts dedicated to increasing diversity and inclusion in science, technology, engineering, and mathematics (STEM), these fields continue to witness a disproportionate loss of marginalized students (Seymour & Hunter, 2019). STEM education researchers have repeatedly identified unequal outcomes in traditional academic achievement measures across student groups by gender, race/ethnicity, and socioeconomic measures (Dika & D’Amico, 2016; Eddy & Brownell, 2016; Koester et al., 2016; Malespina & Singh, 2023; Matz et al., 2017; Mead et al., 2020; Whitcomb et al., 2021; Xie et al., 2015). Historically, STEM learning environments have been spaces in which systemic inequities (e.g., racism, sexism, and classism) create advantages, for example, for those who are white, wealthy, male, and continuing-generation, leading to a lack of diverse representation in STEM fields across degree programs, levels of education, and careers (Gin et al., 2022; McGee, 2020; National Center for Science & Engineering Statistics, 2021; Reinholz & Ridgway, 2021). However, most of this research has been concentrated at a sole institution or, if examined across institutions, is missing a coordinated methodology grounded in historical theoretical perspectives which position inequalities as a byproduct of university structures that perpetuate and exacerbate inequities rather than student demographics. Research on systemic, structural advantages across undergraduate STEM learning environments can uncover how these advantages persist across intersecting identities and positions. The goal of this study, then, is to explore how systemic advantages in STEM manifest across multiple large public research universities in the United States (U.S.) and highlight the need for institutions to focus on centering the experiences of students with historically marginalized identities as a step toward more equitable and inclusive learning environments for all students.

Identifying systemic inequities benefits from a comparison of student outcomes across multiple institutions using the same conceptual frameworks, methodologies, and definitions (e.g., Matz et al., 2017) as these commonly vary across studies. We argue that for a comparison to be constructive and support transformative change in STEM education, analyses must be grounded in an asset-based model using critical frameworks to examine the complex relationships between institutional practices and student identities, backgrounds, and outcomes. We use such an approach here to explore whether data from introductory STEM courses at multiple research universities support the claim that demographic-based grade differences are driven by systemic inequities. Prior to reviewing literature about systemic inequities within foundational STEM courses and their positioning as propagators of inequity, we situate what we mean by equity and how we apply critical frameworks to begin examining this variation. We then present our approach to exploring a dataset of more than 200,000 students across 60 STEM courses at 6 universities over 10 years.

Background

Conceptualization of equity in education

Despite our different positions, privileges, histories, and commitments as individual authors, we share a common goal of dismantling oppressive systems and pursuing equity within higher education. Before exploring related literature and theory, it is important that we define our conception of equity, specifically with respect to educational settings (Levinson et al., 2022; Russo-Tait, 2023; Wolbring & Nguyen, 2023). Drawing from Gutiérrez’s (2012) sociopolitical framework and Black feminist theory (Collins, 2000; Crenshaw, 1991; Davis, 1983; Hooks, 1981; Lorde, 1984), we view equity as both acknowledging how historical events have created power imbalances within higher education and working to dismantle these disparities so that all students are empowered.

It follows that we do not view equity in education as a static goal (e.g., parity across grade distributions) as this approach does not guarantee equity. Rather, we position course grade disparities as a product of underlying inequitable systems and as a proxy for inequities. Two students who receive the same grade have not necessarily had equitable experiences, as fairness does not mean sameness (Gutiérrez, 2012). Indeed, the outcome of similar course grades does not necessarily redress the systems that advantage some students over others. Further, the focus on student empowerment within this conceptualization of equity is a critical dimension not addressed with the outcome of similar course grades. Students who are marginalized and minoritized often have their voices silenced, identities suppressed, and knowledge devalued. Therefore, considering identity as a precursor to power (Gutiérrez, 2012), equity focuses on addressing these power dynamics in a way that empowers students to authentically be themselves rather than conforming to inequitable and oppressive structures.

With this framing in mind, the goal is not simply to observe inequities in grades; rather, it is to use this information to dismantle disparities and empower all individuals, given that part of equity is grappling with how structural and political systems cause harm and produce academic disparities. In an academic context, equity could look like acknowledging how marginalized groups have had their power diminished within the classroom and academia as a whole as well as working to dismantle oppressive systems, providing resources and structures to enable empowerment. Simply put, awareness that emerges from observing inequities in grades is not the end goal; adjustment and response to power imbalances should be continuous work (Rehrey et al., 2020). Thus, equity as a process informs our theoretical approach to studying intersectional inequalities in foundational STEM courses, operationalized as large-enrollment, stable, gateway courses that serve a wide variety of students (Center for Research on Learning & Teaching, n.d.). From this conceptualization of equity, we now take a historical perspective to examine how the foundations of higher education fostered the systemic inequities that persist today.

Historical foundation of systemic inequities

The first colleges and universities in the U.S. were made in the nation’s image (Dancy II et al., 2018; Mustaffa, 2017; Patton, 2016); like constitutional rights, access was primarily permitted to white men from wealthy Christian backgrounds (Byrd, 2021; Renn & Reason, 2021; Thelin, 2019; Wilder, 2013). Women and those who were not white or lacked wealth were denied the opportunity to pursue higher education (Renn & Reason, 2021). Many of these higher education institutions were built on land violently seized from Native Americans (Nash, 2019; Wolfe, 1999) and enslaved Black people were forced to attend to architectural developments and caretaking duties (Mustaffa, 2017; Patton, 2016; Wilder, 2013).

During the mid-1900s, U.S. colleges and universities began to significantly alter their demographic composition. At the time, social movements coincided with national pressures to make the country more globally competitive (Bell, 1980). Women and racially minoritized people had advocated throughout the 1900s on the front lines for greater access to higher education (Johnston, 2018; McCammon et al., 2001), but this was not realized until the 1950s and 1960s during the global space race. The U.S. was seeking to expand its workforce educated in STEM disciplines, thereby aligning the interests of those advocating for access to higher education with those of the majority, finally spurring action (Bell, 1980). Civil rights legislation was developed shortly after to increase access to higher education for an array of historically minoritized people including women and low-income and racially minoritized populations (Renn & Reason, 2021; Warikoo & Allen, 2020). Increasing access was not done purely for equity-oriented reasons; as a result, we still see exclusion and harm in STEM higher education contexts.

Contemporary systemic inequities in American higher education

Although higher education admissions policy changes in the ensuing decades have substantially expanded access, colleges and universities in the U.S. remain exclusionary spaces (e.g., Dancy & Hodari, 2023). The original practice of selecting a limited few for higher education has enabled campus environments and academic structures to continue to be non-inclusive of the lived experiences of minoritized populations (Patton, 2016; Renn & Reason, 2021). For example, the valued classroom structures and discourse patterns within undergraduate mathematics have greater benefits for men than women (Johnson et al., 2020). Racially minoritized students express experiencing discrimination when navigating office hours or securing research opportunities (Masta, 2019; Tichavakunda, 2021), which negatively influences their sense of belonging and academic confidence (Duran et al., 2020; Jack, 2021; Strayhorn, 2018). Students report feeling like they must over-achieve academically in STEM environments to counteract negative stereotypes about their social group (Jack, 2021; McGee & Martin, 2011; Seymour & Hunter, 2019; Squire et al., 2018), and the psychological distress of navigating exclusionary environments has negative effects on minoritized students’ wellbeing and academic outcomes (McGee, 2021; Seymour & Hunter, 2019). Studies repeatedly show that learning environments in STEM disciplines are unwelcoming and unsupportive of students from marginalized populations, ultimately hindering their degree progression (Fiorini et al., 2023; Leyva et al., 2021; McCoy et al., 2017; McGee, 2021). Colleges and universities often fail to implement support structures that could assist minoritized students, particularly students whose secondary education falls short (Engle & Tinto, 2008; Kenyon & Reschovsky, 2014; Meatto, 2019).

Introductory STEM courses are key barriers for minoritized students (Seymour & Hunter, 2019), notable for consistently yielding grade performance differences across student populations, with women and underrepresented and racially minoritized, low-income, and first-generation students generally receiving lower grades than white, wealthy, continuing-generation men, even after accounting for students’ pre-college and family background characteristics (Dika & D’Amico, 2016; Koester et al., 2016; Malespina & Singh, 2023; Matz et al., 2017; Whitcomb et al., 2021; Wright et al., 2016; Xie et al., 2015). Inequalities at these initial stages in STEM majors hinder diversity in the STEM workforce, as lower academic performance in these courses has been shown to significantly increase the probability of students leaving STEM degree programs (King, 2015; Witteveen & Attewell, 2020). Introductory STEM courses arguably operate as key sites of intersected inequalities in higher education, and the reliance on such courses at research universities is one way that racialized, class-based, and gendered inequalities are reproduced within higher education generally, and in STEM fields particularly.

Theoretical perspectives

Intersectionality and organizational theory

We are guided by intersectionality and organizational theories that reveal how organizations such as colleges and universities are gendered, racialized, and classed (Acker, 1990; Armstrong et al., 2014; Bourdieu & Passeron, 1990; Byrd, 2017, 2021; Collins, 2000; Collins & Bilge, 2020; Lee, 2016; Ray, 2019).

Intersectionality describes how people’s experiences relate to multiple, intertwined social positions that reflect systems of power, including sexism, racism, and classism, influencing their everyday lives (Carbado, 2013; Cho et al., 2013; Collins, 2000; Collins & Bilge, 2020; Crenshaw, 1991). Different contexts and situations can heighten one identity compared to others even within the same spaces and places, such as moving from one classroom to another on a campus (Carbado, 2013; Collins, 2000). Marginalized identities are reified when people face structural barriers to mobility, stability, and success, and when combined with their other identities, existing at a particular intersection can yield qualitatively different experiences. The concept of intersectionality helps researchers grapple with the complexity of a person’s experiences and outcomes within an unequal opportunity structure (Bowleg, 2008; Collins & Bilge, 2020; Crenshaw, 1991). Within higher education, it is important to understand opportunity structures and the ways that the purposes and goals of organizations bring about differential resources and opportunities for students to pursue along their degree pathways.

It is also important to avoid oversimplifying intersectionality, placing inequalities within social identities rather than reflecting unequal opportunity structures (Carbado & Harris, 2019; Cho et al., 2013; Collins & Bilge, 2020; Harris & Patton, 2018; Haynes et al., 2020). Decoupling a person’s experiences and outcomes from interlocking systems of power individualizes inequalities, leaving unaddressed organizational features that perpetuate inequalities long-documented in higher education and particularly in STEM fields (Allen & Jewell, 2002; Armstrong & Jovanovic, 2017; Bauer et al., 2021; Brunn-Bevel et al., 2019; Byrd, 2021; Fiorini et al., 2023; Griffin, 2019; Thelin, 2019). Further, this disconnect can misinform initiatives and policies aiming to address such inequities by missing how organizations differentially distribute resources and opportunities through policies, practices, and everyday interactions (Acker, 1990; Byrd, 2021; Ray, 2019).

Persistent inequities across STEM fields hamper the capacity to uphold institutional missions and ideals of supporting students with an array of different identities and from different backgrounds. Although STEM disciplines notably show greater inequities compared to other fields of study (Matz et al., 2017; Riegle-Crumb et al., 2019), research continues to find that systemic inequities are a common feature of higher education across academic fields, levels of degree programs, and employment (Blair-Loy & Cech, 2022; Byrd, 2021; Byrd et al., 2019; Espinosa et al., 2019; McGee, 2020; Posselt, 2018, 2020; Stewart & Valian, 2018; Zambrana, 2018). Therefore, we must be mindful in interpreting research findings to avoid deficit framing based on presumptions of meritocracy that can limit interventions and policy changes by focusing on individuals and not universities (Blair-Loy & Cech, 2022; Carnevale et al., 2020; Castillo & Gillborn, 2022; Harper, 2010). Given that inequities are so intertwined with higher education, and STEM fields in particular, a theoretical framework that situates how individual academic performance reflects organizational inequity is warranted.

Intersectionality-informed explorations of quantitative STEM education data

The application of intersectionality alone to quantitative studies can further disconnect inequities and inequalities from the contexts surrounding people if not paired with a critical lens that seeks to emphasize the societal context of a phenomenon (Pearson et al., 2022). Therefore, we couple the guidance of intersectional and organizational theories with a critical approach to raise questions about the orientation to and purposes of quantitative methods for understanding marginalization within higher education. This disposition helps to (1) avoid assumptions about the value neutrality of quantitative analyses; (2) limit deficit perspectives of individuals and groups that essentialize people; (3) support consideration of the contexts that lead to particular outcomes; (4) reconsider sociodemographic groupings and categories; and (5) recognize that interpretations of quantitative data are privileged in society. We explicitly highlight these dispositional tenets throughout the paper to show where they guided decisions and interpretations. As researchers, we must carefully consider how we assign meaning to interpretations of data that can silence the voices and experiences of marginalized communities, as data do not speak for themselves (Byrd, 2021; Castillo & Gillborn, 2022; Covarrubias & Vélez, 2013; D’Ignazio & Klein, 2020; Garcia et al., 2018; Gillborn et al., 2018; López et al., 2018; Pearson et al., 2022).

While tending to each of these critical quantitative tenets can enhance research to speak to the systemic realities of inequities in STEM learning environments, lingering issues must be addressed when applying intersectionality to quantitative data. Researchers often use individual-level variables without attending to historical context or the systems in which individuals operate, thereby hindering the visibility of intersected experiences, processes, and outcomes of systemic phenomena (Bauer et al., 2021; Bowleg, 2008; Hancock, 2007, 2013; López et al., 2018). It is important to note how such exploration is constrained by the variables used in analyses. For example, the ways that institutions and researchers place students in sociodemographic categories by gender, race/ethnicity, and socioeconomic position (i.e., income and first-generation status) are not always documented (Byrd, 2021), nor can we assume identity processes of students (i.e., how someone self-identifies and why). In this study, we explicitly attempt to link our analysis to such historical context.

In relation to interpretations of sociodemographic variables, this framing means that a coefficient for students identifying with a particular racial/ethnic group in a multivariate model should not be interpreted by researchers as the social construct of race directly influencing a student’s outcome such as grades (Bonilla-Silva, 2006). Rather, a racial disparity in course grades reflects the racially inequitable learning contexts of that course that impact students along racial lines. Through additive approaches, researchers might explore the average combined effects of multiple systems of inequity and power, but also should recognize that, of course, qualitatively different experiences underlie overarching quantitative patterns and are not fully captured. Grounded in these theoretical perspectives, our approach interrogates the entrenched nature of systemic oppression in postsecondary STEM courses.

Research questions

Our critical theoretical perspectives, informed by intersectionality and organizational theory, provide the foundation for this study about how introductory STEM courses operate as key sites of intersected inequalities in higher education. We investigate grade disparities as a reflection of existing institutional practices and policies that perpetuate inequalities among students by sex, race/ethnicity, and socioeconomic status. To explore such systemic inequalities and expand our scope of cross-institutional analyses beyond prior work focused solely on sex (Matz et al., 2017), we define and use the systemic advantage index (SAI), which represents the total number of advantages that characterize students within institutions according to sex, race/ethnicity, income, and first-generation status. To explore the extent to which historically based (dis)advantages exist across institutions and to document the manifestation of systemic inequities in introductory STEM courses, we ask the following research questions:

  1. What is the distribution of students by systemic advantage at each institution?

  2. What is the relationship between systemic advantage and course outcomes?

Methods

Institutional context

This study examines data from six four-year public universities in the U.S.: the flagship campuses of Arizona State University, Indiana University, Michigan State University, Purdue University, the University of Michigan, and the University of Pittsburgh, randomly deidentified herein as Institutions A through F. These six institutions are similar in that they are doctoral universities with very high research activity, they have large enrollments (from 19,000 to 42,000 undergraduate students in Fall 2018; National Center for Education Statistics, n.d.), and they serve populations that are primarily residential (Carnegie Classification of Institutions of Higher Education, n.d.).

Other key characteristics vary; for example, in 2018–2019, the admissions rate ranged from 23 to 85% across the institutions and the 6-year graduation rate ranged from 63 to 93% (National Center for Education Statistics, n.d.). The percentages of undergraduate students who are women (42 to 52%) and non-white (30 to 50%) also vary (Fall 2018 figures; National Center for Education Statistics, n.d.). The data used in this study were gathered and derived from the student information systems maintained at each institution following Institutional Review Board requirements. The student-level data were held locally, not shared between researchers.

Data collection

Defining the sample: student population (step 1)

We selected all undergraduate students who received a numeric grade in at least one STEM course (defined as those with biology, chemistry, engineering, mathematics, physics, or statistics course codes, excluding laboratory courses as they tend to be secondary to lecture courses) within their first academic year at one of the participating universities spanning a period of 10 academic years (Fall 2009 through Spring 2019, excluding summer terms as we were interested in students’ first interactions with STEM courses). All data were thus collected before the instructional, grading, and policy changes that resulted from the COVID-19 pandemic beginning in March 2020. A summary of the criteria and exclusions applied to the sample in this and the following steps is shown in Fig. 1.

Fig. 1 Summary of criteria and exclusions for the data sample

We excluded transfer students as our goal was to examine students’ first interactions with post-secondary STEM courses. Though a subset of transfer students did take their first STEM course at one of our institutions and could have been included in principle, the transfer credit records we accessed were not consistently specific enough at the course level to parse these groups; we assumed that some transfer students had prior STEM courses transfer only as general education or elective credit, and thus that we would not be able to reliably determine if a student had or had not taken a post-secondary STEM course before transferring. International students were also excluded because race is a social construct defined differently according to each country’s history and culture. Given that our data came from entirely U.S.-based institutions, we did not want to enforce a potentially inappropriate U.S.-based construct on our international student populations. With these sample limitations, we do not assert that transfer and international students do not experience inequitable learning environments. Rather, we aimed to explore different axes of systemic advantage here.

Defining the sample: the largest STEM courses (step 2)

For this study, we defined the 10 largest STEM courses at each university as the courses with the greatest number of enrollments from this student population across the 10-year period of interest. We included courses without regard to level; in particular, the number and type of mathematics courses offered prior to calculus differs across these institutions. In this way, we focused our attention on the largest STEM courses unique to each institution and its student population. The distribution of courses by discipline is shown in Table 1. Mathematics courses are the most prominent, comprising more than half of the courses overall. No upper-division courses, that is, those at the “300-level” or above, were evident in the sample (though disparities have been shown to persist at the upper-division level; see Farrar et al., 2023).

Table 1 The distribution of the 10 largest STEM courses at each institution by discipline

Defining the sample: enrollment exclusions (step 3)

Two final adjustments to the sample were made. First, the enrollment patterns for students across these 10 largest courses varied; that is, each student fell into one of three categories: (1) the student took one or more of the 10 largest STEM courses in the fall term and none in the spring term; (2) the student took one or more of the 10 largest STEM courses in the spring term and none in the fall term (for example, some students began their undergraduate careers in the spring); or (3) the student took one or more of the 10 largest STEM courses in both the fall and spring terms. Because we intended to focus on students’ first experience with large STEM courses, we limited our enrollments of interest for those in group 3—the “both fall and spring” students—to only those that occurred in the students’ fall term.

Second, we adjusted the sample in the cases where students were enrolled in two or more of the 10 largest STEM courses within the same term. To support the robustness of the analyses, among the students that were represented with multiple enrollments across these 10 largest STEM courses within the same term, we randomly sampled a single STEM course enrollment for each student. This step was necessary because the GPAOs (defined below) for these course enrollments for each student would be highly correlated. This exclusion had the effect of reducing the sample of 386,035 enrollments across the 10 largest STEM courses to 227,413 enrollments. In this way, the final sample [n = 227,413 students (Institution A: 59,412, B: 43,287, C: 22,056, D: 34,967, E: 47,373, F: 20,318); 60 courses; 6 institutions] includes only one unique enrollment per student.
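The two enrollment adjustments described above can be sketched in a few lines. This is a minimal illustration in Python rather than the code used for the study (the analyses were run in R); the record layout (`student`, `term`, `course`), the course codes, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import defaultdict

def select_one_enrollment(enrollments, seed=0):
    """Keep one enrollment per student: earliest term first, then a random pick.

    `enrollments` is a list of dicts with hypothetical keys
    'student', 'term' ('fall' or 'spring'), and 'course'.
    """
    rng = random.Random(seed)  # fixed seed so the random draw is reproducible
    by_student = defaultdict(list)
    for row in enrollments:
        by_student[row["student"]].append(row)

    selected = {}
    for student, rows in by_student.items():
        fall = [r for r in rows if r["term"] == "fall"]
        # Students with fall and spring enrollments keep only the fall ones;
        # spring-only students keep their spring enrollments.
        candidates = fall if fall else rows
        # Several enrollments in the same term: sample a single one at random.
        selected[student] = rng.choice(candidates)
    return selected

# Hypothetical records, for illustration only.
rows = [
    {"student": "s1", "term": "fall", "course": "MATH 101"},
    {"student": "s1", "term": "fall", "course": "CHEM 101"},
    {"student": "s1", "term": "spring", "course": "PHYS 101"},
    {"student": "s2", "term": "spring", "course": "BIOL 101"},
]
picked = select_one_enrollment(rows)
```

Sampling a single enrollment per student trades sample size for independence between observations, which is the rationale the text gives for this step.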

Demographic variables

We used four demographic variables in this study: sex, race/ethnicity, income, and first- versus continuing-generation status.

Though gender, not sex, was our construct of interest with respect to systemic advantage in undergraduate STEM courses, gender data were not available in the student records at every institution, a byproduct of the non-neutrality of data as noted in dispositional tenet 1. Thus, we used a binary sex classification (female students and male students) and recognize these data as limiting (D’Ignazio & Klein, 2020), particularly with respect to non-binary and genderqueer students.

Race/ethnicity was coded as white or non-white, where non-white included students who indicated they were American Indian or Alaska Native, Asian or Asian American, Black or African American, Hispanic, Native Hawaiian or Other Pacific Islander, or two or more races. We recognize that the choice to group students of non-white races and ethnicities into one category has the potential to perpetuate a centering of whiteness and encourage the monolithic view of non-white individuals, an implication with which we disagree. This choice was based on history, that is, U.S. universities being designed solely for white students and implementing exclusionary practices for non-white students. Our binary coding of race and ethnicity is intended to reflect this historical structural perspective.

Low-income students were defined as those eligible for federal Pell grants; however, at two institutions, these data were unavailable to researchers. In these cases, the low-income category was defined by the median income level associated with students’ high school zip code based on data from the U.S. Census Bureau; zip code has been found in prior research to be a reasonable proxy for socioeconomic status (Berkowitz et al., 2015; Link-Gelles et al., 2016). We used a median income of $46,435 or less as a conservative estimate for low income as this income level is twice the average federal poverty guideline for a family of four persons within the U.S. over our period of interest (Office of the Assistant Secretary for Planning and Evaluation, n.d.).

First-generation students were defined as those reporting that no parent or guardian had earned a bachelor’s degree.

When students were missing information for any demographic variable, they were categorized with the advantaged group in order to be conservative with the analyses.
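As a rough illustration of the coding rules above, the following Python sketch derives low-income status and maps a possibly missing indicator to an advantage flag. The function names and the rule that Pell eligibility takes precedence whenever it is recorded are assumptions for the example; only the $46,435 cutoff and the handling of missing values come from the text.

```python
LOW_INCOME_THRESHOLD = 46_435  # median zip-code income cutoff stated in the text

def is_low_income(pell_eligible=None, zip_median_income=None):
    """Return True/False for low-income status, or None when it is unknown."""
    if pell_eligible is not None:
        return bool(pell_eligible)
    if zip_median_income is not None:
        # Proxy used where Pell data were unavailable.
        return zip_median_income <= LOW_INCOME_THRESHOLD
    return None

def advantage_flag(is_disadvantaged):
    """Map a (possibly missing) disadvantage indicator to an advantage flag.

    Missing values (None) are placed in the advantaged group (flag 1),
    mirroring the conservative rule described in the text.
    """
    if is_disadvantaged is None:
        return 1
    return 0 if is_disadvantaged else 1
```

Assigning missing values to the advantaged group can only shrink an observed gap between groups, which is why the text describes the choice as conservative.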

Course grades and metrics

We collected students' final course grades, excluding non-numeric grades like lapsed incomplete grades, grades earned under pass/fail policies, and grades representing course or term withdrawals. The numeric grading scale was the same across three institutions, while the other three institutions’ scales each varied slightly (Table 2). Though these variations certainly impact the precise magnitude of our estimates (indeed, grading schemes have been shown to vary even across sections of the same course within a single institution; James, 2023), we maintain that the overall trends and resulting interpretations are insensitive to them.

Table 2 The numeric grading scales represented across the six institutions

We used grade point average in other courses (GPAO) as a control metric for academic performance, defined as a student’s cumulative GPA across all courses (including non-STEM courses) and all terms excluding only the STEM course of interest (Huberth et al., 2015; Koester et al., 2016). For example, if a student enrolled in five courses during their second term, their GPAO for one of these second-term courses is calculated as their average GPA across the other four courses from that term plus all their courses from the first term. That is, GPAO is a way of describing the grades that students typically earn across all their courses while retaining the ability to compare to a student’s grade in a particular course of interest.

Because GPAO is calculated relative to each course enrollment, a student’s GPAO for one course can be different from their GPAO for another course. We selected GPAO as a control metric for academic performance because prior studies have shown its power in predicting academic outcomes over and above high school GPA and standardized exam scores (Huberth et al., 2015; Koester et al., 2016) and in highlighting inequities in STEM courses (Matz et al., 2017). Using GPAO also facilitates cross-institutional studies because it easily accounts for the variance in grading across universities; indeed, four grading scales are represented here among only six universities. With GPAO, grade comparisons are made relative to how students usually perform at their specific institution.
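A small sketch may make the GPAO definition concrete. It is written in Python for illustration and assumes a credit-weighted GPA; the text specifies only a cumulative GPA across all other courses, so the weighting, the tuple layout, and the function name are assumptions.

```python
def gpao(records, target_index):
    """GPA across all of a student's courses except the one at target_index.

    `records` is a list of (grade_points, credits) tuples for one student,
    covering every course through the term of the target course.
    """
    others = [r for i, r in enumerate(records) if i != target_index]
    total_credits = sum(credits for _, credits in others)
    if total_credits == 0:
        return None  # no other graded courses, so GPAO is undefined
    return sum(grade * credits for grade, credits in others) / total_credits

# Hypothetical student with five 3-credit courses; the third is the STEM course.
records = [(4.0, 3), (3.0, 3), (2.0, 3), (3.0, 3), (4.0, 3)]
student_gpao = gpao(records, target_index=2)  # 3.5
```

Because the target course is dropped before averaging, the same student has a different GPAO for each course of interest, as noted above.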

Analytical framework

We calculated a metric that we call the systemic advantage index (SAI) (Castle et al., 2021) for each student based on selected demographic information as a measure that partially represents systemic oppression within the U.S. higher education system. The SAI is derived from the historical academic structures of inequity within this system and is therefore intended to be a structural measure rather than a deficit-oriented measure, aligning with dispositional tenets 2, 3, and 4.

We define SAI as the number of advantages that a student has based on their sex, race/ethnicity, income, and first- versus continuing-generation status, where male students, white students, higher-income students, and continuing-generation students, respectively, are considered advantaged. Though privilege in higher education operates through other dimensions (e.g., ableism; Reinholz & Ridgway, 2021), we limited the current study to these four characteristics based on data availability and consistency. Herein, students range from having zero advantages (i.e., first-generation, low-income, non-white female students) to four advantages (i.e., continuing-generation, higher-income, white male students).

Within this range, there are 16 mutually exclusive groups of students (Table 3); students with the same SAI have the same number of systemic advantages, but the precise advantages can differ. While each advantage contributes equally to a student’s SAI, we acknowledge that the advantages are not necessarily equivalent in how they relate to the outcomes in a course, a key limitation in line with dispositional tenet 3. The same index value represents different intersectional combinations of student characteristics and, hence, conflates different experiences. However, the SAI as a coarse estimate facilitates analysis of how these axes of advantage manifest systematically within students’ introductory STEM course outcomes across institutions, providing an advantage over studies of single institutions and single axes of diversity and even meta-analyses that entail methodological variation. That is, our focus is on the system of advantages inherited and persisting in our educational systems, not on individual students or advantages, nor on how students with specific combinations of characteristics experience systemic inequities.
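The index and its 16 underlying groups can be enumerated directly. The sketch below is an illustrative Python rendering of the definition, not the study's own code; the axis names are hypothetical labels for the four advantage flags.

```python
from collections import Counter
from itertools import product

# Hypothetical flag names; each flag is 1 if the student is in the advantaged group.
AXES = ("male", "white", "higher_income", "continuing_generation")

def sai(flags):
    """Systemic advantage index: the count of advantage flags equal to 1."""
    return sum(flags[axis] for axis in AXES)

# Enumerate all 2**4 = 16 mutually exclusive groups and tally them by SAI.
groups = [dict(zip(AXES, combo)) for combo in product((0, 1), repeat=4)]
groups_per_sai = Counter(sai(g) for g in groups)
```

The tally shows how unevenly the index pools groups: SAI values of 0 and 4 each correspond to a single group, values of 1 and 3 to four groups each, and a value of 2 to six groups, which is where the conflation of different experiences noted above is greatest.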

Table 3 The percentage of students by systemic advantage index (SAI) within each institution

Our analysis follows Matz et al. (2017), combining each student’s grade in the sampled course with their grade point average in other courses (GPAO) to calculate a metric called grade anomaly. Grade anomaly is the difference between course grade and GPAO. A positive grade anomaly indicates that the student received a higher grade in the sampled course compared to their other courses (a grade “bonus”). A negative grade anomaly indicates that the student received a lower grade in the sampled course compared to their other courses (a grade “penalty”). This simple comparison is easy to compute, relies on widely available data, and is informative both as a kind of control for academic performance at the university level and in providing a measure of the feedback to the student about how well they did in their course compared to their typical performance. We used ordinary least squares regression to evaluate the relationship between SAI and grade anomaly. All analyses and visualizations were carried out using R Statistical Software (R Core Team, 2023).
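
To make the two steps concrete, the following sketch computes grade anomaly as course grade minus GPAO and then fits a simple ordinary least squares line of grade anomaly on SAI. It is a Python illustration only (the authors used R), the helper names are hypothetical, and the five records are made-up values chosen for readability, not data from the study.

```python
from statistics import mean


def grade_anomaly(course_grade: float, gpao: float) -> float:
    """Grade anomaly: course grade minus GPA in other courses (GPAO)."""
    return course_grade - gpao


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up records for illustration: (SAI, course grade, GPAO).
records = [
    (0, 2.3, 3.1),
    (1, 2.6, 3.2),
    (2, 2.7, 3.3),
    (3, 3.0, 3.4),
    (4, 3.3, 3.5),
]

sai = [r[0] for r in records]
anomalies = [grade_anomaly(grade, gpao) for _, grade, gpao in records]

slope, intercept = ols_fit(sai, anomalies)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With these toy values every anomaly is negative (a penalty) and the slope is positive, meaning the penalty shrinks as SAI increases. The sketch omits standard errors and significance tests, which a full regression routine would report.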

Results

RQ1: What is the distribution of students by systemic advantage at each institution?

We first evaluated the percentage of students in each SAI group by institution (Table 3; Fig. 2) to understand the distribution of students by SAI groups and to explore similarities and differences in the population at each university. At every institution, the majority of students were represented by groups with three or four advantages. Institution A had the smallest proportion of students (57%) with three or four advantages; though not a statistical outlier, it was furthest from the other institutions. Institutions B, D, E, and F all had greater than 70% of students in groups with three or four advantages and, in particular, Institution E had very few students (<1%) in SAI group 0. Institution A, and Institution C to a lesser extent, showed a broader distribution over the five SAI groups in general in comparison to the other institutions.

Fig. 2 The percentage of students by systemic advantage index (SAI) within each institution

The percentages of students within each SAI subgroup—for example, the percentage of first-generation, lower-income, white female students relative to the three other SAI = 1 subgroups—are provided in Table 4 and Fig. 3. Across the institutions, the SAI = 3 group is dominated by continuing-generation, higher-income, white female students and secondarily by continuing-generation, higher-income, non-white male students. The SAI = 2 group is dominated by continuing-generation, higher-income, non-white female students whereas the SAI = 1 subgroups are more evenly distributed. Considering all these data, an overall lack of low-income students is especially apparent at Institution E.

Table 4 The percentage of students by systemic advantage index (SAI) subgroup within each institution and SAI
Fig. 3 The percentage of students by systemic advantage index (SAI) subgroup with SAI = 1 (top), SAI = 2 (middle), or SAI = 3 (bottom) at each institution; see Table 4 for the description of each subgroup

RQ2: What is the relationship between systemic advantage and course outcomes?

We then explored the relationship between systemic advantage and two academic performance variables in the sampled STEM courses at each institution: course grade (Table 5; Fig. 4) and grade anomaly (Table 6; Fig. 5). At every institution, both relationships showed that more favorable course outcomes were generally associated with more systemic advantages. The trends were strikingly similar across universities even though we did not control for differences in grading practices, contexts, and instructor and student populations. One point of contrast is that for all but one institution (Institution E), the average course grade for SAI group 4 was the same as or lower than that for SAI group 3. However, when considered relative to students’ performance in other courses (Fig. 5), the trend of more favorable course outcomes for those with more advantages was still apparent.

Table 5 Mean course grade for each systemic advantage index group (0 to 4) by institution (A to F)
Fig. 4 Students’ mean course grade ± 1 standard error in the sampled STEM course by systemic advantage index (SAI)

Table 6 Mean grade anomaly for each systemic advantage index group (0 to 4) by institution (A to F)
Fig. 5 Students’ mean grade anomaly ± 1 standard error in the sampled STEM course by systemic advantage index (SAI)

In particular, regression models (Table 7) showed that increasing SAI had a significant positive relationship with grade anomaly; on average, a one-unit increase in SAI was associated with an increase in grade anomaly of between 0.04 and 0.10 points, depending on the institution (see the Estimate column in Table 7; the lowest estimate is 0.04 for Institution D and the highest estimate is 0.10 for Institution E). Further, grade anomaly in these early STEM courses was significantly greater for the most advantaged group of students (SAI = 4) versus all other students at every institution (Table 8; differences in grade anomaly for all possible comparisons of SAI groups are provided in Additional file 1).

Table 7 Estimates, standard errors of the mean (SEM), and t values from generalized linear regression models for the effect of systemic advantage index (SAI) on grade anomaly
Table 8 Welch’s two-sample t-test for differences in grade anomaly for SAI = 0, 1, 2, or 3 versus SAI = 4 at each institution
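
For readers unfamiliar with the comparison summarized in Table 8, the sketch below shows how Welch’s unequal-variance t statistic and its Welch-Satterthwaite degrees of freedom are computed for two groups. It is a hand-rolled Python illustration rather than the authors' R code; the function name is hypothetical and the two samples are made-up grade anomalies, not study data. It does not compute a p value, which requires the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test with unequal variances."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up grade anomalies for illustration only.
sai_4 = [-0.1, -0.3, -0.2, -0.4, -0.2]
sai_0_to_3 = [-0.6, -0.9, -0.5, -0.8, -0.7, -0.4]

t_stat, dof = welch_t(sai_4, sai_0_to_3)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t here indicates a higher (less negative) mean grade anomaly in the first group. Welch’s test suits this comparison because the SAI groups differ greatly in size and need not share a common variance.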

We note that across all SAI groups and institutions, the average grade anomaly for students in their first STEM course, as we have defined them here, was negative. All SAI groups on average received a grade penalty, that is, a lower grade in the sampled course compared to their other courses, ranging between 0.24 and 0.94 grade points (see the Mean columns in Table 6; the lowest mean is −0.94 for SAI group 0 at Institution E and the highest mean is −0.24 for SAI group 4 at Institution D), but the penalties were larger on average for students in SAI groups with fewer systemic advantages.

Discussion

This study uses a simple method, grounded in a critical historical perspective, to highlight how early university STEM courses provide more favorable course outcomes to students with more systemic advantages, sustaining and increasing disparities between different student populations. Though we found that all students received lower course grades on average in introductory STEM courses relative to their other courses, the groups of students with the fewest systemic advantages received the largest penalties. The relationship between greater advantage and less grade penalty was significant at each institution, resonating across the broad set of disciplines and contexts represented in students’ first post-secondary interactions with STEM. Although some might argue that the grade anomaly differences result more from student effort than from the learning environment, using GPAO as a control for academic performance mitigates this alternative explanation as it accounts for students’ general study habits across a range of other courses. Further, we contend that these analyses are conservative because withdrawals were excluded, and withdrawals contribute substantially to differences in course outcomes between student groups based on gender, race/ethnicity, income, and first- versus continuing-generation status (Michaels & Milner, 2021). Withdrawals represent another source of information for examining inequities in future work.

Mathematics accounted for approximately half of the courses in the sample. Unlike other disciplines, mathematics courses are often pre- and co-requisites for many STEM degree pathways and are often a general education requirement. Therefore, the trend across institutions of more favorable introductory STEM course outcomes for those with a greater number of systemic advantages is highly concerning. An unfavorable grade in an initial mathematics course can bar students from other discipline-specific courses, potentially forcing them to extend the time to completion of their degree or spurring them to exit their STEM program. When considering enacting structural change across institutions, it is important to note that STEM itself is not a monolith and the variations in disciplinary cultures are essential to consider when advocating for structural reform across departmental and institutional levels (Reinholz et al., 2019). Therefore, this study adds to the call for a larger conversation about undergraduate mathematics education (Reinholz et al., 2020), with a specific emphasis on the earlier mathematics courses, that centers on structures and challenges deficit discourse (Adiredja & Louie, 2020). It is critical that equity practices promote a reconfiguration of university mathematics practices, and that researchers and educators grapple with the institutional factors that can hinder change (Ching & Roberts, 2022). As this pattern was not confined to a sole institution, it is important for future work to consider the enactment of disciplines within the university context and examine the discipline through a lens focused on systemic inequity in addition to the course- and institution-level analyses.

Initiatives both national and local to our universities have promoted diversity within STEM disciplines specifically regarding retention of students typically disadvantaged by higher education (Asai, 2020). But given that early STEM course grades are a key factor in STEM retention (Byars-Winston et al., 2010; Dika & D’Amico, 2016; King, 2015; Seymour & Hunter, 2019; Stinebrickner & Stinebrickner, 2014; Witteveen & Attewell, 2020), our findings are concerning and suggest that continued scrutiny for structural inequities in STEM is necessary. Indeed, higher grades in beginning STEM courses, especially relative to other courses, are a predictor of retention (Griffith, 2010), though patterns in persistence differ by demographic characteristics (Costello et al., 2023). Past research provides mechanisms for how inequities in early STEM courses can reflect unequal learning environments rather than student abilities. For example, Black students face implicit and explicit messages that they do not belong in STEM spaces (Basile & Black, 2019; McGee, 2021) and peers, mentors, and instructors are influenced by prevalent stereotypes about women’s STEM ability (Eddy & Brownell, 2016). Matias Dizon and colleagues (2023) also recently observed that women and Black students have stronger negative relationships than their peers between discouragement for speaking in class and GPA. Many other institutional and sociocultural contexts shape how systemic advantages contribute to grade disparities in STEM environments (Griffin, 2019; McGee, 2020).

We contend that three aspects of the current study are significant. First, this study formally introduces the SAI, showing that the selective origins of the university system still function in STEM courses today through structural inequities in course performance. Because the SAI reflects advantage in a broad sense, the index is a structural measure rather than one framed by student deficits. At the same time, the SAI (as any index) has the potential to be misused as it is not inherently anti-deficit; researchers using the SAI must actively pursue asset-based and anti-deficit framing. While we defined SAI here in terms of four categories, other dimensions of advantage (e.g., transfer status, disability status, LGBTQ+ status, or English language learning status, among others) can be added and explored according to the history of a particular educational context and data availability. Indeed, we are aware of similar work at the K-12 level based on a six-factor advantage index that uses the four dimensions included here alongside English language status and disability status (Stevens, 2023), though we encourage researchers to report results with the current SAI model in future studies to support broader comparison. As an example further afield, the University of California Davis has used a “disadvantage index” for more than a decade to help parse applicants to the medical school (Saul, 2023).

Second, the study is a valuable example of parallel analysis across multiple institutions; as such, the focus of the study is students’ early STEM courses writ large across research-intensive public universities. The consistent trends observed here point to systemic inequities within early STEM courses as a whole, not limited to individual instructors, courses, departments, or even institutions. Indeed, we are not aware of any work that uses data from so many large public institutions, representing various institutional contexts (e.g., land grant origin, minority-serving institution status, relative “eliteness”), to show the pervasive nature of the systemic inequities discussed herein. We note that the SAI factors included here were in part selected because almost any institution would have access to similar data, facilitating further comparison. We recognize the privilege that quantitative studies have within research, as noted in dispositional tenet 5, and use this study not only to advocate for systemic change through structural and policy change (Lubienski, 2008), but also to urge researchers to delve more deeply into the structural mechanisms that produce these inequities.

Third, we included courses without regard to level, which is important especially with respect to mathematics because students identifying with historically marginalized racial/ethnic groups are more likely to have introductory mathematics course placements that do not align with their aptitudes due to underestimation (Larnell, 2016). Segregation at the K-12 level also contributes to such students receiving fewer educational resources on average than white students in the U.S. (Meatto, 2019). Even when educational resources are available, discrimination from school officials has been shown to prevent racial and ethnic minorities from accessing higher-level coursework (Lewis & Diamond, 2015; Tyson, 2011). By including all course levels, our attention was focused on all early STEM courses that have traditionally played a gatekeeping role.

Limitations

Several limitations are salient, and it is critical to grasp these as a manifestation of the choices that the research team made with respect to the study design, analyses, and interpretations. Indeed, quantitative analyses are not neutral (as stated within dispositional tenet 1) and it is imperative to understand how our choices yield limitations along different axes.

First, comparing students across universities using prescriptive administrative data (meaning data that are limited and narrow) comes at the expense of allowing individuals to articulate important factors in relation to their identities and experiences within their particular context (Lubienski & Gutiérrez, 2008). The SAI is based on the historical foundations of U.S. universities and implemented herein based on the availability of institutional data. To make comparisons across institutions, we implemented proxies that are limited: sex rather than gender, and median income by high school zip code rather than Pell grant eligibility, because the latter data were not accessible at two institutions. Further, the SAI does not include structural advantages related to disability, sexuality, and other elements of students’ identities because many of these data are not regularized across institutions or available for research. Higher education systems are not neutral to these identities; students experience structural advantages along these different dimensions of identity, and these are critical areas that need more research and system-level reform to strive toward equity. At the same time, though institutional data are limited, they facilitate cross-institutional analyses that can reveal broad inequities, supporting more refined future studies.

Second, the choices we made in constructing the SAI can errantly promote binary thinking, especially with regard to race/ethnicity. Collapsing race/ethnicity into a binary category based on a structural perspective comes at the expense of seeing challenges faced by specific racial/ethnic groups, and there is the potential to misconstrue this choice as centering whiteness. These specific limitations and the implications of this approach are critical to acknowledge and discuss as the goal of this work is to highlight inequity within undergraduate STEM across institutions as a byproduct of historical structural inequities from the origins of U.S. higher education.

Finally, student groups with different identity profiles are combined herein based on their SAI number, which conflates the challenges faced by specific groups of students. It is important to reiterate that the dimensions of social identities collapsed into systemic advantages are not interchangeable, even though we are using SAI to group students by number of advantages. This also means that for SAI values comprising multiple groups, the anomalies for some specific subsets underreport their experiences given that the overall SAI results are an average of multiple groups. We collapsed the number of associated advantages that a student brings to their first STEM course to examine the manifestation of systemic inequity. In this way, the SAI is a conservative measure, yet it is able to reveal the presence of systemic inequities across institutions. We contend that the current approach is useful for describing how systemic advantages manifest across students’ intersecting backgrounds within higher education.

Future work

To identify and change the structures and policies that enforce inequities and inequalities, future work should focus on sources of variation between institutions. Comparing outcomes in more specifically aligned courses across institutions in particular disciplines could yield exemplars of structures and systems that enable grade equity. These comparisons would help build understanding of contexts where students experience larger inequalities and where inequities do not manifest in introductory grades, pressing ever more toward equitable experiences for students in foundational STEM courses. Generalization to the introductory STEM context within other types of institutions of higher education beyond the large universities represented here is also a rich path for future studies, as is tying these analyses to qualitative data that richly cover how inequities affect students in STEM.

Conclusion

This study adds to the robust literature on equity in STEM by showing the persistent relationship between advantage and course outcomes for students in early STEM courses. We maintain that the grade penalties observed herein reflect systemic inequities in STEM fields and show that STEM is uniquely inequitable when compared to other higher education disciplines. This study helps continue to shift conversation about student success in STEM from student-based performance differences to a metric that describes advantage broadly; clearly, the language used to address differential outcomes matters (Quinn & Desruisseaux, 2022). Addressing systemic inequity requires changing the learning environment around students. Prior research advocates for course-level changes such as working with instructors to counter ideas that students have fixed ability and intelligence (Canning et al., 2019) and fostering approaches to increase students’ growth mindset and sense of belonging (Chen et al., 2021). Importantly, within the broader pattern of inequity observed herein, campus leaders should identify how their institution contributes and work to remedy the broader systemic problems both locally and across institutions. If the goal is to support marginalized students and promote their academic excellence, we must explore how to identify but more importantly seek to correct inequities within early undergraduate STEM courses. The current study provides one approach for identifying the extent of systemic inequity present in foundational STEM courses.