1 Introduction

The pervasive nature of gender stereotypes is a well-known issue in academic research on gender differences. In the case of the underrepresentation of women in STEM (Science, Technology, Engineering, and Math), stereotypes about the different abilities of women and men in math-related tasks (B. A. Nosek et al. 2009) are recognized as playing a pivotal role in determining women’s abandonment of this sector (Wang and Degol 2017).

While there is broad consensus in academia that observed gender differences in performance, interests, and attitudes toward STEM are ascribable to social explanations (Eagly and Steffen 1984; Miller et al. 2015), less is known about whether people outside academia still attribute gender differences in STEM domains to other factors, such as biological causes (Kersey et al. 2019; Spelke 2005). However, this distinction may have relevant consequences for attitudes and behaviors.

Studies on the perceived causes underlying gender stereotypes are few and mainly date back to the 1990s and the first decade of the twenty-first century. For instance, in a study of 264 young adults’ beliefs about power-related gender traits, Neff and Terry-Schmitt (2002) found that attributing gender differences in those traits to social causes was linked to egalitarian attitudes toward women, as measured by the Attitudes toward Women Scale (Spence and Helmreich 1972), while attributing them to biological causes was linked to traditional attitudes toward women. However, the latter association was found only among male participants. Further research on this is pivotal, as the attribution of gender differences to, for instance, biological – thus immutable – causes is likely to have significantly different consequences than the attribution of these differences to social causes.

This study aims to verify the association between the endorsement of gender-science stereotypes and the causes to which differences between women and men in STEM are attributed. Usually, the instruments used to test people’s beliefs about women in STEM do not specify the causes that could explain gender differences. However, Project Implicit (B. A. Nosek et al. 1998), a platform where users can test their implicit gender stereotypes on science, collects data on stereotype endorsement, attitudes toward science, and gender bias attribution.

While previous studies on the STEM gender gap used the data from Project Implicit (Lewis and Lupyan 2020; Miller et al. 2015; Smyth and Nosek 2015), its dataset is usually valued for the availability of a measure of both implicit and explicit stereotypes for several countries. To our knowledge, this is the first time in which the instrument on gender bias attribution is used.

The paper proceeds as follows. First, it reviews the literature on gender-science stereotypes and attribution theory, showing how the two are related. It then describes the objective of the study and the methodology used to test it. Finally, it presents the outcomes of the structural equation modeling and discusses these results.

2 Gender-science stereotypes

Stereotypes can be defined as “general expectations about members of particular social groups […] that lead people to overemphasize differences between groups and underestimate variations within groups” (Ellemers 2018). As regards STEM, gender stereotypes on science and math are multiple and concern several gender differences (De Gioannis 2022).

Men are considered to have higher abilities than women in math- and science-related tasks (Schmader et al. 2004), and to be more suited to (Farrell et al. 2015) and more interested in (Ertl et al. 2017) science-related fields. STEM, in turn, is usually perceived as a masculine field (del Río et al. 2019; Liu et al. 2010), both because it is considered to require characteristics that have been traditionally attributed to men, e.g., agentic traits (Abele and Wojciszke 2019; Diekman et al. 2017; Sczesny et al. 2018), and because women are still underrepresented in STEM academic and career paths (Alam and Sanchez Tapia 2020). This underrepresentation further reinforces the association of the scientific sector with boys and men.

Research on the STEM gender gap has long emphasized that gender stereotypes play a pivotal role in determining both the existence and persistence of the underrepresentation of women in STEM. Previous studies found that the endorsement of gender stereotypes is associated with lower performance in math (Cvencek et al. 2015; Kiefer and Sekaquaptewa 2007; Ramsey and Sekaquaptewa 2011; Sanchis-Segura et al. 2018) and negative attitudes toward math (B. A. Nosek et al. 2002; B. A. Nosek and Smyth 2011). The influence of gender stereotypes crosses the borders of school and education, as it extends also to career intentions and aspirations (Schuster and Martiny 2017; Steffens et al. 2010).

Gender stereotypes can be endorsed both at the implicit and explicit levels. In the first case, they are automatic associations of men with STEM-related fields, whereas in the second case, they are self-reported, thus conscious, beliefs (Whitley and Kite 2016). There is a long debate on whether and to what extent the two capture similar constructs (Greenwald et al. 1998; Hofmann et al. 2005; Zitelny et al. 2017), which is however beyond the scope of this study. Given the explorative nature of the study, here, both implicit and explicit gender stereotypes were taken into account, as explained in the following sections.

3 Attribution theory and stereotypes

The attribution theory, initially developed by Heider (1944) and then refined by Weiner (1979, 1985), describes how people explain the causes of behaviors and events. In particular, it examines how information is gathered and then used to form a causal judgment (Fiske and Taylor 1991). According to this theory, there are three possible attributional dimensions, i.e., locus, stability, and controllability. The locus refers to the type of causes, i.e., the locus of attribution is internal when the cause of a behavior is assigned to a person’s characteristic, while it is external when it is assigned to situations or contextual factors that are, thus, outside a person’s control. The attribution can be also characterized as either stable or unstable and as either controllable or uncontrollable.

These attributions are associated with several factors, among which are stereotypes. As argued in the model on stereotypes and attribution developed by Reyna (2000), in the case of stereotypes, their attribution falls into three potential categories. Using the classification of the attribution theory, the first category assigns the cause to internal but controllable factors, the second to internal and uncontrollable factors, while the third assigns the cause to external, thus uncontrollable, factors. The type of attribution endorsed by the individual is relevant as “each attributional signature is associated with specific emotions and behavioral responses following either desirable or undesirable events” (Reyna 2000, p. 91).

Empirical evidence confirms that differences in the causes to which gender differences are attributed have heterogeneous consequences on behavior. For instance, Dar-Nimrod and Heine (2006) tested the effect of sharing attribution information about gender differences in math on 133 female college students. Those who were either informed that gender differences in math performance are due to experiential – thus malleable – causes or that there are no gender differences outperformed participants who were informed that these differences are due to genetic – thus fixed – causes. Thoman et al. (2008) replicated the same experiment on 66 undergraduate female students. In that case, participants were either informed that men are better than women in math because of natural ability or because of different levels of effort, i.e., higher for men. Results showed that participants exposed to the effort condition completed fewer math problems but correctly solved a higher percentage of them compared to those exposed to the natural ability condition or not explicitly informed about the gender difference.

Furthermore, a few studies tested whether the strength of endorsement of gender stereotypes is related to different beliefs about gender differences’ attribution. Results suggest that the endorsement of strong stereotypes tends to be associated with essentialist thinking, i.e., the belief that social categories are fixed and natural (Bastian and Haslam 2006; Pauker et al. 2010). In a study conducted by Cundiff and Vescio (2016), female undergraduate students who endorsed strong explicit gender stereotypes – measured as the difference in the assignment of stereotypic traits and the assignment of counter-stereotypic traits to women and men – were less likely to attribute gender disparities in the workplace to discrimination.

Empirical evidence seems, thus, to confirm that the association between the attribution of gender differences and gender stereotypes’ endorsement is relevant. Still, contributions to this theme are scarce. Studies on gender-science stereotypes only rarely investigated participants’ beliefs about the causes of STEM gender bias (De Gioannis 2022). Some instruments of gender-science stereotypes’ endorsement include items that specify the cause to which gender differences are attributed, e.g., “Compared to girls, boys mostly increase their mathematical achievement, because of the support of their teachers” (Nurlu 2017), “There are innate biological differences in math abilities of women and men” (Carlana 2019). Unfortunately, testing the difference between these different attributions was outside the scope of those studies.

4 Project Implicit

Project Implicit is a non-profit organization of researchers investigating implicit social cognition (B. A. Nosek et al. 1998). The website hosts several implicit association tests (IAT), a test designed by Greenwald et al. (1998) to measure individual differences in implicit cognition and, in particular, implicit stereotypes. In the case of gender and science, the test evaluates the automatic association between two dimensions, e.g., gender and science/humanities sectors.

Along with the IAT, Project Implicit asks users to complete a questionnaire that includes sociodemographic questions, a measure of explicit gender stereotypes, and a question on the causal attribution of women's underrepresentation in STEM. Since the sample of this dataset is self-selected, it cannot be considered a representative sample of a definable population other than that of volunteer visitors to the Project Implicit site. However, it collects data across multiple countries and multiple years, and it is also characterized by a higher variation in age and educational levels than more traditional samples of college students (Smyth and Nosek 2015). Furthermore, the quality and validity of this dataset have already been tested and proven (Charlesworth et al. 2022; Hehman et al. 2019).

The Project Implicit dataset has already been used in previous studies, also on the theme of the STEM gender gap. Just to mention a few, the study by Miller et al. (2015) used Project Implicit data to show the association between implicit gender-science stereotypes and women's representation in STEM in 66 countries. Similarly, Smyth and Nosek (2015) tested the difference in the strength of gender-science stereotypes' endorsement based on whether women belonged to a female- vs. a male-dominated sector. However, to our knowledge, this is the first time that the instrument on gender bias attribution has been used.

5 Research questions and hypotheses

The current study exploited the fact that Project Implicit includes both instruments to measure gender-science stereotypes and questions on gender differences’ attribution to verify the association between the two.

The objective of the study was to test whether and to what extent the two components of gender differences’ attribution proposed by the attribution theory – one related to causes assigned to personal characteristics and the other related to causes assigned to social/contextual factors – were associated with the endorsement of implicit and explicit gender-science stereotypes.

This was tested by taking into account potential differences related to participants’ gender. Based on previous studies on essentialist beliefs and stereotypes, the hypothesis was that attribution is associated with gender stereotypes’ endorsement.

6 Materials and methods

6.1 The sample

Following previous studies using the Project Implicit data (Xu et al. 2020), the sample included data collected from 2007 to 2019 on the gender-science IAT (Charlesworth and Banaji 2019). Only U.S. respondents who both completed the test and answered the self-reported questions in the survey were retained. The sample was further restricted to respondents aged between 18 and 30, to be consistent with the college samples usually used in studies on gender-science stereotypes (Smyth and Nosek 2015). The final sample consisted of 150,749 individuals (mean age = 22, SD = 3.36). Most participants were female (70%) and identified as White (68%), and 42% had attended at least some years of college.

6.2 Measures

Explicit gender-science stereotypes' endorsement was measured by the question “Please rate how much you associate science with males or females”, measured on a 7-point Likert scale where 1 corresponded to “Strongly female”, 4 to “Neither female nor male”, and 7 to “Strongly male”.

Implicit gender stereotypes' endorsement was inferred from the D-score resulting from the IAT (Greenwald et al. 2003). The test evaluates the automatic association between two dimensions, each consisting of two categories, by measuring the difference between the time needed to make an association in the case of compatible constructs and the time needed in the case of incompatible constructs. The version proposed by Project Implicit includes gender (e.g., man, son, woman, daughter) as the target and science and liberal arts majors (e.g., astronomy, math, history, arts) as categories. In this case, the compatible association is that of men with science majors and women with liberal arts majors, while the incompatible association is that of men with liberal arts majors and women with science majors.
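To make the logic of the D-score concrete, the following is a minimal sketch of a simplified version of the computation: the difference between the mean latency in the incompatible block and the mean latency in the compatible block, divided by the standard deviation of all trials from both blocks. It is not the scoring pipeline used by Project Implicit, which involves further steps (e.g., error handling and separate practice and test blocks). The function name, the 10,000 ms cut-off constant, and the latencies are assumptions introduced here for illustration.

```python
from statistics import mean, stdev

MAX_LATENCY_MS = 10_000  # assumed cut-off: slower trials are dropped


def simplified_d_score(compatible_ms, incompatible_ms):
    """Return a simplified IAT D-score from two lists of latencies in ms.

    Positive values mean slower responses in the incompatible block,
    i.e. a stronger men-science / women-liberal arts association.
    """
    compatible = [t for t in compatible_ms if t <= MAX_LATENCY_MS]
    incompatible = [t for t in incompatible_ms if t <= MAX_LATENCY_MS]
    # Standard deviation computed over the trials of both blocks together.
    pooled_sd = stdev(compatible + incompatible)
    return (mean(incompatible) - mean(compatible)) / pooled_sd


# Hypothetical latencies for one respondent (not taken from the dataset).
d = simplified_d_score([600, 650, 700], [800, 850, 900])
print(round(d, 2))
```

Dividing by the respondent's own variability is what makes scores comparable between generally fast and generally slow respondents.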

As regards attribution to gender bias in science, in the questionnaire participants were presented with the following statement “Women hold a smaller portion of the science and engineering faculty positions at top research universities than do men. The following factors were typically included to explore possible explanations of these differences”. They were then asked to rate each of the following six items on a 5-point Likert scale, where 1 corresponded to “Not at all important” and 5 to “Extremely important”.

  • Different proportions of men and women are found among people with the very highest levels of math ability (item ability).

  • On average, men and women differ in their willingness to devote the time required by such high-powered positions (item power).

  • On average, men and women differ naturally in their scientific interest (item interest).

  • On average, men and women differ in their willingness to spend time away from their families (item family).

  • Directly or indirectly, boys and girls tend to receive different levels of encouragement for developing scientific interest (item encouragement).

  • On average, whether consciously or unconsciously, men are favored in hiring and promotion (item discrimination).

Table 1 reports the correlations between those variables, their means and standard deviations, as well as the kurtosis and skewness scores. Looking at the correlations between the six items of gender bias attribution, there seem to be two distinct components, one grouping ability, interest, family, and power, and the other grouping encouragement and discrimination. As regards gender stereotypes' endorsement, neither implicit nor explicit gender-science stereotypes are correlated with the identified causes for gender differences in STEM, while there is a negligible correlation between explicit and implicit stereotypes.

Table 1 Descriptive statistics and correlations between variables

6.3 Analytical methods

The study’s objective was tested using structural equation modeling (SEM). SEM is a family of statistical techniques used to estimate the relationships among constructs, as it is a combination of factor analysis and path analysis (Weston and Gore 2006). It consists of two components: a measurement model describing the relationship between observed variables and latent constructs, and a structural model describing the interrelationship among constructs (Weston and Gore 2006, p. 724). Compared to regression analysis, SEM is, thus, suitable in those cases in which the interest is in the interrelationships between both observed and unobserved, latent variables.

In this case, the measurement model captured the relationship between the six attribution items and the hypothesized two latent constructs, i.e., internal and external causes for gender differences in STEM. The structural model tested the relationship between the two latent constructs and the endorsement of implicit and explicit gender stereotypes.

Following Weston and Gore (2006), the measurement model was tested using factor analysis and the structural model using path analysis. More details on the factorability of the data can be found in the Appendix. After randomly splitting the sample into two subsamples, an exploratory factor analysis (EFA) on the first subsample assessed the number of latent components, while a confirmatory factor analysis (CFA) tested the measurement model with the suggested number of latent constructs on the other subsample.
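The split-sample design described above can be sketched as follows. This is only an illustration of a reproducible random split into two disjoint halves, one for the EFA and one for the CFA; the function name, the seed, and the identifiers are hypothetical, and the original analysis was carried out in R rather than Python.

```python
import random


def split_in_half(case_ids, seed=2023):
    """Shuffle case identifiers and split them into two disjoint halves."""
    shuffled = list(case_ids)
    # A dedicated generator with a fixed seed keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Hypothetical identifiers standing in for respondents.
efa_half, cfa_half = split_in_half(range(1, 11))
print(len(efa_half), len(cfa_half))
```

Exploring the factor structure on one half and confirming it on the other avoids testing a model on the same cases that suggested it.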

Finally, the fit of the structural model was tested using path analysis. It was also checked whether there were differences in the model related to the respondents’ gender, a property known as “measurement invariance” (Van De Schoot et al. 2015). Measurement invariance was not reached, meaning that the association between the items and the latent factors (i.e., factor loadings, item intercepts, and item residual variances) varied depending on the gender of the respondents. The structural model was, thus, estimated on the male and female samples separately.

The analysis was conducted in R (R Core Team 2013), using the package psych (Revelle 2020) to perform the EFA, and the package lavaan (Rosseel 2012) to perform the CFA and the path analysis. The package semTools (Jorgensen et al. 2020) was used to compare the male and female groups. The categorical nature of the data imposed some conditions on the application of SEM to the dataset. In particular, the factor analysis was based on the polychoric correlation matrix rather than the Pearson correlation matrix (Holgado-Tello et al. 2010). In the EFA, the Principal Axis extraction method was used. Furthermore, the use of a diagonally weighted least squares (DWLS) estimator, robust standard errors, and mean- and variance-adjusted tests minimized bias in estimates and standard errors (Finney and DiStefano 2013; Koğar and Yilmaz Koğar 2015; Li 2016b, 2016a).

Recently, some researchers have cast doubt on the computation and validity of fit indices in the case of categorical data. Research on the correction of fit indices and new cut-off values for ordinal data (Savalei 2018, 2020; Shi and Maydeu-Olivares 2020) recommends the use of the Standardized Root Mean Square Residual (SRMR) as a fit index, with the traditional threshold of 0.08 (Hu and Bentler 1999). Indeed, this fit index is the only one insensitive to the choice of estimator (Shi and Maydeu-Olivares 2020) and was thus used to evaluate the model's goodness of fit.
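As an illustration of what the SRMR summarizes, the sketch below computes it as the root mean square of the standardized differences between an observed and a model-implied matrix, taken over the lower triangle including the diagonal. This is one common formulation; SEM software implements several variants, so values need not match a given package exactly. The function name and the matrices are hypothetical.

```python
from math import sqrt


def srmr(observed, implied):
    """SRMR over the lower triangle (diagonal included) of two square matrices."""
    p = len(observed)
    total = 0.0
    for i in range(p):
        for j in range(i + 1):
            # Standardize each residual by the observed standard deviations.
            scale = sqrt(observed[i][i] * observed[j][j])
            total += ((observed[i][j] - implied[i][j]) / scale) ** 2
    return sqrt(total / (p * (p + 1) / 2))


# Hypothetical 3 x 3 correlation matrices (not the study's data).
observed = [[1.0, 0.5, 0.4], [0.5, 1.0, 0.3], [0.4, 0.3, 1.0]]
implied = [[1.0, 0.45, 0.42], [0.45, 1.0, 0.33], [0.42, 0.33, 1.0]]
print(srmr(observed, implied) < 0.08)  # compare against the 0.08 threshold
```

Because the index is an average residual on the correlation metric, a value below 0.08 means the model reproduces the observed correlations with small average error.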

7 Results

7.1 Results from the factor analysis

The exploratory factor analysis compared a one-factor, two-factor, and three-factor solution. Coherently with the initial hypothesis, results suggested that the six items can be grouped into two distinct factors, the former including discrimination and encouragement – called external factor – the latter including power, family, interest, and abilities – called internal factor (more details on the scree test can be found in the Appendix). Table 2 shows the factor loadings for each item in the two-factor solution.

Table 2 Exploratory factor analysis—output of the two-factor solution

The fit of the two-factor model was tested in confirmatory factor analysis. The CFA confirmed the result of the EFA, as the two-factor model had a better fit compared to the one-factor model (SRMR = 0.158 for the one-factor model, SRMR = 0.017 for the two-factor model). The fit of the two-factor model as indicated by the SRMR is acceptable (Hu and Bentler 1999). Figure 1 shows the factor loadings and the correlation between the latent constructs from the CFA. Note that the external and internal latent components seemed uncorrelated, while there was a negative and weak correlation between the item interest and the item power. Reliability analysis reported an acceptable Cronbach's alpha for both the internal (alpha = 0.76) and external (alpha = 0.73) components.
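For readers unfamiliar with the reliability coefficient reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score), applied to raw item scores. The function name and the ratings are hypothetical and are not drawn from the Project Implicit data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of scores per item)."""
    k = len(items)
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 1-5 ratings from six respondents on the two external items.
encouragement = [5, 4, 4, 2, 3, 5]
discrimination = [4, 4, 5, 1, 3, 5]
print(round(cronbach_alpha([encouragement, discrimination]), 2))
```

Alpha increases with both the number of items and their intercorrelation, which is worth keeping in mind for a component made of only two items.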

Fig. 1 Results from the confirmatory factor analysis

7.2 Results from path analysis

Figures 2 and 3 show the structural and measurement models for the female group and the male group, respectively. As regards the female sample (SRMR = 0.014), both the external and internal components were positively associated with the endorsement of gender-science stereotypes. However, path coefficients suggest that this association was almost null except for that of the external component with explicit gender stereotypes.

Fig. 2 Results from the structural and measurement model (female group)

Fig. 3 Results from the structural and measurement model (male group)

As regards the male sample (SRMR = 0.022), the internal component was positively associated with gender stereotypes’ endorsement, while the external component was positively associated with explicit gender stereotypes and negatively with implicit gender stereotypes. However, path coefficients suggest that the association was almost null except for that of the external component with explicit gender stereotypes.

While all coefficients were statistically significant, this could be due to the large size of the sample. To check for the robustness of the results, the same model was estimated on a smaller sample randomly selected from that used for the CFA. The results are reported in the Appendix. Only the largest coefficients remained statistically significant, i.e., the association between external and internal components and explicit stereotypes for women, and the association between the internal component and explicit stereotypes for men.
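The reasoning behind this robustness check can be illustrated with a simple analogy: the same small association can be statistically significant in a very large sample and non-significant in a smaller one. The sketch below uses the Fisher z approximation for a plain correlation coefficient, which is not the test applied to the path coefficients in the structural model; the effect size of 0.02 and the sample sizes are hypothetical values chosen only for illustration.

```python
from math import atanh, erf, sqrt


def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - cdf)


# Hypothetical effect size evaluated at a small and a large sample size.
for n in (1_000, 75_000):
    print(n, correlation_p_value(0.02, n) < 0.05)
```

With a fixed effect size, the test statistic grows with the square root of the sample size, so significance alone says little about the practical size of an association.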

8 Discussion and conclusions

The study tested whether the instrument proposed by Project Implicit, assessing the causes attributed to the gender gap in science, can be decomposed into two components, coherent with the attribution theory, i.e., a personal, internal component and a contextual, external component. Furthermore, it also verified whether and how the two components of attribution were associated with the endorsement of both explicit and implicit gender-science stereotypes.

Results confirmed the initial hypothesis that attribution to the gender gap in science is decomposed into two distinct components, one internal and the other external, as suggested by Reyna (2000). The internal component of attribution included causes pertaining to individual characteristics, while the external component included those referring to social or contextual events. It was, then, tested whether and how the two components were associated with the endorsement of gender-science stereotypes.

Results suggested that this configuration was not equivalent for men and women. Indeed, when looking at the association between the two components and explicit and implicit gender stereotypes, there was an important difference. On the one hand, in both groups, neither the external nor the internal component had a strong – although statistically significant – association with implicit gender-science stereotypes. The robustness check found that the statistical significance of these coefficients was likely due to the large size of the sample. On the other hand, in the female group, explicit gender-science stereotypes had a small and significant association with the external component, whereas, in the male group, they had a significant association with the internal component. The results were confirmed in the robustness check.

This suggests that the instrument measuring gender bias attribution proposed by Project Implicit can be decomposed based on the locus dimension as proposed by the attribution theory. The first component groups those causes with an internal locus as they are characteristics or behaviors of the individual, e.g., interest in science, or performance. Conversely, the second component groups causes that have an external locus, e.g., receiving encouragement for pursuing scientific interests. While the model proposed by Reyna (2000) suggests that stereotypes’ attribution is classified along a second dimension, i.e., controllability, in this case, the factor analysis suggested retaining only two factors, and the internal component was not further classified into characteristics that are controllable or uncontrollable – i.e., that have biological roots. However, this is likely to be due to the low number of items. In the instrument used by Project Implicit, only one of the potential causes of gender differences explicitly referred to biological reasons, i.e., “On average, men and women differ naturally in their scientific interest”.

Even if the Project Implicit instruments do not allow for distinguishing between controllable and uncontrollable attributions, results from the structural model revealed an interesting pattern. Men and women seem to differ in the type of attribution they associate with the evaluation of science as either a feminine or masculine sector. When associating science with males, women tended to refer to external, contextual causes, while men tended to refer to internal, individual causes. This difference may be relevant when assessing the endorsement of gender-science stereotypes.

To assess the level of gender stereotypes’ endorsement, researchers tend to use instruments that do not measure why that gender bias is observed. However, as suggested by the application of the attribution theory to stereotypes, this is relevant and should be taken into account. Results from this study seem to suggest that when considering science as a strongly masculine field women have in mind social reasons for this polarization, while men have personal reasons for it. In the first case, it means that while they perceive the sector as masculine, they are also more inclined to believe that it should not be so, as the reason for the overrepresentation of men in this sector is due to contextual factors, such as discrimination. Conversely, men who perceive the sector as masculine are more inclined to believe that this is because of gender differences in attitudes and inclinations toward the scientific sector.

As shown in previous studies, if the cause is perceived as immutable, stereotypes may have a different consequence compared to when it is perceived as not fixed (Dar-Nimrod and Heine 2006; Thoman et al. 2008). In this regard, it is worth mentioning the discussion by Donovan et al. (2019, p. 742). In their field experiment, they found that girls who thought that science ability is innate also exhibited lower interest in STEM. As hypothesized by the authors, "if this fatalistic-essentialist schema is activated within the sociocultural context of the science classroom, then it could be more damaging to girls' future interest in science as messages that increase the belief that science ability is innate and unchangeable could make some girls feel like they do not belong in science".

The consequences of this association between stereotypes’ endorsement and attribution deserve further investigation. Since instruments measuring gender-science stereotypes’ endorsement are highly heterogeneous (De Gioannis 2022), it would be necessary to verify whether this association changes when using other explicit gender stereotypes instruments. The one proposed in the Project Implicit survey is quite generic, as it does not specify what is meant by “association of men/women with science”. This could refer to the percentage of women and men working in this sector or the preference of women and men toward scientific topics.

Finally, note that the results suggest that the automatic association between men/women and scientific/humanistic majors was not associated with gender differences’ attribution, as the path coefficients were almost zero. This difference between the results for implicit and explicit stereotypes may be related to the fact that the “IAT taps into different constructs than those tapped by the explicit measures used in research on the gender-science stereotype” (Zitelny et al. 2017, p. 6). The study conducted by Cundiff and Vescio (2016) tested the association between stereotypes’ attribution and explicit but not implicit gender stereotypes. Similarly, an experiment conducted on a mixed-gender sample of 127 undergraduates (Brescoll and LaFrance 2004) found that exposure to biological explanations for sex differences in the ability to identify plants caused participants to endorse more gender stereotypes. In that case, too, gender stereotypes were measured at the explicit level, as the difference in the attribution of stereotypical traits to men and women. Therefore, there is a lack of evidence on the association between stereotypes’ attribution and implicit, rather than explicit, stereotypes.

Having noted this, the research does have certain limitations. First, due to the cross-sectional nature of data, it is not possible to determine whether beliefs about stereotypes’ attribution influence gender stereotypes’ endorsement or whether the latter influences beliefs about stereotypes’ attribution. For instance, in the above-mentioned study by Cundiff and Vescio (2016), the experiment tested the effect of stereotypes’ endorsement on the attribution of gender disparities in the workplace to discrimination. Here, we are limited to assessing that there is an association and cannot discuss causality.

Furthermore, the sample on which the analysis was performed has some limitations. The Project Implicit website does not impose any restriction on access to the tests, which can be performed on the website for free at any time. This greatly limits the control over participants, who are likely to be already interested in, at least, the notion of implicit bias. This makes the sample likely to be skewed in some ways and not representative of the population. Moreover, note that here it was decided to limit the sample only to U.S. participants, because of the high rate of participation in Project Implicit from this country. This choice allows for a more homogenous sample but limits the possibility of assessing potential heterogeneity. In interpreting the results, it is necessary to keep in mind that this may not hold for individuals coming from other countries and regions of the world. Finally, as mentioned before, the instrument used by Project Implicit to measure the attribution of gender differences in STEM did not allow us to test the difference between internal but controllable causes and internal but uncontrollable ones. Further research is needed, as the dimension of control may be relevant in this context. For instance, while the willingness to spend time away from families is, indeed, a cause related to individual, thus internal, characteristics, it could be, in turn, influenced by social norms and traditional gender roles. However, to test the difference between biological and other controllable internal causes it would be necessary to design a new, ad-hoc instrument as the existing ones cannot serve this purpose.

9 Conclusion

This study suggested that explicit gender-science stereotypes were associated with the attribution of gender differences to external – social, and contextual – factors for women, while they were associated with the attribution of those differences to internal – personal – characteristics for men. This may have relevant implications and further research would be necessary to understand the consequences on choices and behaviors.

As already emerged in previous studies, the consequences of attributing gender differences to biological rather than social factors can be damaging as they, on the one hand, justify the existence of these disparities and, on the other hand, can discourage women from entering the STEM sector. As suggested by Cundiff and Vescio (2016, p. 135), "providing non-essentialist views of gender that emphasize gender differences as malleable and due to social factors, rather than fixed and rooted biology, may offer an avenue for potential attitude change and increased support for strategies designed to address bias".

To do so, it is necessary to conduct further research on the theme. This will require the use of new, more refined instruments that – contrary to existing ones measuring gender stereotypes – are specifically designed to measure beliefs about the causal attribution of gender differences. As shown in this study, the instrument included in the Project Implicit dataset may be a good starting point.